retreval 0.1.1 → 0.1.2

data/README.md CHANGED
@@ -21,15 +21,15 @@ If you want to see an example, use this command:
  INSTALLATION
  ============
 
- You can manually download the sources and build the Gem from there by `cd`ing to the folder where this README is saved and calling
+ If you have RubyGems, just run
 
- gem build retreval.gemspec
+ gem install retreval
 
- This will create a file called `retreval` which you just have to install:
+ You can manually download the sources and build the Gem from there by `cd`ing to the folder where this README is saved and calling
 
- gem install retreval
+ gem build retreval.gemspec
 
- And you're done.
+ This will create a gem file which you just have to install with `gem install <file>`, and you're done.
 
 
  HOWTO
@@ -182,7 +182,7 @@ module Retreval
  end
 
  # If we didn't find any judgements, just leave it as false
- if relevant_count == 0 and relevant_count == 0
+ if relevant_count == 0 and nonrelevant_count == 0
  false
  else
  relevant_count >= nonrelevant_count
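
For context, the change above tightens the guard so it matches its comment: the old condition tested `relevant_count == 0` twice and never consulted `nonrelevant_count`. Below is a minimal, self-contained Ruby sketch of that majority-vote check; the `Judgement` class and the `relevant_by_majority?` method name are illustrative assumptions, and only the two counter names come from the hunk itself.

```ruby
# Sketch only: class and method names are assumed for illustration,
# not taken from the gem's source.
class Judgement
  attr_reader :relevant

  def initialize(relevant)
    @relevant = relevant
  end
end

# Decide relevancy by majority vote over the judgements for one
# document/query pair. With no judgements at all, report false,
# which is what the corrected guard expresses.
def relevant_by_majority?(judgements)
  relevant_count    = judgements.count { |j| j.relevant }
  nonrelevant_count = judgements.count { |j| !j.relevant }

  if relevant_count == 0 and nonrelevant_count == 0
    false
  else
    relevant_count >= nonrelevant_count
  end
end

# Example: two relevant votes against one nonrelevant vote => relevant.
votes = [Judgement.new(true), Judgement.new(true), Judgement.new(false)]
puts relevant_by_majority?(votes)   # => true
puts relevant_by_majority?([])      # => false
```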
@@ -3,7 +3,7 @@ Gem::Specification.new do |s|
  s.summary = "A Ruby API for Evaluating Retrieval Results"
  s.description = File.read(File.join(File.dirname(__FILE__), 'README.md'))
  # s.requirements = [ 'Nothing special' ]
- s.version = "0.1.1"
+ s.version = "0.1.2"
  s.author = "Werner Robitza"
  s.email = "werner.robitza@univie.ac.at"
  s.homepage = "http://github.com/slhck/retreval"
metadata CHANGED
@@ -1,348 +1,188 @@
- --- !ruby/object:Gem::Specification
+ --- !ruby/object:Gem::Specification
  name: retreval
- version: !ruby/object:Gem::Version
+ version: !ruby/object:Gem::Version
+ version: 0.1.2
  prerelease:
- version: 0.1.1
  platform: ruby
- authors:
+ authors:
  - Werner Robitza
  autorequire:
  bindir: bin
  cert_chain: []
-
- date: 2011-04-06 00:00:00 Z
+ date: 2012-01-11 00:00:00.000000000Z
  dependencies: []
-
  email: werner.robitza@univie.ac.at
- executables:
+ executables:
  - retreval
  extensions: []
-
  extra_rdoc_files: []
-
- files:
+ files:
  - bin/retreval
  - CHANGELOG
  - doc/bin/retreval.html
@@ -387,31 +227,28 @@ files:
  - TODO
  homepage: http://github.com/slhck/retreval
  licenses: []
-
  post_install_message:
  rdoc_options: []
-
- require_paths:
+ require_paths:
  - lib
- required_ruby_version: !ruby/object:Gem::Requirement
+ required_ruby_version: !ruby/object:Gem::Requirement
  none: false
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: "1.9"
- required_rubygems_version: !ruby/object:Gem::Requirement
+ requirements:
+ - - ! '>='
+ - !ruby/object:Gem::Version
+ version: '1.9'
+ required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: "0"
+ requirements:
+ - - ! '>='
+ - !ruby/object:Gem::Version
+ version: '0'
  requirements: []
-
  rubyforge_project:
- rubygems_version: 1.7.2
+ rubygems_version: 1.8.10
  signing_key:
  specification_version: 3
  summary: A Ruby API for Evaluating Retrieval Results
- test_files:
+ test_files:
  - test/test_gold_standard.rb
  - test/test_query_result.rb