rroonga 5.0.1 → 5.0.2

@@ -1,145 +0,0 @@
1
- h1. Install
2
-
3
- This document describes how to install Rroonga.
4
-
5
- You can install Rroonga with RubyGems, which is the standard way to install Ruby
6
- libraries.
7
-
8
- Rroonga depends on Groonga, so you need to install both Groonga and
9
- Rroonga. You can't install Groonga with RubyGems because it isn't a Ruby
10
- library. But don't worry: Rroonga provides the following options to
11
- make installation easy:
12
-
13
- * Rroonga downloads, builds and installs Groonga automatically. You
14
- don't need to do these steps explicitly.
15
- * Rroonga uses Groonga installed by your packaging system.
16
-
17
- The following sections describe the above in detail.
18
-
19
- h2. Install with auto Groonga build
20
-
21
- Rroonga searches for Groonga on install. If Rroonga can't find
22
- Groonga, Rroonga downloads, builds and installs Groonga
23
- automatically.
24
-
25
- Type the following command to install Rroonga and Groonga. You don't
26
- need to install Groonga explicitly:
27
-
28
- <pre>
29
- !!!command_line
30
- % gem install rroonga
31
- </pre>
32
-
33
- h2. Install with Groonga package
34
-
35
- You can use the Groonga package from your packaging system instead of
36
- building Groonga yourself. This option has the following
37
- advantages:
38
-
39
- * It reduces installation time.
40
- * It avoids failures while building Groonga.
41
-
42
- h3. Windows
43
-
44
- The Rroonga gem for Windows includes both pre-compiled Rroonga and Groonga
45
- in the gem, so all you need to do is install the rroonga gem.
46
-
47
- Type the following command in the Ruby console:
48
-
49
- <pre>
50
- !!!command_line
51
- > gem install rroonga
52
- </pre>
53
-
54
- This document assumes that you're using "RubyInstaller for
55
- Windows":http://rubyinstaller.org/ .
56
-
57
- h3. OS X
58
-
59
- There are Groonga packages for OS X.
60
-
61
- h4. MacPorts
62
-
63
- If you're using "MacPorts":http://www.macports.org/ , type the
64
- following commands in your terminal:
65
-
66
- <pre>
67
- !!!command_line
68
- % sudo port install groonga
69
- % sudo gem install rroonga
70
- </pre>
71
-
72
- h4. Homebrew
73
-
74
- If you're using "Homebrew":http://brew.sh/ , type the
75
- following commands in your terminal:
76
-
77
- <pre>
78
- !!!command_line
79
- % brew install groonga
80
- % gem install rroonga
81
- </pre>
82
-
83
- h3. Debian GNU/Linux
84
-
85
- You can install the Groonga package with apt. See "Groonga
86
- documentation":http://groonga.org/docs/install/debian.html for how to set
87
- up the apt line.
88
-
89
- Type the following commands in your terminal after you finish setting
90
- up the apt line:
91
-
92
- <pre>
93
- !!!command_line
94
- % sudo apt-get install -y libgroonga-dev
95
- % sudo gem install rroonga
96
- </pre>
97
-
98
- h3. Ubuntu
99
-
100
- You can install the Groonga package with apt. See "Groonga
101
- documentation":http://groonga.org/docs/install/ubuntu.html for how to set
102
- up the apt line.
103
-
104
- Type the following commands in your terminal after you finish setting
105
- up the apt line:
106
-
107
- <pre>
108
- !!!command_line
109
- % sudo apt-get install -y libgroonga-dev
110
- % sudo gem install rroonga
111
- </pre>
112
-
113
- h3. CentOS
114
-
115
- You can install the Groonga package with yum. See "Groonga
116
- documentation":http://groonga.org/docs/install/centos.html for how to set
117
- up the yum repository.
118
-
119
- However, you need to install Ruby 1.9.3 or later yourself. Both CentOS 5
120
- and 6 ship Ruby 1.8. Rroonga doesn't support Ruby 1.8.
121
-
122
- Type the following commands in your terminal after you finish setting
123
- up the yum repository and installing Ruby 1.9.3 or later:
124
-
125
- <pre>
126
- !!!command_line
127
- % sudo yum install groonga-devel -y
128
- % gem install rroonga
129
- </pre>
130
-
131
- h3. Fedora
132
-
133
- You can install the Groonga package with yum. The Groonga package is
134
- included in the official Fedora repository.
135
-
136
- <pre>
137
- !!!command_line
138
- % sudo yum install groonga-devel -y
139
- % sudo gem install rroonga
140
- </pre>
141
-
142
- h2. Links
143
-
144
- * "2. Install - Groonga documentation":http://groonga.org/docs/install.html
145
-
@@ -1,510 +0,0 @@
1
- h1. Tutorial
2
-
3
- This page introduces how to use Rroonga by building a simple application.
4
-
5
- h2. Install
6
-
7
- You can install Rroonga on your computer with RubyGems.
8
-
9
- <pre>
10
- !!!command_line
11
- % sudo gem install rroonga
12
- </pre>
13
-
14
- h2. Create Database
15
-
16
- Let's create a database for a simple bookmark application.
17
- Start irb with Rroonga loaded by running this command:
18
-
19
- <pre>
20
- !!!irb
21
- % irb --simple-prompt -r groonga
22
- >>
23
- </pre>
24
-
25
- Now, use UTF-8 as the encoding of the database.
26
-
27
- <pre>
28
- !!!irb
29
- >> Groonga::Context.default_options = {:encoding => :utf8}
30
- => {:encoding=>:utf8}
31
- </pre>
32
-
33
- Then, create a database in a file.
34
-
35
- <pre>
36
- !!!irb
37
- >> Groonga::Database.create(:path => "/tmp/bookmark.db")
38
- => #<Groonga::Database ...>
39
- </pre>
40
-
41
- From now on, the created database is used implicitly.
42
- You don't have to be aware of it once you have created a database.
43
-
44
- h2. Define table
45
-
46
- Groonga supports 4 types of tables.
47
-
48
- - Groonga::Hash :=
49
- Hash table. It manages records by primary key. It supports
50
- very fast exact match search.
51
- =:
52
-
53
- - Groonga::PatriciaTrie :=
54
- Patricia Trie. It supports searches such as predictive search and
55
- common prefix search, but its exact match search is a little slower
56
- than Groonga::Hash. It provides a cursor to take records in ascending
57
- or descending order.
58
- =:
59
-
60
- - Groonga::DoubleArrayTrie :=
61
- Double Array Trie. It requires more space than other
62
- tables, but it can update a key without changing its ID. It provides exact
63
- match search, predictive search, common prefix search and a cursor
64
- like Groonga::PatriciaTrie.
65
- =:
66
-
67
- - Groonga::Array :=
68
- Array. It doesn't have primary keys. It manages records by ID.
69
- =:
70
-
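The difference between exact match search, predictive search and common prefix search can be sketched in plain Ruby. This is only an illustration of the search kinds listed above. It does not use Rroonga, and @predictive_search@ and @common_prefix_search@ are hypothetical helper names, not Rroonga methods.

```ruby
# Plain-Ruby illustration only; this does not use Rroonga.
urls = [
  "http://en.wikipedia.org/wiki/Ruby",
  "http://www.ruby-lang.org/",
  "http://www.ruby-doc.org/",
]

# Hash-like table: a key maps directly to a record ID (exact match).
ids_by_key = {}
urls.each_with_index {|url, index| ids_by_key[url] = index + 1}
exact_id = ids_by_key["http://www.ruby-lang.org/"]

# Trie-like table: keys are kept in order, so keys sharing a prefix are
# adjacent and can be walked in ascending or descending order.
sorted_keys = urls.sort

# Predictive search: keys that start with the given prefix.
def predictive_search(sorted_keys, prefix)
  first = sorted_keys.bsearch_index {|key| key >= prefix}
  return [] if first.nil?
  sorted_keys[first..-1].take_while {|key| key.start_with?(prefix)}
end

# Common prefix search: keys that are a prefix of the given text.
def common_prefix_search(sorted_keys, text)
  sorted_keys.select {|key| text.start_with?(key)}
end

predicted = predictive_search(sorted_keys, "http://www.ruby-")
prefixes = common_prefix_search(sorted_keys, "http://www.ruby-doc.org/core/")
```

Here @predicted@ holds the two @http://www.ruby-@ URLs and @prefixes@ holds only @http://www.ruby-doc.org/@.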
71
- Now, use Groonga::Hash to create a table named @Items@. The type
72
- of its primary key is String.
73
-
74
- <pre>
75
- !!!irb
76
- >> Groonga::Schema.create_table("Items", :type => :hash)
77
- => [...]
78
- </pre>
79
-
80
- This code creates the @Items@ table.
81
- You can refer to the defined table with Groonga.[] as shown below:
82
-
83
- <pre>
84
- !!!irb
85
- >> items = Groonga["Items"]
86
- => #<Groonga::Hash ...>
87
- </pre>
88
-
89
- You can treat it like a Hash.
90
- For example, let's type @items.size@ to get the number of records in
91
- the table.
92
-
93
- <pre>
94
- !!!irb
95
- >> items.size
96
- => 0
97
- </pre>
98
-
99
- h2. Add records
100
-
101
- Let's add records to the @Items@ table.
102
-
103
- <pre>
104
- !!!irb
105
- >> items.add("http://en.wikipedia.org/wiki/Ruby")
106
- => #<Groonga::Record ...>
107
- >> items.add("http://www.ruby-lang.org/")
108
- => #<Groonga::Record ...>
109
- </pre>
110
-
111
- Please check the number of records. It increases from 0 to 2.
112
-
113
- <pre>
114
- !!!irb
115
- >> items.size
116
- => 2
117
- </pre>
118
-
119
- To get a record by primary key, type the following:
120
-
121
- <pre>
122
- !!!irb
123
- >> items["http://en.wikipedia.org/wiki/Ruby"]
124
- => #<Groonga::Record ...>
125
- </pre>
126
-
127
- h2. Full text search
128
-
129
- Let's make each item's title searchable with full text search.
130
-
131
- First, add a @Text@ type column named "@title@" to the @Items@ table.
132
-
133
- <pre>
134
- !!!irb
135
- >> Groonga::Schema.change_table("Items") do |table|
136
- ?> table.text("title")
137
- >> end
138
- => [...]
139
- </pre>
140
-
141
- A defined column is named @#{TABLE_NAME}.#{COLUMN_NAME}@.
142
- You can refer to it with {Groonga.[]} in the same way as tables.
143
-
144
- <pre>
145
- !!!irb
146
- >> title_column = Groonga["Items.title"]
147
- => #<Groonga::VariableSizeColumn ...>
148
- </pre>
149
-
150
-
151
- Second, let's add a table that contains the terms split from texts.
152
- Define the @Terms@ table for it.
153
-
154
- <pre>
155
- !!!irb
156
- >> Groonga::Schema.create_table("Terms",
157
- ?> :type => :patricia_trie,
158
- ?> :normalizer => :NormalizerAuto,
159
- ?> :default_tokenizer => "TokenBigram")
160
- </pre>
161
-
162
- You specify @:default_tokenizer => "TokenBigram"@ for the "Tokenizer" in
163
- the above code.
164
- A "Tokenizer" is the object that splits texts into terms. The default value
165
- for it is none.
166
- Full text search requires a tokenizer, so you specify "Bigram", a type
167
- of N-gram.
168
- Full text search with N-gram uses terms of N characters each and their
169
- positions in texts. "N" in N-gram specifies the number of characters in each term.
170
- Groonga supports Unigram (N=1), Bigram (N=2) and Trigram (N=3).
171
-
172
- You also specify @:normalizer => :NormalizerAuto@ to search texts
173
- case-insensitively.
174
-
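The idea of N-gram tokenization with normalization can be sketched in plain Ruby. This is only an illustration. @ngram_tokens@ is a hypothetical helper, @downcase@ is a crude stand-in for a normalizer, and a real Groonga tokenizer may apply special handling (for example to runs of alphabetic characters) that this sketch ignores.

```ruby
# Plain-Ruby illustration only; Groonga's real tokenizers and normalizers
# handle many more cases than this.
def ngram_tokens(text, n)
  normalized = text.downcase # crude stand-in for a normalizer
  return [] if normalized.length < n
  (0..normalized.length - n).map do |position|
    # Each token keeps its term and the position where it starts.
    {:term => normalized[position, n], :position => position}
  end
end

unigrams = ngram_tokens("Ruby", 1).map {|token| token[:term]}
bigrams = ngram_tokens("Ruby", 2).map {|token| token[:term]}
trigrams = ngram_tokens("Ruby", 3).map {|token| token[:term]}
```

With this sketch, "Ruby" yields the bigrams @ru@, @ub@ and @by@ at positions 0, 1 and 2.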
175
- Now the table for terms is ready, so define the index of the
176
- @Items.title@ column.
177
-
178
- <pre>
179
- !!!irb
180
- >> Groonga::Schema.change_table("Terms") do |table|
181
- ?> table.index("Items.title")
182
- >> end
183
- => [...]
184
- </pre>
185
-
186
- This code may look a little strange: the index of the @Items@ table's
187
- column is defined as a column in @Terms@.
188
-
189
- When a record is added to @Items@, Groonga automatically adds records for
190
- each term in it to @Terms@.
191
-
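The reason the index lives in the term table can be sketched as an inverted index in plain Ruby: each term maps to the records that contain it. This is only an illustration and does not use Rroonga. The word-based tokenizer here is an assumption chosen for brevity, not the bigram tokenizer configured above.

```ruby
# Plain-Ruby illustration only; this does not use Rroonga.
titles = {
  "http://en.wikipedia.org/wiki/Ruby" => "Ruby",
  "http://www.ruby-lang.org/" => "Ruby Programming Language",
}

# term => keys of the records whose title contains that term
index = Hash.new {|hash, term| hash[term] = []}
titles.each do |key, title|
  # Hypothetical tokenizer: lower-cased words instead of bigrams.
  title.downcase.scan(/[a-z0-9]+/).uniq.each do |term|
    index[term] << key
  end
end

ruby_keys = index["ruby"]
language_keys = index["language"]
```

Looking up a term is then a single hash access, which is why a search does not need to scan every title.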
192
-
193
- @Terms@ is a somewhat special table, but you can add columns to a term
194
- table such as @Terms@ and manage attributes of each term. This is
195
- very useful for specialized searches.
196
-
197
- Now the table definition is finished.
198
- Let's set the @title@ of each record you added before.
199
-
200
- <pre>
201
- !!!irb
202
- >> items["http://en.wikipedia.org/wiki/Ruby"].title = "Ruby"
203
- => "Ruby"
204
- >> items["http://www.ruby-lang.org/"].title = "Ruby Programming Language"
205
- => "Ruby Programming Language"
206
- </pre>
207
-
208
- Now, you can do full text search like below:
209
-
210
- <pre>
211
- !!!irb
212
- >> ruby_items = items.select {|record| record.title =~ "Ruby"}
213
- => #<Groonga::Hash ..., normalizer: (nil)>
214
- </pre>
215
-
216
- Groonga returns the search result as a Groonga::Hash.
217
- The keys in this hash table are the matched records of @Items@.
218
-
219
- <pre>
220
- !!!irb
221
- >> ruby_items.collect {|record| record.key.key}
222
- => ["http://en.wikipedia.org/wiki/Ruby", "http://www.ruby-lang.org/"]
223
- </pre>
224
-
225
- In the above example, you get records in @Items@ with @record.key@, and
226
- keys of them with @record.key.key@.
227
-
228
- You can access the key of a referenced record more briefly with @record["_key"]@.
229
-
230
- <pre>
231
- !!!irb
232
- >> ruby_items.collect {|record| record["_key"]}
233
- => ["http://en.wikipedia.org/wiki/Ruby", "http://www.ruby-lang.org/"]
234
- </pre>
235
-
236
- h2. Improve the simple bookmark application
237
-
238
- Let's improve this simple application a little. You can create
239
- a bookmark application for multiple users, who can comment on each
240
- bookmark.
241
-
242
- First, add tables for users and for comments as shown below:
243
-
244
- !http://qwik.jp/senna/senna2.files/rect4605.png!
245
-
246
- Let's add the table for users, @Users@.
247
-
248
- <pre>
249
- !!!irb
250
- >> Groonga::Schema.create_table("Users", :type => :hash) do |table|
251
- ?> table.text("name")
252
- >> end
253
- => [...]
254
- </pre>
255
-
256
-
257
- Next, let's add the table for comments as @Comments@.
258
-
259
- <pre>
260
- !!!irb
261
- >> Groonga::Schema.create_table("Comments") do |table|
262
- ?> table.reference("item")
263
- >> table.reference("author", "Users")
264
- >> table.text("content")
265
- >> table.time("issued")
266
- >> end
267
- => [...]
268
- </pre>
269
-
270
- Then you define the index of @content@ column in @Comments@ for full
271
- text search.
272
-
273
- <pre>
274
- !!!irb
275
- >> Groonga::Schema.change_table("Terms") do |table|
276
- ?> table.index("Comments.content")
277
- >> end
278
- => [...]
279
- </pre>
280
-
281
- The above code completes the table definitions.
282
-
283
- Second, add some users to @Users@.
284
-
285
- <pre>
286
- !!!irb
287
- >> users = Groonga["Users"]
288
- => #<Groonga::Hash ...>
289
- >> users.add("alice", :name => "Alice")
290
- => #<Groonga::Record ...>
291
- >> users.add("bob", :name => "Bob")
292
- => #<Groonga::Record ...>
293
- </pre>
294
-
295
- Now, let's write the process for a user to bookmark a page.
296
- Assume that the user @alice@ bookmarks a page that includes
297
- information related to Ruby.
298
-
299
- First, check whether the page has already been added to @Items@.
300
-
301
- <pre>
302
- !!!irb
303
- >> items.has_key?("http://www.ruby-doc.org/")
304
- => false
305
- </pre>
306
-
307
- The page hasn't been added, so you add it to @Items@.
308
-
309
- <pre>
310
- !!!irb
311
- >> items.add("http://www.ruby-doc.org/",
312
- ?> :title => "Ruby-Doc.org: Documenting the Ruby Language")
313
- => #<Groonga::Record ...>
314
- </pre>
315
-
316
- Next, add a record to @Comments@. This record refers to this page
317
- in its @item@ column.
318
-
319
- <pre>
320
- !!!irb
321
- >> require "time"
322
- => true
323
- >> comments = Groonga["Comments"]
324
- => #<Groonga::Array ...>
325
- >> comments.add(:item => "http://www.ruby-doc.org/",
326
- ?> :author => "alice",
327
- ?> :content => "Ruby documents",
328
- ?> :issued => Time.parse("2010-11-20T18:01:22+09:00"))
329
- => #<Groonga::Record ...>
330
- </pre>
331
-
332
- h2. Define methods for this process
333
-
334
- For convenience, define a method for the above process.
335
-
336
- <pre>
337
- !!!irb
338
- >> @items = items
339
- => #<Groonga::Hash ...>
340
- >> @comments = comments
341
- => #<Groonga::Array ...>
342
- >> def add_bookmark(url, title, author, content, issued)
343
- >> item = @items[url] || @items.add(url, :title => title)
344
- >> @comments.add(:item => item,
345
- ?> :author => author,
346
- ?> :content => content,
347
- ?> :issued => issued)
348
- >> end
349
- => nil
350
- </pre>
351
-
352
- @items@ and @comments@ are assigned to instance variables so that you can
353
- use them in the @add_bookmark@ method.
354
-
355
- @add_bookmark@ performs the following steps:
356
-
357
- * Check if the record associated with the page exists in the @Items@ table.
358
- * If not, add the record to it.
359
- * Add the record to the @Comments@ table.
360
-
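The find-or-add step can be sketched in plain Ruby, with a Hash and an Array standing in for the @Items@ and @Comments@ tables. This is only an illustration and does not use Rroonga. It shows that when the item already exists, the title passed for a later bookmark is ignored.

```ruby
# Plain-Ruby illustration only; Hashes and Arrays stand in for the tables.
items = {}    # url => {:title => ...}
comments = [] # one entry per bookmark

add_bookmark = lambda do |url, title, author, content|
  # Reuse the existing item if there is one; otherwise add it.
  item = items[url] || (items[url] = {:title => title})
  comments << {:item => item, :author => author, :content => content}
  comments.last
end

add_bookmark.call("http://www.ruby-doc.org/",
                  "Ruby-Doc.org: Documenting the Ruby Language",
                  "alice", "Ruby documents")
second = add_bookmark.call("http://www.ruby-doc.org/",
                           "ruby-doc", "bob", "ruby documents")
```

After both calls there is still one item, and @second[:item][:title]@ keeps the title given by the first bookmark.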
361
- With this method, let's bookmark some pages.
362
-
363
- <pre>
364
- !!!irb
365
- >> add_bookmark("https://rubygems.org/",
366
- ?> "RubyGems.org | your community gem host", "alice", "Ruby gems",
367
- ?> Time.parse("2010-10-07T14:18:28+09:00"))
368
- => #<Groonga::Record ...>
369
- >> add_bookmark("http://ranguba.org/",
370
- ?> "Fulltext search by Ruby with groonga - Ranguba", "bob",
371
- ?> "Ruby groonga fulltextsearch",
372
- ?> Time.parse("2010-11-11T12:39:59+09:00"))
373
- => #<Groonga::Record ...>
374
- >> add_bookmark("http://www.ruby-doc.org/",
375
- ?> "ruby-doc", "bob", "ruby documents",
376
- ?> Time.parse("2010-07-28T20:46:23+09:00"))
377
- => #<Groonga::Record ...>
378
- </pre>
379
-
380
- h2. Full text search part 2
381
-
382
- Let's do full text search for added records.
383
-
384
- <pre>
385
- !!!irb
386
- >> records = comments.select do |record|
387
- ?> record["content"] =~ "Ruby"
388
- >> end
389
- => #<Groonga::Hash ...>
390
- >> records.each do |record|
391
- ?> comment = record
392
- >> p [comment.id,
393
- ?> comment.issued,
394
- ?> comment.item.title,
395
- ?> comment.author.name,
396
- ?> comment.content]
397
- >> end
398
- [1, 2010-11-20 18:01:22 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Alice", "Ruby documents"]
399
- [2, 2010-10-07 14:18:28 +0900, "RubyGems.org | your community gem host", "Alice", "Ruby gems"]
400
- [3, 2010-11-11 12:39:59 +0900, "Fulltext search by Ruby with groonga - Ranguba", "Bob", "Ruby groonga fulltextsearch"]
401
- [4, 2010-07-28 20:46:23 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Bob", "ruby documents"]
402
- </pre>
403
-
404
- You can access each column through a method with the same name as the column.
405
- These methods also support access to complex data types.
406
- (In a typical RDB, you would have to JOIN the @Items@, @Comments@ and
407
- @Users@ tables.)
408
-
409
- The search is finished at the first statement in this code. The
410
- result of this search is an object that represents a set of records.
411
-
412
- <pre>
413
- !!!irb
414
- >> records
415
- => #<Groonga::Hash ..., size: <4>>
416
- </pre>
417
-
418
- You can arrange this record set before output.
419
- For example, sort these records by date in descending order.
420
-
421
- <pre>
422
- !!!irb
423
- >> records.sort([{:key => "issued", :order => "descending"}]).each do |record|
424
- ?> comment = record
425
- >> p [comment.id,
426
- ?> comment.issued,
427
- ?> comment.item.title,
428
- ?> comment.author.name,
429
- ?> comment.content]
430
- >> end
431
- [1, 2010-11-20 18:01:22 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Alice", "Ruby documents"]
432
- [2, 2010-11-11 12:39:59 +0900, "Fulltext search by Ruby with groonga - Ranguba", "Bob", "Ruby groonga fulltextsearch"]
433
- [3, 2010-10-07 14:18:28 +0900, "RubyGems.org | your community gem host", "Alice", "Ruby gems"]
434
- [4, 2010-07-28 20:46:23 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Bob", "ruby documents"]
435
- => [...]
436
- </pre>
437
-
438
- Let's group the results by item to make them easier to view.
439
-
440
- <pre>
441
- !!!irb
442
- >> records.group("item").each do |record|
443
- ?> item = record.key
444
- >> p [record.n_sub_records,
445
- ?> item.key,
446
- ?> item.title]
447
- >> end
448
- [2, "http://www.ruby-doc.org/", "Ruby-Doc.org: Documenting the Ruby Language"]
449
- [1, "https://rubygems.org/", "RubyGems.org | your community gem host"]
450
- [1, "http://ranguba.org/", "Fulltext search by Ruby with groonga - Ranguba"]
451
- => nil
452
- </pre>
453
-
454
- @n_sub_records@ is the number of records in each group.
455
- It is similar to the value of the count() function in a query that includes "GROUP
456
- BY" in SQL.
457
-
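The GROUP BY analogy can be sketched in plain Ruby with @group_by@. This is only an illustration and does not use Rroonga. Each Hash stands in for a matched comment, and the size of each group plays the role of @n_sub_records@.

```ruby
# Plain-Ruby illustration only; each Hash stands in for a matched comment.
matched_comments = [
  {:item => "http://www.ruby-doc.org/", :author => "alice"},
  {:item => "https://rubygems.org/", :author => "alice"},
  {:item => "http://ranguba.org/", :author => "bob"},
  {:item => "http://www.ruby-doc.org/", :author => "bob"},
]

# Roughly: SELECT item, count(*) FROM matched_comments GROUP BY item
counts = matched_comments
  .group_by {|comment| comment[:item]}
  .map {|item, group| [group.size, item]}
```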
458
- h2. More complex search
459
-
460
- Now, let's try a more useful search.
461
-
462
- You should calculate the goodness of fit of the search explicitly.
463
-
464
- You can use @Items.title@ and @Comments.content@ as search targets now.
465
- @Items.title@ is fairly reliable information taken from each
466
- original page. On the other hand, @Comments.content@ is less
467
- reliable information because it depends on the users of the bookmark
468
- application.
469
-
470
- So, search records with this policy:
471
-
472
- * Search for items that match on @Items.title@ or @Comments.content@.
473
- * Give scores of records matched on @Items.title@ 10 times the weight
474
- of those matched on @Comments.content@.
475
- * If multiple comments on one item match the keyword, use the sum
476
- of the scores of those comments as the score of the item.
477
-
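The policy above can be sketched as plain arithmetic in Ruby. This is only an illustration and does not use Rroonga. The matcher @count_matches@ is a hypothetical helper that counts case-insensitive whole-word occurrences, which is an assumption made for this sketch and not a description of how Groonga computes scores.

```ruby
# Plain-Ruby illustration only; the scoring here is an assumption chosen to
# mirror the policy above, not Groonga's actual scoring algorithm.
title_weight = 10
comment_weight = 1

titles = {
  "http://www.ruby-doc.org/" => "Ruby-Doc.org: Documenting the Ruby Language",
  "https://rubygems.org/" => "RubyGems.org | your community gem host",
}
comments = [
  {:item => "http://www.ruby-doc.org/", :content => "Ruby documents"},
  {:item => "http://www.ruby-doc.org/", :content => "ruby documents"},
  {:item => "https://rubygems.org/", :content => "Ruby gems"},
]

# Hypothetical matcher: number of case-insensitive whole-word occurrences.
count_matches = lambda do |text, keyword|
  text.downcase.scan(/[a-z0-9]+/).count(keyword.downcase)
end

scores = Hash.new(0)
titles.each do |url, title|
  scores[url] += count_matches.call(title, "Ruby") * title_weight
end
comments.each do |comment|
  # Scores of several matching comments on one item are summed.
  scores[comment[:item]] +=
    count_matches.call(comment[:content], "Ruby") * comment_weight
end

ranking = scores.sort_by {|_url, score| -score}
```

Under these assumptions a title match dominates the ranking, while comments only nudge the score of an item.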
478
- Following this policy, type the code below:
479
-
480
- <pre>
481
- !!!irb
482
- >> ruby_comments = @comments.select {|record| record.content =~ "Ruby"}
483
- => #<Groonga::Hash ..., size: <4>>
484
- >> ruby_items = @items.select do |record|
485
- ?> target = record.match_target do |match_record|
486
- ?> match_record.title * 10
487
- >> end
488
- >> target =~ "Ruby"
489
- >> end
490
- => #<Groonga::Hash ..., size: <4>>
491
- </pre>
492
-
493
- Group the results of _ruby_comments_ by item and union them with
494
- _ruby_items_ .
495
-
496
- <pre>
497
- !!!irb
498
- >> ruby_items = ruby_comments.group("item").union!(ruby_items)
499
- => #<Groonga::Hash ..., size: <5>>
500
- >> ruby_items.sort([{:key => "_score", :order => "descending"}]).each do |record|
501
- >> p [record.score, record.title]
502
- >> end
503
- [22, "Ruby-Doc.org: Documenting the Ruby Language"]
504
- [11, "Fulltext search by Ruby with groonga - Ranguba"]
505
- [10, "Ruby Programming Language"]
506
- [10, "Ruby"]
507
- [1, "RubyGems.org | your community gem host"]
508
- </pre>
509
-
510
- This gives you the final result.