rroonga 5.0.1 → 5.0.2

@@ -1,145 +0,0 @@
1
- h1. Install
2
-
3
- This document describes how to install Rroonga.
4
-
5
- You can install Rroonga by RubyGems. It is the standard way for Ruby
6
- libraries.
7
-
8
- Rroonga depends on Groonga, so you need to install both Groonga and
9
- Rroonga. You can't install Groonga with RubyGems because it isn't a Ruby
10
- library. But don't worry. Rroonga provides the following options for
11
- easy installation:
12
-
13
- * Rroonga downloads, builds and installs Groonga automatically. You
14
- don't need to do them explicitly.
15
- * Rroonga uses Groonga installed by your packaging system.
16
-
17
- The following sections describe the above in detail.
18
-
19
- h2. Install with auto Groonga build
20
-
21
- Rroonga searches for Groonga on install. If Rroonga can't find
22
- Groonga, Rroonga downloads, builds and installs Groonga
23
- automatically.
24
-
25
- Type the following command to install Rroonga and Groonga. You don't
26
- need to install Groonga explicitly:
27
-
28
- <pre>
29
- !!!command_line
30
- % gem install rroonga
31
- </pre>
32
-
33
- h2. Install with Groonga package
34
-
35
- You can use the Groonga package from your packaging system instead of
36
- building Groonga yourself. This option has the following
37
- advantages:
38
-
39
- * It reduces installation time.
40
- * It avoids failures while building Groonga.
41
-
42
- h3. Windows
43
-
44
- The Rroonga gem for Windows includes both pre-compiled Rroonga and Groonga,
45
- so all you need to do is install the rroonga gem.
46
-
47
- Type the following command in the Ruby console:
48
-
49
- <pre>
50
- !!!command_line
51
- > gem install rroonga
52
- </pre>
53
-
54
- This document assumes that you're using "RubyInstaller for
55
- Windows":http://rubyinstaller.org/ .
56
-
57
- h3. OS X
58
-
59
- There are Groonga packages for the OS X environment.
60
-
61
- h4. MacPorts
62
-
63
- If you're using "MacPorts":http://www.macports.org/ , type the
64
- following commands on your terminal:
65
-
66
- <pre>
67
- !!!command_line
68
- % sudo port install groonga
69
- % sudo gem install rroonga
70
- </pre>
71
-
72
- h4. Homebrew
73
-
74
- If you're using "Homebrew":http://brew.sh/ , type the
75
- following commands on your terminal:
76
-
77
- <pre>
78
- !!!command_line
79
- % brew install groonga
80
- % gem install rroonga
81
- </pre>
82
-
83
- h3. Debian GNU/Linux
84
-
85
- You can install the Groonga package with apt. See the "Groonga
86
- documentation":http://groonga.org/docs/install/debian.html for how to
87
- set up the apt-line.
88
-
89
- Type the following commands on your terminal after you finish setting up
90
- the apt-line:
91
-
92
- <pre>
93
- !!!command_line
94
- % sudo apt-get install -y libgroonga-dev
95
- % sudo gem install rroonga
96
- </pre>
97
-
98
- h3. Ubuntu
99
-
100
- You can install the Groonga package with apt. See the "Groonga
101
- documentation":http://groonga.org/docs/install/ubuntu.html for how to
102
- set up the apt-line.
103
-
104
- Type the following commands on your terminal after you finish setting up
105
- the apt-line:
106
-
107
- <pre>
108
- !!!command_line
109
- % sudo apt-get install -y libgroonga-dev
110
- % sudo gem install rroonga
111
- </pre>
112
-
113
- h3. CentOS
114
-
115
- You can install the Groonga package with yum. See the "Groonga
116
- documentation":http://groonga.org/docs/install/centos.html for how to
117
- set up the yum repository.
118
-
119
- However, you need to install Ruby 1.9.3 or later yourself. Both CentOS 5
120
- and 6 ship Ruby 1.8, which Rroonga doesn't support.
121
-
122
- Type the following commands on your terminal after you finish setting up
123
- the yum repository and installing Ruby 1.9.3 or later:
124
-
125
- <pre>
126
- !!!command_line
127
- % sudo yum install groonga-devel -y
128
- % gem install rroonga
129
- </pre>
130
-
131
- h3. Fedora
132
-
133
- You can install the Groonga package by yum. The Groonga package is
134
- included in the official Fedora repository.
135
-
136
- <pre>
137
- !!!command_line
138
- % sudo yum install groonga-devel -y
139
- % sudo gem install rroonga
140
- </pre>
141
-
142
- h2. Links
143
-
144
- * "2. Install - Groonga documentation":http://groonga.org/docs/install.html
145
-
@@ -1,510 +0,0 @@
1
- h1. Tutorial
2
-
3
- This page introduces how to use Rroonga by building a simple application.
4
-
5
- h2. Install
6
-
7
- You can install Rroonga on your computer with RubyGems.
8
-
9
- <pre>
10
- !!!command_line
11
- % sudo gem install rroonga
12
- </pre>
13
-
14
- h2. Create Database
15
-
16
- Let's create a database for a simple bookmark application.
17
- Start irb with Rroonga loaded by running this command:
18
-
19
- <pre>
20
- !!!irb
21
- % irb --simple-prompt -r groonga
22
- >>
23
- </pre>
24
-
25
- Now, set UTF-8 as the encoding of the database.
26
-
27
- <pre>
28
- !!!irb
29
- >> Groonga::Context.default_options = {:encoding => :utf8}
30
- => {:encoding=>:utf8}
31
- </pre>
32
-
33
- Then, create a database in a file.
34
-
35
- <pre>
36
- !!!irb
37
- >> Groonga::Database.create(:path => "/tmp/bookmark.db")
38
- => #<Groonga::Database ...>
39
- </pre>
40
-
41
- From now on, the created database is used implicitly.
42
- You don't have to be aware of it once you have created a database.
43
-
44
- h2. Define table
45
-
46
- Groonga supports 4 types of tables.
47
-
48
- - Groonga::Hash :=
49
- Hash table. It manages records by primary key. It supports
50
- very fast exact match search.
51
- =:
52
-
53
- - Groonga::PatriciaTrie :=
54
- Patricia Trie. It supports searches such as predictive search and
54
- common prefix search, but its exact match search is a little slower
55
- than that of Groonga::Hash. It provides a cursor to take records in ascending
56
- or descending order.
58
- =:
59
-
60
- - Groonga::DoubleArrayTrie :=
61
- Double Array Trie. It requires more space than the other
61
- tables, but it can update a key without changing its ID. It provides exact
62
- match search, predictive search, common prefix search and a cursor,
63
- like Groonga::PatriciaTrie.
65
- =:
66
-
67
- - Groonga::Array :=
68
- Array. It doesn't have primary keys. It manages records by ID.
69
- =:
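As a rough illustration of the search styles named above, here is a plain-Ruby sketch that does not use Rroonga at all. The helper names (@exact_match?@, @predictive_search@, @common_prefix_search@) and the sample keys are hypothetical, and the linear scans only model which keys match, not how Groonga's tables find them efficiently.

```ruby
# Plain-Ruby model of the search styles. Helper names are hypothetical
# and are not part of the Rroonga API.
KEYS = ["rub", "ruby", "rubygems", "groonga"]

# Exact match search: is the query itself a key?
def exact_match?(keys, query)
  keys.include?(query)
end

# Predictive search: keys that start with the query.
def predictive_search(keys, query)
  keys.select { |key| key.start_with?(query) }
end

# Common prefix search: keys that are a prefix of the query.
def common_prefix_search(keys, query)
  keys.select { |key| query.start_with?(key) }
end

p exact_match?(KEYS, "ruby")
p predictive_search(KEYS, "ruby")
p common_prefix_search(KEYS, "rubygems")
```

Predictive search starts from a short query and finds longer keys, while common prefix search starts from a long query and finds the shorter keys it begins with.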
70
-
71
- Now, use Groonga::Hash to create a table named @Items@. The type
72
- of its primary key is String.
73
-
74
- <pre>
75
- !!!irb
76
- >> Groonga::Schema.create_table("Items", :type => :hash)
77
- => [...]
78
- </pre>
79
-
80
- This code creates the @Items@ table.
80
- You can refer to the defined table with Groonga.[] like below:
82
-
83
- <pre>
84
- !!!irb
85
- >> items = Groonga["Items"]
86
- => #<Groonga::Hash ...>
87
- </pre>
88
-
89
- You can treat it like a Hash.
90
- For example, let's type @items.size@ to get the number of records in
91
- the table.
92
-
93
- <pre>
94
- !!!irb
95
- >> items.size
96
- => 0
97
- </pre>
98
-
99
- h2. Add records
100
-
101
- Let's add records to @Items@ table.
102
-
103
- <pre>
104
- !!!irb
105
- >> items.add("http://en.wikipedia.org/wiki/Ruby")
106
- => #<Groonga::Record ...>
107
- >> items.add("http://www.ruby-lang.org/")
108
- => #<Groonga::Record ...>
109
- </pre>
110
-
111
- Please check the number of records. It increases from 0 to 2.
112
-
113
- <pre>
114
- !!!irb
115
- >> items.size
116
- => 2
117
- </pre>
118
-
119
- To get a record by its primary key, type the following:
120
-
121
- <pre>
122
- !!!irb
123
- >> items["http://en.wikipedia.org/wiki/Ruby"]
124
- => #<Groonga::Record ...>
125
- </pre>
126
-
127
- h2. Full text search
128
-
129
- Let's make each item's title searchable with full text search.
130
-
131
- First, add the @Text@ type column "@title@" to the @Items@ table.
132
-
133
- <pre>
134
- !!!irb
135
- >> Groonga::Schema.change_table("Items") do |table|
136
- ?> table.text("title")
137
- >> end
138
- => [...]
139
- </pre>
140
-
141
- Defined columns are named @#{TABLE_NAME}.#{COLUMN_NAME}@.
141
- You can refer to them with {Groonga.[]} in the same way as tables.
143
-
144
- <pre>
145
- !!!irb
146
- >> title_column = Groonga["Items.title"]
147
- => #<Groonga::VariableSizeColumn ...>
148
- </pre>
149
-
150
-
151
- Second, let's add a table that contains the terms split from texts.
151
- Define the @Terms@ table for it.
153
-
154
- <pre>
155
- !!!irb
156
- >> Groonga::Schema.create_table("Terms",
157
- ?> :type => :patricia_trie,
158
- ?> :normalizer => :NormalizerAuto,
159
- ?> :default_tokenizer => "TokenBigram")
160
- </pre>
161
-
162
- You specify @:default_tokenizer => "TokenBigram"@ for "Tokenizer" in
163
- the above code.
164
- "Tokenizer" is the object to split terms from texts. The default value
165
- for it is none.
166
- Full text search requires a tokenizer, so you specify "Bigram", a type
167
- of N-gram.
168
- Full text search with N-gram uses terms of N characters split from texts and their
168
- positions in the texts. "N" in N-gram specifies the number of characters in each term.
170
- Groonga supports Unigram (N=1), Bigram (N=2) and Trigram (N=3).
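To make the idea of N-gram terms concrete, here is a minimal plain-Ruby sketch that splits a text into overlapping terms of N characters and records each term's position. The method name @ngram_terms@ is hypothetical and this is only a conceptual model: Groonga's real tokenizers apply additional rules that are not reproduced here.

```ruby
# Conceptual model of N-gram splitting. ngram_terms is a hypothetical
# helper, not a Rroonga or Groonga API.
def ngram_terms(text, n)
  chars = text.chars
  # A text shorter than N characters yields no N-character terms here.
  return [] if chars.size < n
  # Slide a window of N characters over the text, one character at a time,
  # and keep the start position of each window.
  (0..chars.size - n).map do |position|
    [chars[position, n].join, position]
  end
end

p ngram_terms("ruby", 1) # Unigram (N=1)
p ngram_terms("ruby", 2) # Bigram (N=2)
p ngram_terms("ruby", 3) # Trigram (N=3)
```

Keeping the position next to each term is what lets a search engine check that the terms of a query appear next to each other in a matching text.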
171
-
172
- You also specify @:normalizer => :NormalizerAuto@ to search texts
172
- case-insensitively.
174
-
175
- Now that the table for terms is ready, define the index of the
175
- @Items.title@ column.
177
-
178
- <pre>
179
- !!!irb
180
- >> Groonga::Schema.change_table("Terms") do |table|
181
- ?> table.index("Items.title")
182
- >> end
183
- => [...]
184
- </pre>
185
-
186
- This code may look a little strange: the index of the @Items@ table's
186
- column is defined as a column in @Terms@.
188
-
189
- When a record is added to @Items@, Groonga automatically adds records for
189
- each term in it to @Terms@.
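The relationship between a term table and the records it indexes can be modeled with an inverted index. The following plain-Ruby sketch is an assumption-laden illustration, not Groonga's implementation: the method name @add_to_index@ is hypothetical, and terms are simply lowercased words rather than the output of a real tokenizer.

```ruby
# Toy inverted index: each term maps to the keys of the records that
# contain it. add_to_index is a hypothetical helper, not a Rroonga API.
def add_to_index(index, key, text)
  # Lowercase the text and split it into words; real tokenizers differ.
  text.downcase.scan(/\w+/).uniq.each do |term|
    index[term] << key
  end
end

# Unknown terms start with an empty list of record keys.
index = Hash.new { |hash, term| hash[term] = [] }

add_to_index(index, "http://en.wikipedia.org/wiki/Ruby", "Ruby")
add_to_index(index, "http://www.ruby-lang.org/", "Ruby Programming Language")

p index["ruby"]
p index["language"]
```

Looking up a term then returns the matching record keys directly, without scanning every title.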
191
-
192
-
193
- @Terms@ is a somewhat special table, but you can add columns to a term
193
- table such as @Terms@ and manage attributes of each term. This is
194
- very useful for specialized searches.
196
-
197
- Now you have finished the table definition.
197
- Let's set the @title@ of each record you added before.
199
-
200
- <pre>
201
- !!!irb
202
- >> items["http://en.wikipedia.org/wiki/Ruby"].title = "Ruby"
203
- => "Ruby"
204
- >> items["http://www.ruby-lang.org/"].title = "Ruby Programming Language"
205
- => "Ruby Programming Language"
206
- </pre>
207
-
208
- Now, you can do full text search like below:
209
-
210
- <pre>
211
- !!!irb
212
- >> ruby_items = items.select {|record| record.title =~ "Ruby"}
213
- => #<Groonga::Hash ..., normalizer: (nil)>
214
- </pre>
215
-
216
- Groonga returns the search result as a Groonga::Hash.
216
- The keys in this hash table are the matched records of @Items@.
218
-
219
- <pre>
220
- !!!irb
221
- >> ruby_items.collect {|record| record.key.key}
222
- => ["http://en.wikipedia.org/wiki/Ruby", "http://www.ruby-lang.org/"]
223
- </pre>
224
-
225
- In the above example, you get records in @Items@ with @record.key@, and
226
- keys of them with @record.key.key@.
227
-
228
- You can access the referenced key of a record more briefly with @record["_key"]@.
229
-
230
- <pre>
231
- !!!irb
232
- >> ruby_items.collect {|record| record["_key"]}
233
- => ["http://en.wikipedia.org/wiki/Ruby", "http://www.ruby-lang.org/"]
234
- </pre>
235
-
236
- h2. Improve the simple bookmark application
237
-
238
- Let's improve this simple application a little. You can make it a
238
- bookmark application for multiple users, who can comment on each
239
- bookmark.
241
-
242
- First, add tables for users and for comments like below:
243
-
244
- !http://qwik.jp/senna/senna2.files/rect4605.png!
245
-
246
- Let's add the table for users, @Users@.
247
-
248
- <pre>
249
- !!!irb
250
- >> Groonga::Schema.create_table("Users", :type => :hash) do |table|
251
- ?> table.text("name")
252
- >> end
253
- => [...]
254
- </pre>
255
-
256
-
257
- Next, let's add the table for comments as @Comments@.
258
-
259
- <pre>
260
- !!!irb
261
- >> Groonga::Schema.create_table("Comments") do |table|
262
- ?> table.reference("item")
263
- >> table.reference("author", "Users")
264
- >> table.text("content")
265
- >> table.time("issued")
266
- >> end
267
- => [...]
268
- </pre>
269
-
270
- Then you define the index of @content@ column in @Comments@ for full
271
- text search.
272
-
273
- <pre>
274
- !!!irb
275
- >> Groonga::Schema.change_table("Terms") do |table|
276
- ?> table.index("Comments.content")
277
- >> end
278
- => [...]
279
- </pre>
280
-
281
- The above code completes the table definition.
282
-
283
- Next, add some users to @Users@.
284
-
285
- <pre>
286
- !!!irb
287
- >> users = Groonga["Users"]
288
- => #<Groonga::Hash ...>
289
- >> users.add("alice", :name => "Alice")
290
- => #<Groonga::Record ...>
291
- >> users.add("bob", :name => "Bob")
292
- => #<Groonga::Record ...>
293
- </pre>
294
-
295
- Now, let's write the process for a user to bookmark a page.
295
- Assume that the user, @alice@, bookmarks a page that includes
296
- information related to Ruby.
298
-
299
- First, check whether the page has already been added to @Items@.
300
-
301
- <pre>
302
- !!!irb
303
- >> items.has_key?("http://www.ruby-doc.org/")
304
- => false
305
- </pre>
306
-
307
- The page hasn't been added, so you add it to @Items@.
308
-
309
- <pre>
310
- !!!irb
311
- >> items.add("http://www.ruby-doc.org/",
312
- ?> :title => "Ruby-Doc.org: Documenting the Ruby Language")
313
- => #<Groonga::Record ...>
314
- </pre>
315
-
316
- Next, add a record to @Comments@. This record references the page
316
- in its @item@ column.
318
-
319
- <pre>
320
- !!!irb
321
- >> require "time"
322
- => true
323
- >> comments = Groonga["Comments"]
324
- => #<Groonga::Array ...>
325
- >> comments.add(:item => "http://www.ruby-doc.org/",
326
- ?> :author => "alice",
327
- ?> :content => "Ruby documents",
328
- ?> :issued => Time.parse("2010-11-20T18:01:22+09:00"))
329
- => #<Groonga::Record ...>
330
- </pre>
331
-
332
- h2. Define methods for this process
333
-
334
- For convenience, define a method for the above process.
335
-
336
- <pre>
337
- !!!irb
338
- >> @items = items
339
- => #<Groonga::Hash ...>
340
- >> @comments = comments
341
- => #<Groonga::Array ...>
342
- >> def add_bookmark(url, title, author, content, issued)
343
- >> item = @items[url] || @items.add(url, :title => title)
344
- >> @comments.add(:item => item,
345
- ?> :author => author,
346
- ?> :content => content,
347
- ?> :issued => issued)
348
- >> end
349
- => nil
350
- </pre>
351
-
352
- You assign @items@ and @comments@ to instance variables so that you can
352
- use them in the @add_bookmark@ method.
354
-
355
- @add_bookmark@ performs the following steps:
356
-
357
- * Check whether a record for the page exists in the @Items@ table.
357
- * If not, add the record to it.
358
- * Add a record to the @Comments@ table.
360
-
361
- With this method, let's bookmark some pages.
362
-
363
- <pre>
364
- !!!irb
365
- >> add_bookmark("https://rubygems.org/",
366
- ?> "RubyGems.org | your community gem host", "alice", "Ruby gems",
367
- ?> Time.parse("2010-10-07T14:18:28+09:00"))
368
- => #<Groonga::Record ...>
369
- >> add_bookmark("http://ranguba.org/",
370
- ?> "Fulltext search by Ruby with groonga - Ranguba", "bob",
371
- ?> "Ruby groonga fulltextsearch",
372
- ?> Time.parse("2010-11-11T12:39:59+09:00"))
373
- => #<Groonga::Record ...>
374
- >> add_bookmark("http://www.ruby-doc.org/",
375
- ?> "ruby-doc", "bob", "ruby documents",
376
- ?> Time.parse("2010-07-28T20:46:23+09:00"))
377
- => #<Groonga::Record ...>
378
- </pre>
379
-
380
- h2. Full text search part 2
381
-
382
- Let's do full text search for added records.
383
-
384
- <pre>
385
- !!!irb
386
- >> records = comments.select do |record|
387
- ?> record["content"] =~ "Ruby"
388
- >> end
389
- => #<Groonga::Hash ...>
390
- >> records.each do |record|
391
- ?> comment = record
392
- >> p [comment.id,
393
- ?> comment.issued,
394
- ?> comment.item.title,
395
- ?> comment.author.name,
396
- ?> comment.content]
397
- >> end
398
- [1, 2010-11-20 18:01:22 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Alice", "Ruby documents"]
399
- [2, 2010-10-07 14:18:28 +0900, "RubyGems.org | your community gem host", "Alice", "Ruby gems"]
400
- [3, 2010-11-11 12:39:59 +0900, "Fulltext search by Ruby with groonga - Ranguba", "Bob", "Ruby groonga fulltextsearch"]
401
- [4, 2010-07-28 20:46:23 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Bob", "ruby documents"]
402
- </pre>
403
-
404
- You can access each column with a method of the same name.
404
- These methods also support access to complex data types.
405
- (In a typical RDB, you would have to JOIN the @Items@, @Comments@ and
406
- @Users@ tables.)
408
-
409
- The search is finished when the first statement in this code is executed. The
409
- result of this search is an object that represents a set of records.
411
-
412
- <pre>
413
- !!!irb
414
- >> records
415
- => #<Groonga::Hash ..., size: <4>>
416
- </pre>
417
-
418
- You can arrange this record set before output.
418
- For example, sort these records in descending order by date.
420
-
421
- <pre>
422
- !!!irb
423
- >> records.sort([{:key => "issued", :order => "descending"}]).each do |record|
424
- ?> comment = record
425
- >> p [comment.id,
426
- ?> comment.issued,
427
- ?> comment.item.title,
428
- ?> comment.author.name,
429
- ?> comment.content]
430
- >> end
431
- [1, 2010-11-20 18:01:22 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Alice", "Ruby documents"]
432
- [2, 2010-11-11 12:39:59 +0900, "Fulltext search by Ruby with groonga - Ranguba", "Bob", "Ruby groonga fulltextsearch"]
433
- [3, 2010-10-07 14:18:28 +0900, "RubyGems.org | your community gem host", "Alice", "Ruby gems"]
434
- [4, 2010-07-28 20:46:23 +0900, "Ruby-Doc.org: Documenting the Ruby Language", "Bob", "ruby documents"]
435
- => [...]
436
- </pre>
437
-
438
- Let's group the result by item for easier viewing.
439
-
440
- <pre>
441
- !!!irb
442
- >> records.group("item").each do |record|
443
- ?> item = record.key
444
- >> p [record.n_sub_records,
445
- ?> item.key,
446
- ?> item.title]
447
- >> end
448
- [2, "http://www.ruby-doc.org/", "Ruby-Doc.org: Documenting the Ruby Language"]
449
- [1, "https://rubygems.org/", "RubyGems.org | your community gem host"]
450
- [1, "http://ranguba.org/", "Fulltext search by Ruby with groonga - Ranguba"]
451
- => nil
452
- </pre>
453
-
454
- @n_sub_records@ is the number of records in each group.
454
- It is similar to the value of the count() function in a query that includes
455
- "GROUP BY" in SQL.
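The grouping idea can be pictured in plain Ruby as counting how many comments point at each item. This is only a sketch of the concept under stated assumptions: the sample data mirrors the bookmarks added earlier, the variable names are hypothetical, and no Rroonga API is involved.

```ruby
# Plain-Ruby model of grouping comments by item and counting each group.
# The data layout is a hypothetical stand-in for the Comments table.
comments = [
  {:item => "http://www.ruby-doc.org/", :content => "Ruby documents"},
  {:item => "https://rubygems.org/",    :content => "Ruby gems"},
  {:item => "http://ranguba.org/",      :content => "Ruby groonga fulltextsearch"},
  {:item => "http://www.ruby-doc.org/", :content => "ruby documents"},
]

# Count the comments per item, like count() with GROUP BY in SQL.
counts = comments.each_with_object(Hash.new(0)) do |comment, result|
  result[comment[:item]] += 1
end

counts.each do |item, n_records|
  p [n_records, item]
end
```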
457
-
458
- h2. More complex search
459
-
460
- Now, let's try a more useful search.
461
-
462
- You should explicitly calculate how well each record fits the search.
463
-
464
- You can use @Items.title@ and @Comments.content@ as search targets now.
465
- @Items.title@ is fairly reliable information taken from each
465
- original page. On the other hand, @Comments.content@ is less
466
- reliable information because it depends on the users of the bookmark
467
- application.
469
-
470
- So, search records with this policy:
471
-
472
- * Search for items that match in @Items.title@ or @Comments.content@.
472
- * Give matches in @Items.title@ 10 times the score weight of
473
- matches in @Comments.content@.
474
- * If multiple comments on one item match the keyword, use the sum
475
- of the scores of those comments as the score of the item.
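The policy above can be modeled in plain Ruby without Rroonga. This sketch assumes, purely for illustration, that a score is the number of whole-word occurrences of the keyword; Groonga computes its real scores from the index, and the names @TITLE_WEIGHT@ and @count_word@ are hypothetical.

```ruby
# Plain-Ruby model of the scoring policy. TITLE_WEIGHT and count_word
# are hypothetical; Groonga computes real scores from its index.
TITLE_WEIGHT = 10

# Assumption: count case-insensitive whole-word occurrences of the keyword.
def count_word(text, word)
  text.downcase.scan(/\b#{Regexp.escape(word.downcase)}\b/).size
end

titles = {
  "http://en.wikipedia.org/wiki/Ruby" => "Ruby",
  "http://www.ruby-lang.org/" => "Ruby Programming Language",
  "http://www.ruby-doc.org/" => "Ruby-Doc.org: Documenting the Ruby Language",
  "https://rubygems.org/" => "RubyGems.org | your community gem host",
  "http://ranguba.org/" => "Fulltext search by Ruby with groonga - Ranguba",
}
comments = [
  ["http://www.ruby-doc.org/", "Ruby documents"],
  ["https://rubygems.org/", "Ruby gems"],
  ["http://ranguba.org/", "Ruby groonga fulltextsearch"],
  ["http://www.ruby-doc.org/", "ruby documents"],
]

scores = Hash.new(0)
# Title matches weigh 10 times as much as comment matches.
titles.each do |url, title|
  scores[url] += TITLE_WEIGHT * count_word(title, "Ruby")
end
# Scores of all matching comments on one item are summed into the item.
comments.each do |url, content|
  scores[url] += count_word(content, "Ruby")
end

# Keep only matched items and list them from highest to lowest score.
ranking = scores.select { |_url, score| score > 0 }
ranking = ranking.sort_by { |_url, score| -score }
ranking.each do |url, score|
  p [score, titles[url]]
end
```

Summing per item is what lets an item with several matching comments rank above an item with a single one, while the title weight keeps title matches dominant.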
477
-
478
- Following this policy, type the code below:
479
-
480
- <pre>
481
- !!!irb
482
- >> ruby_comments = @comments.select {|record| record.content =~ "Ruby"}
483
- => #<Groonga::Hash ..., size: <4>>
484
- >> ruby_items = @items.select do |record|
485
- ?> target = record.match_target do |match_record|
486
- ?> match_record.title * 10
487
- >> end
488
- >> target =~ "Ruby"
489
- >> end
490
- => #<Groonga::Hash ..., size: <4>>
491
- </pre>
492
-
493
- Group the results of _ruby_comments_ by item and union them with
493
- _ruby_items_ .
495
-
496
- <pre>
497
- !!!irb
498
- >> ruby_items = ruby_comments.group("item").union!(ruby_items)
499
- => #<Groonga::Hash ..., size: <5>>
500
- >> ruby_items.sort([{:key => "_score", :order => "descending"}]).each do |record|
501
- >> p [record.score, record.title]
502
- >> end
503
- [22, "Ruby-Doc.org: Documenting the Ruby Language"]
504
- [11, "Fulltext search by Ruby with groonga - Ranguba"]
505
- [10, "Ruby Programming Language"]
506
- [10, "Ruby"]
507
- [1, "RubyGems.org | your community gem host"]
508
- </pre>
509
-
510
- Then, you get the result.