rabbit-slide-kou-pgconf-asia-2017 2017.12.5.0
- checksums.yaml +7 -0
- data/.rabbit +1 -0
- data/README.rd +45 -0
- data/Rakefile +17 -0
- data/config.yaml +27 -0
- data/images/chupa-text-web-ui-extract-metadata.png +0 -0
- data/images/chupa-text-web-ui-extract-text-and-screenshot.png +0 -0
- data/images/chupa-text-web-ui-form.png +0 -0
- data/images/php-document-search-search.png +0 -0
- data/images/php-document-search.png +0 -0
- data/pdf/pgconf-asia-2017-pgroonga-2.pdf +0 -0
- data/pgroonga-2.rab +1048 -0
- data/theme.rb +3 -0
- metadata +88 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA1:
|
3
|
+
metadata.gz: de0d4303fea6331a03b9f01805dff512058c84c2
|
4
|
+
data.tar.gz: 93bd721275f432b353a8889560bf3479e0040661
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: 8818106c63119ef4092dae199fdbc24bdcc853d9d9f9980590cfbdf446c1816f68d5f850b5f9ac5dcc1594d56731a4da1d1ed5f3a5f0d10746b00823d4dda3d4
|
7
|
+
data.tar.gz: 2a80eb664311c78e8a16d0823f5424e946cb3ae35056e49d34586efaa182432a3eaee6ddd9f0625358513f741989c17685477fca771593fe169a6547932b27c0
|
data/.rabbit
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
pgroonga-2.rab
|
data/README.rd
ADDED
@@ -0,0 +1,45 @@
|
|
1
|
+
= PGroonga 2 – Make PostgreSQL rich full text search system backend!
|
2
|
+
|
3
|
+
PGroonga 2.0 has been released after 2 years of development since PGroonga 1.0.0. PGroonga 1.0.0 only provided fast full text search with support for all languages. That matters because PostgreSQL lacks this feature. PGroonga 2.0 provides more features that are useful for implementing a rich full text search system with PostgreSQL. This session shows how to implement a rich full text search system with PostgreSQL!
|
4
|
+
|
5
|
+
This talk describes how PGroonga resolves these problems.
|
6
|
+
|
7
|
+
== License
|
8
|
+
|
9
|
+
=== Slide
|
10
|
+
|
11
|
+
CC BY-SA 4.0
|
12
|
+
|
13
|
+
Use the following for author attribution:
|
14
|
+
|
15
|
+
* Kouhei Sutou
|
16
|
+
|
17
|
+
=== Images
|
18
|
+
|
19
|
+
==== Groonga and PGroonga logos
|
20
|
+
|
21
|
+
CC BY 3.0
|
22
|
+
|
23
|
+
Author: The Groonga Project
|
24
|
+
|
25
|
+
They are used in the page header and on some pages of the slide.
|
26
|
+
|
27
|
+
== For author
|
28
|
+
|
29
|
+
=== Show
|
30
|
+
|
31
|
+
rake
|
32
|
+
|
33
|
+
=== Publish
|
34
|
+
|
35
|
+
rake publish
|
36
|
+
|
37
|
+
== For viewers
|
38
|
+
|
39
|
+
=== Install
|
40
|
+
|
41
|
+
gem install rabbit-slide-kou-pgconf-asia-2017
|
42
|
+
|
43
|
+
=== Show
|
44
|
+
|
45
|
+
rabbit rabbit-slide-kou-pgconf-asia-2017.gem
|
data/Rakefile
ADDED
@@ -0,0 +1,17 @@
|
|
1
|
+
require "rabbit/task/slide"
|
2
|
+
|
3
|
+
# Edit ./config.yaml to customize meta data
|
4
|
+
|
5
|
+
spec = nil
|
6
|
+
Rabbit::Task::Slide.new do |task|
|
7
|
+
spec = task.spec
|
8
|
+
# spec.files += Dir.glob("doc/**/*.*")
|
9
|
+
# spec.files -= Dir.glob("private/**/*.*")
|
10
|
+
spec.add_runtime_dependency("rabbit-theme-groonga")
|
11
|
+
end
|
12
|
+
|
13
|
+
desc "Tag #{spec.version}"
|
14
|
+
task :tag do
|
15
|
+
sh("git", "tag", "-a", spec.version.to_s, "-m", "Publish #{spec.version}")
|
16
|
+
sh("git", "push", "--tags")
|
17
|
+
end
|
data/config.yaml
ADDED
@@ -0,0 +1,27 @@
|
|
1
|
+
---
|
2
|
+
id: pgconf-asia-2017
|
3
|
+
base_name: pgroonga-2
|
4
|
+
tags:
|
5
|
+
- rabbit
|
6
|
+
- postgresql
|
7
|
+
- pgroonga
|
8
|
+
- pgconfasia
|
9
|
+
presentation_date: '2017-12-05'
|
10
|
+
presentation_start_time: '2017-12-05T16:20:00+09:00'
|
11
|
+
presentation_end_time: '2017-12-05T17:00:00+09:00'
|
12
|
+
version: 2017.12.5.0
|
13
|
+
licenses:
|
14
|
+
- CC-BY-SA-4.0
|
15
|
+
- CC-BY-3.0
|
16
|
+
slideshare_id: pgconfasia2017
|
17
|
+
speaker_deck_id:
|
18
|
+
ustream_id:
|
19
|
+
vimeo_id:
|
20
|
+
youtube_id:
|
21
|
+
author:
|
22
|
+
markup_language: :rd
|
23
|
+
name: Kouhei Sutou
|
24
|
+
email: kou@clear-code.com
|
25
|
+
rubygems_user: kou
|
26
|
+
slideshare_user: kou
|
27
|
+
speaker_deck_user:
|
Binary file
|
Binary file
|
Binary file
|
Binary file
|
Binary file
|
Binary file
|
data/pgroonga-2.rab
ADDED
@@ -0,0 +1,1048 @@
|
|
1
|
+
= PGroonga 2
|
2
|
+
|
3
|
+
: subtitle
|
4
|
+
Make PostgreSQL rich full text search system backend!
|
5
|
+
: author
|
6
|
+
Kouhei Sutou
|
7
|
+
: institution
|
8
|
+
ClearCode Inc.
|
9
|
+
: content-source
|
10
|
+
PGConf.ASIA 2017
|
11
|
+
: date
|
12
|
+
2017-12-05
|
13
|
+
: start-time
|
14
|
+
2017-12-05T16:20:00+09:00
|
15
|
+
: end-time
|
16
|
+
2017-12-05T17:00:00+09:00
|
17
|
+
: theme
|
18
|
+
.
|
19
|
+
|
20
|
+
= Targets\n(('note:対象者'))
|
21
|
+
|
22
|
+
* Want to implement full text search with PostgreSQL\n
|
23
|
+
(('note:PostgreSQLで全文検索したい'))
|
24
|
+
* Not good at full text search\n
|
25
|
+
(('note:全文検索はよく知らない'))
|
26
|
+
* PGroonga 1.0.0 users\n
|
27
|
+
(('note:PGroonga 1.0.0は使ったことがある'))
|
28
|
+
|
29
|
+
= Abbreviations\n(('note:略語'))
|
30
|
+
|
31
|
+
* PG: PostgreSQL\n
|
32
|
+
(('note:ポスグレ: PostgreSQL'))
|
33
|
+
* FTS: Full text search\n
|
34
|
+
(('note:FTS: 全文検索'))
|
35
|
+
|
36
|
+
= FTS system: Targets\n(('note:全文検索システム:対象'))
|
37
|
+
|
38
|
+
(('tag:center'))
|
39
|
+
(('tag:large'))
|
40
|
+
(('tag:margin-bottom * 2'))
|
41
|
+
Many texts\n
|
42
|
+
(('note:大量のテキスト'))
|
43
|
+
|
44
|
+
* e.g.: Text data in office docs in file servers\n
|
45
|
+
(('note:例:ファイルサーバー内のオフィス文書内のテキスト'))
|
46
|
+
* e.g.: Item descriptions, chat logs, Wiki data, ...\n
|
47
|
+
(('note:例:商品説明やチャットログ、Wikiのデータなど'))
|
48
|
+
|
49
|
+
= FTS system: Goal\n(('note:全文検索システム:目的'))
|
50
|
+
|
51
|
+
Provide\n
|
52
|
+
needed info\n
|
53
|
+
when you need\n
|
54
|
+
(('note:必要な情報を必要なときに提供すること'))
|
55
|
+
|
56
|
+
= Provide needed info\n(('note:必要な情報を提供'))
|
57
|
+
|
58
|
+
* 😞 Not found\n
|
59
|
+
(('note:探している情報が見つからない'))
|
60
|
+
* 😃 Found\n
|
61
|
+
(('note:探している情報が見つかる'))
|
62
|
+
* 😆 Found ((*unconsciously needed*)) info too!\n
|
63
|
+
(('note:意識していなかったけど実は欲しかった情報も見つかる!'))
|
64
|
+
|
65
|
+
= When you need\n(('note:必要なときに活用'))
|
66
|
+
|
67
|
+
* 😞 Takes a long time to find\n
|
68
|
+
(('note:なかなか見つからない'))
|
69
|
+
* 😃 Find in no time\n
|
70
|
+
(('note:すぐに見つかる'))
|
71
|
+
* 😆 Already found\n
|
72
|
+
(('note:すでに見つかっていた'))
|
73
|
+
* e.g.: Recommendation\n
|
74
|
+
(('note:例:レコメンデーション'))
|
75
|
+
|
76
|
+
= How to impl.: Options\n(('note:実装方法:選択肢'))
|
77
|
+
|
78
|
+
* Use FTS server\n
|
79
|
+
(('note:全文検索サーバーを使う'))
|
80
|
+
* Use PostgreSQL\n
|
81
|
+
(('note:PostgreSQLを使う'))
|
82
|
+
|
83
|
+
= FTS server: Pros\n(('note:全文検索サーバー案:メリット'))
|
84
|
+
|
85
|
+
* Provides all basic features\n
|
86
|
+
(('note:必要な機能が揃っている'))
|
87
|
+
* Provides advanced features\n
|
88
|
+
(('note:+αの機能もある'))
|
89
|
+
* Fast\n
|
90
|
+
(('note:速い'))
|
91
|
+
|
92
|
+
= FTS server: Cons1\n(('note:全文検索サーバー案:デメリット1'))
|
93
|
+
|
94
|
+
* Large implementation cost\n
|
95
|
+
(('note:実装コスト大'))
|
96
|
+
* Learn how to use from scratch\n
|
97
|
+
(('note:使い方を1から学ぶ必要がある'))
|
98
|
+
* How to implement data sync?\n
|
99
|
+
(('note:マスターデータの同期はどうする?'))
|
100
|
+
|
101
|
+
= FTS server: Cons2\n(('note:全文検索サーバー案:デメリット2'))
|
102
|
+
|
103
|
+
* Large maintenance cost\n
|
104
|
+
(('note:メンテナンスコスト大'))
|
105
|
+
* Learn how to operate from scratch\n
|
106
|
+
(('note:運用方法を1から学ぶ必要がある'))
|
107
|
+
|
108
|
+
= PostgreSQL: Pros1\n(('note:PostgreSQL案:メリット1'))
|
109
|
+
|
110
|
+
* Less implementation cost\n
|
111
|
+
(('note:実装コスト小'))
|
112
|
+
* Fewer things to learn\n
|
113
|
+
(('note:新しく覚えることが少ない'))
|
114
|
+
* Can manage data at the same place\n
|
115
|
+
(('note:データの一元管理'))
|
116
|
+
|
117
|
+
= PostgreSQL: Pros2\n(('note:PostgreSQL案:メリット2'))
|
118
|
+
|
119
|
+
* Less operation cost\n
|
120
|
+
(('note:メンテナンスコスト小'))
|
121
|
+
* The current operation knowledge is reusable\n
|
122
|
+
(('note:既存の運用ノウハウを使える'))
|
123
|
+
|
124
|
+
= PostgreSQL: Cons\n(('note:PostgreSQL案:デメリット'))
|
125
|
+
|
126
|
+
* Built-in features aren't enough\n
|
127
|
+
(('note:組込機能では機能不足'))
|
128
|
+
* SQL limits efficiency\n
|
129
|
+
(('note:SQLの表現力不足'))
|
130
|
+
* e.g.: SQL needs multiple queries for a process that can be done by 1 query by FTS server\n
|
131
|
+
(('note:例:全文検索サーバーなら1クエリーで実現できる処理にSQLだと複数クエリー必要なことがある'))
|
132
|
+
|
133
|
+
= The 3rd option\n(('note:第3の選択肢'))
|
134
|
+
|
135
|
+
* Use FTS engine via PostgreSQL (SQL)\n
|
136
|
+
(('note:PostgreSQL経由(SQL)で全文検索エンジンを使う'))
|
137
|
+
|
138
|
+
= Pros\n(('note:メリット'))
|
139
|
+
|
140
|
+
* Fast and rich features\n
|
141
|
+
(('note:高速で豊富な機能'))
|
142
|
+
* Less implementation cost\n
|
143
|
+
(('note:実装コスト小'))
|
144
|
+
* Less operation cost\n
|
145
|
+
(('note:メンテナンスコスト小'))
|
146
|
+
|
147
|
+
= Cons\n(('note:デメリット'))
|
148
|
+
|
149
|
+
* Need PostgreSQL extension\n
|
150
|
+
(('note:PostgreSQLに拡張機能が必要'))
|
151
|
+
* Not available on DBaaS\n
|
152
|
+
(('note:DBaaSで使えない'))
|
153
|
+
|
154
|
+
= Option: No FTS knowledge\n(('note:オススメの選択肢:全文検索の知識ナシ'))
|
155
|
+
|
156
|
+
* Need only simple features\n
|
157
|
+
(('note:まだ単純な機能で十分'))
|
158
|
+
* Less data: LIKE with PostgreSQL\n
|
159
|
+
(('note:データ少:PostgreSQLでLIKE'))
|
160
|
+
* Need up-to-date FTS features\n
|
161
|
+
(('note:いまどきの全文検索機能が必要'))
|
162
|
+
* FTS engine via PostgreSQL\n
|
163
|
+
(('note:PostgreSQL経由で全文検索エンジン'))
|
164
|
+
|
165
|
+
= Option: With FTS knowledge\n(('note:オススメの選択肢:全文検索の知識アリ'))
|
166
|
+
|
167
|
+
* Need tuned FTS feature\n
|
168
|
+
(('note:カリカリにチューニングしたい'))
|
169
|
+
* PostgreSQL + FTS server\n
|
170
|
+
(('note:PostgreSQL+全文検索サーバー'))
|
171
|
+
* Others\n
|
172
|
+
(('note:それ以外'))
|
173
|
+
* FTS engine via PostgreSQL\n
|
174
|
+
(('note:PostgreSQL経由で全文検索エンジン'))
|
175
|
+
|
176
|
+
= Described option\n(('note:説明する選択肢'))
|
177
|
+
|
178
|
+
FTS engine via\n
|
179
|
+
PostgreSQL\n
|
180
|
+
(('note:PostgreSQL経由で全文検索エンジン'))
|
181
|
+
|
182
|
+
= FTS engine: Groonga\n(('note:全文検索エンジン:Groonga(ぐるんが)'))
|
183
|
+
|
184
|
+
* Embeddable FTS engine\n
|
185
|
+
(('note:組込可能な全文検索エンジン'))
|
186
|
+
* PGroonga: Groonga in PostgreSQL\n
|
187
|
+
(('note:PGroonga:PostgreSQLに組込'))
|
188
|
+
* Usable as FTS server\n
|
189
|
+
(('note:全文検索サーバーとして単独でも使用可能'))
|
190
|
+
* PostgreSQL + FTS server architecture is also available\n
|
191
|
+
(('note:PostgreSQL+全文検索サーバー構成もできる'))
|
192
|
+
|
193
|
+
= Groonga's strength: data update\n(('note:Groongaの得意な事:データの追加・更新'))
|
194
|
+
|
195
|
+
* Make fresh data searchable!\n
|
196
|
+
(('note:新鮮な情報をすぐ検索可能!'))
|
197
|
+
* No need for batch updates\n
|
198
|
+
(('note:バッチで更新しなくてもよい'))
|
199
|
+
* Can use as chat backend\n
|
200
|
+
(('note:チャットくらいの頻度でもOK'))\n
|
201
|
+
e.g.: Zulip uses PGroonga\n
|
202
|
+
(('note:例:ZulipはPGroongaを採用'))
|
203
|
+
|
204
|
+
= Groonga's strength: data update\n(('note:Groongaの得意な事:データの追加・更新'))
|
205
|
+
|
206
|
+
* Keep search performance while updating!\n
|
207
|
+
(('note:更新中も検索性能が落ちない!'))
|
208
|
+
* Updatable when there are many search users\n
|
209
|
+
(('note:利用ユーザーが多い時でも更新可能'))
|
210
|
+
|
211
|
+
= PGroonga\n(('note:PGroonga(ぴーじーるんが)'))
|
212
|
+
|
213
|
+
* PostgreSQL index\n
|
214
|
+
(('note:PostgreSQLのインデックス'))
|
215
|
+
* Alternative to GIN, RUM, ...\n
|
216
|
+
(('note:GIN・RUMなどと同じレイヤー'))
|
217
|
+
* Usage\n
|
218
|
+
(('note:使用方法'))
|
219
|
+
* (({CREATE INDEX ...}))\n
|
220
|
+
(({USING PGroonga ...}))
|
221
|
+
|
222
|
+
= PostgreSQL and FTS\n(('note:PostgreSQLと全文検索'))
|
223
|
+
|
224
|
+
* LIKE: Built-in(('note:(組込機能)'))
|
225
|
+
* textsearch: Built-in(('note:(組込機能)'))
|
226
|
+
* pg_trgm: Contrib(('note:(標準添付)'))
|
227
|
+
* Bundled in the archive\n
|
228
|
+
(('note:アーカイブには含まれている'))
|
229
|
+
* Need to install separately\n
|
230
|
+
(('note:別途インストールすれば使える'))
|
231
|
+
|
232
|
+
= LIKE and performance\n(('note:LIKEと速度'))
|
233
|
+
|
234
|
+
* Small data\n
|
235
|
+
(('note:少ないデータ'))
|
236
|
+
* Enough performance\n
|
237
|
+
(('note:十分実用的'))
|
238
|
+
* Not small data\n
|
239
|
+
(('note:少なくないデータ'))
|
240
|
+
* Need to tune\n
|
241
|
+
(('note:性能問題アリ'))
|
242
|
+
|
243
|
+
= LIKE and FTS system\n(('note:LIKEと全文検索システム'))
|
244
|
+
|
245
|
+
👍Enough performance\n
|
246
|
+
in most cases\n
|
247
|
+
(('note:速度が実用的なことも多い'))
|
248
|
+
|
249
|
+
* Data is small in many cases\n
|
250
|
+
(('note:少ないデータなら'))
|
251
|
+
|
252
|
+
= LIKE and FTS system\n(('note:LIKEと全文検索システム'))
|
253
|
+
|
254
|
+
👎Unable to sort\n
|
255
|
+
(('note:それっぽい順のソート不可'))
|
256
|
+
|
257
|
+
* Sort is important in FTS\n
|
258
|
+
(('note:全文検索ではソート順が重要'))
|
259
|
+
* Users check only\n
|
260
|
+
the first N entries\n
|
261
|
+
(('note:ユーザーは先頭N件しか見ない'))
|
262
|
+
|
263
|
+
= textsearch
|
264
|
+
|
265
|
+
* 👍Fast search by index\n
|
266
|
+
(('note:インデックスを作るので速い'))
|
267
|
+
* Need module for each lang\n
|
268
|
+
(('note:言語毎にモジュールが必要'))
|
269
|
+
* 👍Modules for English, French, ... are built-in\n
|
270
|
+
(('note:英語やフランス語などは組込'))
|
271
|
+
* 👎Modules for languages in Asia aren't maintained\n
|
272
|
+
(('note:アジア圏の言語用のモジュールはメンテされていない'))
|
273
|
+
|
274
|
+
= pg_trgm
|
275
|
+
|
276
|
+
* 👍Fast search by index\n
|
277
|
+
(('note:インデックスを作るので速い'))
|
278
|
+
* 👎Asian languages aren't well supported\n
|
279
|
+
(('note:アジア圏の言語のサポートは十分ではない'))
|
280
|
+
* 👎Unable to sort\n
|
281
|
+
(('note:それっぽい順のソート不可'))
|
282
|
+
|
283
|
+
= RUM
|
284
|
+
|
285
|
+
* RUM = GIN + position\n
|
286
|
+
(('note:RUMは位置情報付きのGIN'))
|
287
|
+
* (('tag:xx-small'))
|
288
|
+
((<"https://github.com/postgrespro/rum"|URL:https://github.com/postgrespro/rum>))
|
289
|
+
* pg_trgm/pg_bigm are slow when there are many matches\n
|
290
|
+
(('note:pg_trgmとpg_bigmはマッチ数が多いと遅くなる'))
|
291
|
+
* RUM may solve it\n
|
292
|
+
(('note:GINの代わりにRUMを使うことで解決できるかも!'))
|
293
|
+
|
294
|
+
= PGroonga
|
295
|
+
|
296
|
+
* 👍Fast search by index\n
|
297
|
+
(('note:インデックスを作るので速い'))
|
298
|
+
* 👍Sortable\n
|
299
|
+
(('note:それっぽい順のソート可'))
|
300
|
+
* 👍Support all languages\n
|
301
|
+
(('note:全言語対応'))
|
302
|
+
* 👎Need to install separately\n
|
303
|
+
(('note:別途インストールする必要アリ'))
|
304
|
+
|
305
|
+
= FTS system with PostgreSQL\n(('note:PostgreSQLで全文検索システム'))
|
306
|
+
|
307
|
+
* PGroonga is the best!💯\n
|
308
|
+
(('note:PGroongaがベスト!'))
|
309
|
+
* PGroonga
|
310
|
+
* Fast(('note:(高速)'))
|
311
|
+
* Support all langs(('note:(全言語対応)'))
|
312
|
+
* Sortable(('note:(それっぽい順でソート可)'))
|
313
|
+
|
314
|
+
= FTS system: Basic features\n(('note:全文検索システム:基本機能'))
|
315
|
+
|
316
|
+
* Fast FTS + sort\n
|
317
|
+
(('note:高速全文検索+ソート'))
|
318
|
+
* Show texts around keyword\n
|
319
|
+
(('note:キーワード周辺テキスト表示'))
|
320
|
+
* Highlight keyword\n
|
321
|
+
(('note:検索キーワードハイライト'))
|
322
|
+
|
323
|
+
= FTS system: Adv. features\n(('note:全文検索システム:高度な機能'))
|
324
|
+
|
325
|
+
* Auto complete\n
|
326
|
+
(('note:オートコンプリート'))
|
327
|
+
* Similar search\n
|
328
|
+
(('note:類似文書検索'))
|
329
|
+
* Synonym expansion\n
|
330
|
+
(('note:同義語展開'))
|
331
|
+
|
332
|
+
= PGroonga 1.0.0
|
333
|
+
|
334
|
+
(('tag:center'))
|
335
|
+
Only ↓ are supported\n
|
336
|
+
(('note:以下の機能のみ対応'))
|
337
|
+
|
338
|
+
* Fast FTS + sort\n
|
339
|
+
(('note:高速全文検索+ソート'))
|
340
|
+
* Show texts around keyword\n
|
341
|
+
(('note:キーワード周辺テキスト表示'))
|
342
|
+
|
343
|
+
= PGroonga 2
|
344
|
+
|
345
|
+
All features are supported!\n
|
346
|
+
(('note:全機能対応!'))
|
347
|
+
|
348
|
+
= PGroonga 1.0.0 → 2
|
349
|
+
|
350
|
+
* 😆 Many new features\n
|
351
|
+
(('note:たくさんの新機能'))
|
352
|
+
* 😆 Improved performance\n
|
353
|
+
(('note:性能改善'))
|
354
|
+
* 😞 API is changed\n
|
355
|
+
(('note:APIが変わった'))
|
356
|
+
|
357
|
+
= API change\n(('note:API変更'))
|
358
|
+
|
359
|
+
(('tag:center'))
|
360
|
+
Operator is changed\n
|
361
|
+
(('note:演算子変更'))
|
362
|
+
|
363
|
+
@@ → &@~
|
364
|
+
%% → &@
|
365
|
+
...
|
366
|
+
|
367
|
+
= API change\n(('note:API変更'))
|
368
|
+
|
369
|
+
(('tag:center'))
|
370
|
+
(({pgroonga})) schema is deprecated\n
|
371
|
+
(('note:pgroongaスキーマを非推奨に'))
|
372
|
+
|
373
|
+
pgroonga.score → pgroonga_score
|
374
|
+
pgroonga.flush → pgroonga_flush
|
375
|
+
...
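The function renames on this slide can be applied mechanically to existing SQL strings. The following is a minimal Python sketch, not part of PGroonga: the helper name `migrate_function_names` is hypothetical, and it only covers the two renames the slide lists (`pgroonga.score` and `pgroonga.flush`), because the remaining renames are elided on the slide.

```python
import re

# Only the two renames shown on the slide are handled here.
# Other schema-qualified names are left untouched on purpose.
RENAMED_FUNCTIONS = ("score", "flush")
_PATTERN = re.compile(r"\bpgroonga\.(%s)\b" % "|".join(RENAMED_FUNCTIONS))


def migrate_function_names(sql):
    """Rewrite pgroonga.score/pgroonga.flush to pgroonga_score/pgroonga_flush."""
    return _PATTERN.sub(r"pgroonga_\1", sql)
```

Because PGroonga 2 keeps the 1 API, such a rewrite is optional until the old API is dropped.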
|
376
|
+
|
377
|
+
= App for PGroonga 1.0.0\n(('note:PGroonga 1.0.0用アプリ'))
|
378
|
+
|
379
|
+
* Broken with PGroonga 2?\n
|
380
|
+
(('note:PGroonga 2では動かない?'))
|
381
|
+
* No! It works without any changes!\n
|
382
|
+
(('note:何も変更しなくても動くよ!'))
|
383
|
+
|
384
|
+
(('tag:center'))
|
385
|
+
Great! But why?\n
|
386
|
+
(('note:いいじゃん!でもなんで動くの?'))\n
|
387
|
+
↓\n
|
388
|
+
"Painless upgrade" technique
|
389
|
+
|
390
|
+
= Painless upgrade
|
391
|
+
|
392
|
+
* PGroonga 2 provides\n
|
393
|
+
both 1 API and 2 API\n
|
394
|
+
(('note:PGroonga 2は1用のAPIも2用のAPIも両方提供'))
|
395
|
+
* Can use PGroonga 2 with 1 API\n
|
396
|
+
(('note:PGroonga 1のAPIでPGroonga 2を使える'))
|
397
|
+
|
398
|
+
= Painless upgrade
|
399
|
+
|
400
|
+
* The last PGroonga 1.X\n
|
401
|
+
provides both 1 API and\n
|
402
|
+
partially 2 API\n
|
403
|
+
(('note:PGroonga 1系の最終版は1用のAPIも2用のAPIの一部も提供'))
|
404
|
+
* Can use PGroonga 1 with 2 API\n
|
405
|
+
(('note:PGroonga 2のAPIでPGroonga 1を使える'))
|
406
|
+
|
407
|
+
= Painless upgrade
|
408
|
+
|
409
|
+
* PGroonga 2 keeps 1 API\n
|
410
|
+
(('note:PGroonga 2の間は1のAPIを維持'))
|
411
|
+
* PGroonga 3 will drop 1 API\n
|
412
|
+
(('note:PGroonga 3で1のAPIを削除予定'))
|
413
|
+
* Just need to upgrade API before 3\n
|
414
|
+
(('note:PGroonga 3までにAPIをアップグレードすればよい'))
|
415
|
+
|
416
|
+
= Painless upgrade
|
417
|
+
|
418
|
+
* App for PGroonga 1.0.0 doesn't work with PGroonga 2\n
|
419
|
+
(('note:PGroonga 1.0.0用のアプリがPGroonga 2で動かない'))
|
420
|
+
* It's a bug. Please report it!\n
|
421
|
+
(('note:バグなので報告してね!'))
|
422
|
+
|
423
|
+
= FTS system: Basic features\n(('note:全文検索システム:基本機能'))
|
424
|
+
|
425
|
+
* Fast FTS + sort\n
|
426
|
+
(('note:高速全文検索+ソート'))
|
427
|
+
* Show texts around keyword\n
|
428
|
+
(('note:キーワード周辺テキスト表示'))
|
429
|
+
* Highlight keyword\n
|
430
|
+
(('note:検索キーワードハイライト'))
|
431
|
+
|
432
|
+
= Fast FTS + sort\n(('note:高速全文検索+ソート'))
|
433
|
+
|
434
|
+
# image
|
435
|
+
# src = images/php-document-search-search.png
|
436
|
+
# relative_height = 100
|
437
|
+
|
438
|
+
= Table definition
|
439
|
+
|
440
|
+
# coderay sql
|
441
|
+
|
442
|
+
CREATE TABLE entries (
|
443
|
+
-- Need primary key
|
444
|
+
-- It's needed for sort
|
445
|
+
id integer PRIMARY KEY,
|
446
|
+
title text,
|
447
|
+
content text
|
448
|
+
);
|
449
|
+
|
450
|
+
= Index definition
|
451
|
+
|
452
|
+
# coderay sql
|
453
|
+
|
454
|
+
-- For FTS.
|
455
|
+
-- The default is good enough!
|
456
|
+
CREATE INDEX entries_full_text_search
|
457
|
+
ON entries
|
458
|
+
-- "USING PGroonga" is important!
|
459
|
+
-- Primary key is for sort!
|
460
|
+
USING PGroonga (id, title, content);
|
461
|
+
|
462
|
+
= Insert data
|
463
|
+
|
464
|
+
# coderay sql
|
465
|
+
|
466
|
+
-- Normal INSERT.
|
467
|
+
INSERT INTO entries
|
468
|
+
VALUES (1,
|
469
|
+
'Fast FTS with Groonga!',
|
470
|
+
'Fast FTS is needed!');
|
471
|
+
|
472
|
+
= FTS
|
473
|
+
|
474
|
+
# coderay sql
|
475
|
+
|
476
|
+
SELECT title FROM entries
|
477
|
+
WHERE
|
478
|
+
-- &@~ is for FTS
|
479
|
+
-- AND search with "search" and "fast"
|
480
|
+
title &@~ 'search fast' OR
|
481
|
+
content &@~ 'search fast';
|
482
|
+
|
483
|
+
= FTS: LIKE
|
484
|
+
|
485
|
+
# coderay sql
|
486
|
+
|
487
|
+
SELECT title FROM entries
|
488
|
+
WHERE
|
489
|
+
-- Index search for LIKE is supported
|
490
|
+
-- = Improve app perf without any changes
|
491
|
+
-- NOTE: &@~ is faster than LIKE
|
492
|
+
title LIKE '%search%' OR
|
493
|
+
content LIKE '%search%';
|
494
|
+
|
495
|
+
= Sort
|
496
|
+
|
497
|
+
# coderay sql
|
498
|
+
|
499
|
+
SELECT
|
500
|
+
title,
|
501
|
+
-- pgroonga_score(TABLE_NAME) returns
|
502
|
+
-- precision as number
|
503
|
+
pgroonga_score(entries) AS score
|
504
|
+
FROM entries
|
505
|
+
WHERE -- ...
|
506
|
+
-- Sort by precision
|
507
|
+
ORDER BY score DESC LIMIT 10;
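To make the idea of "sort by score, show only the first N" concrete, here is a small pure-Python sketch. It is an illustration only: the scoring rule (number of keyword occurrences) and the function name `search_sorted` are assumptions made for this example, and it merges title and content into one text, whereas the SQL on the slides checks each column separately. It does not reproduce how `pgroonga_score` is actually computed.

```python
def search_sorted(entries, keywords, limit=10):
    """Return (score, title) pairs for entries containing all keywords.

    entries: list of (title, content) tuples.
    Score is assumed to be the total number of keyword occurrences.
    """
    scored = []
    for title, content in entries:
        text = (title + " " + content).lower()
        lowered = [k.lower() for k in keywords]
        if all(k in text for k in lowered):
            score = sum(text.count(k) for k in lowered)
            scored.append((score, title))
    # Highest score first, like ORDER BY score DESC LIMIT N.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```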
|
508
|
+
|
509
|
+
= Highlight keyword\n(('note:キーワードハイライト'))
|
510
|
+
|
511
|
+
# image
|
512
|
+
# src = images/php-document-search-search.png
|
513
|
+
# relative_height = 100
|
514
|
+
|
515
|
+
= Highlight for HTML
|
516
|
+
|
517
|
+
# coderay sql
|
518
|
+
|
519
|
+
SELECT
|
520
|
+
pgroonga_highlight_html(
|
521
|
+
title,
|
522
|
+
-- Extract keywords from query
|
523
|
+
pgroonga_query_extract_keywords('search fast'))
|
524
|
+
FROM entries
|
525
|
+
WHERE title &@~ 'search fast' OR
|
526
|
+
content &@~ 'search fast';
|
527
|
+
|
528
|
+
= Highlight for HTML: Example
|
529
|
+
|
530
|
+
# coderay html
|
531
|
+
|
532
|
+
Fast search with <Groonga>!
|
533
|
+
↓
|
534
|
+
<span class="keyword">Fast</span>
|
535
|
+
↑↓ Keywords are marked up with "class"
|
536
|
+
<span class="keyword">search</span>
|
537
|
+
with &lt;Groonga&gt;! ← Escape tag
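The behavior in this example (keywords wrapped in a span, everything else HTML-escaped) can be sketched in plain Python. This is an illustration of the idea only, not PGroonga's implementation; the function name `highlight_html` is hypothetical and matching is assumed to be case-insensitive.

```python
import html
import re


def highlight_html(text, keywords):
    """Wrap keyword occurrences in <span class="keyword"> and escape the rest."""
    if not keywords:
        return html.escape(text)
    # Try longer keywords first so that overlapping keywords prefer the longer one.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered), re.IGNORECASE)
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        parts.append('<span class="keyword">%s</span>' % html.escape(match.group(0)))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Escaping the non-keyword text is what makes the result safe to embed in a page as-is.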
|
538
|
+
|
539
|
+
= Texts around keyword\n(('note:キーワード周辺テキスト'))
|
540
|
+
|
541
|
+
# image
|
542
|
+
# src = images/php-document-search-search.png
|
543
|
+
# relative_height = 100
|
544
|
+
|
545
|
+
= Texts around keyword for HTML
|
546
|
+
|
547
|
+
# coderay sql
|
548
|
+
|
549
|
+
SELECT
|
550
|
+
pgroonga_snippet_html(
|
551
|
+
content,
|
552
|
+
-- Extract keywords from query
|
553
|
+
pgroonga_query_extract_keywords('search fast'))
|
554
|
+
FROM entries
|
555
|
+
WHERE title &@~ 'search fast' OR
|
556
|
+
content &@~ 'search fast';
|
557
|
+
|
558
|
+
= Example
|
559
|
+
|
560
|
+
# coderay html
|
561
|
+
|
562
|
+
...fast search with <Groonga>!...
|
563
|
+
↓
|
564
|
+
ARRAY[
|
565
|
+
↓ First
|
566
|
+
'<span class="keyword">fast</span>
|
567
|
+
↑↓ Keywords are marked up with "class"
|
568
|
+
<span class="keyword">search</span>
|
569
|
+
with &lt;Groonga&gt;!', ← Escape tag
|
570
|
+
'...' ← Second
|
571
|
+
]
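A snippet is a short window of text around a keyword. The sketch below shows that idea in Python for a single keyword and a single snippet; the function name `snippet_html` and the `width` parameter are assumptions for illustration, while the SQL function on the slides returns an array of snippets.

```python
import html


def snippet_html(text, keyword, width=10):
    """Return the first keyword occurrence with `width` characters of context.

    Returns None when the keyword does not occur in the text.
    """
    position = text.lower().find(keyword.lower())
    if position < 0:
        return None
    keyword_end = position + len(keyword)
    start = max(0, position - width)
    end = min(len(text), keyword_end + width)
    before = html.escape(text[start:position])
    matched = html.escape(text[position:keyword_end])
    after = html.escape(text[keyword_end:end])
    return '%s<span class="keyword">%s</span>%s' % (before, matched, after)
```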
|
572
|
+
|
573
|
+
= FTS system: Adv. features\n(('note:全文検索システム:高度な機能'))
|
574
|
+
|
575
|
+
* Auto complete\n
|
576
|
+
(('note:オートコンプリート'))
|
577
|
+
* Similar search\n
|
578
|
+
(('note:類似文書検索'))
|
579
|
+
* Synonym expansion\n
|
580
|
+
(('note:同義語展開'))
|
581
|
+
|
582
|
+
= Auto complete\n(('note:オートコンプリート'))
|
583
|
+
|
584
|
+
# image
|
585
|
+
# src = images/php-document-search.png
|
586
|
+
# relative_height = 100
|
587
|
+
|
588
|
+
= Auto complete: Preparation\n(('note:オートコンプリート:準備'))
|
589
|
+
|
590
|
+
* Master table\n
|
591
|
+
(('note:マスターテーブル'))
|
592
|
+
* Candidate\n
|
593
|
+
(('note:候補:(例:牛乳)'))
|
594
|
+
* Readings in Katakana\n
|
595
|
+
(Only for Japanese)\n
|
596
|
+
(('note:ヨミ(日本語の場合。カタカナ。複数登録可。)'))
|
597
|
+
* (('note:例:ギュウニュウ・ミルク'))
|
598
|
+
|
599
|
+
= Auto complete: Implementation\n(('note:オートコンプリート:実装方法'))
|
600
|
+
|
601
|
+
* OR search with ...
|
602
|
+
* Prefix search against readings\n
|
603
|
+
(Only for Japanese)\n
|
604
|
+
(('note:ヨミを前方一致検索(日本語の場合。)'))
|
605
|
+
* Loose FTS against candidate\n
|
606
|
+
(('note:候補をゆるく全文検索'))
|
607
|
+
* Sort by candidate\n
|
608
|
+
(('note:候補でソート'))
|
609
|
+
|
610
|
+
(('tag:xx-small'))
|
611
|
+
((<"https://pgroonga.github.io/how-to/auto-complete.html"|URL:https://pgroonga.github.io/how-to/auto-complete.html>))
|
612
|
+
|
613
|
+
= Table definition
|
614
|
+
|
615
|
+
# coderay sql
|
616
|
+
|
617
|
+
CREATE TABLE terms (
|
618
|
+
term text, -- Candidate
|
619
|
+
readings text[] -- Readings
|
620
|
+
);
|
621
|
+
|
622
|
+
= Data example
|
623
|
+
|
624
|
+
# coderay sql
|
625
|
+
|
626
|
+
INSERT INTO terms VALUES (
|
627
|
+
'milk', -- Candidate
|
628
|
+
ARRAY[
|
629
|
+
-- Reading in Katakana
|
630
|
+
'ギュウニュウ', -- "milk" in Japanese
|
631
|
+
-- Multiple readings
|
632
|
+
'ミルク' -- "milk" in Katakana
|
633
|
+
]
|
634
|
+
);
|
635
|
+
|
636
|
+
= Data management\n(('note:データ管理'))
|
637
|
+
|
638
|
+
* Easy to maintain because it's a normal table\n
|
639
|
+
(('note:普通のテーブルなので管理が楽'))
|
640
|
+
* Easy to insert/delete/update\n
|
641
|
+
(('note:追加・削除・更新が楽'))
|
642
|
+
* Normal backup and replication\n
|
643
|
+
(('note:ダンプ・リストアもレプリケーションもいつも通り'))
|
644
|
+
|
645
|
+
= Index for prefix search\n(('note:前方一致用インデックス'))
|
646
|
+
|
647
|
+
# coderay sql
|
648
|
+
|
649
|
+
CREATE INDEX prefix_search ON terms
|
650
|
+
USING PGroonga
|
651
|
+
-- ...text_array_term_search...
|
652
|
+
(readings
|
653
|
+
pgroonga_text_array_term_search_ops_v2);
|
654
|
+
|
655
|
+
= Index for loose FTS\n(('note:緩い全文検索用インデックス'))
|
656
|
+
|
657
|
+
# coderay sql
|
658
|
+
|
659
|
+
CREATE INDEX loose_search ON terms
|
660
|
+
USING PGroonga (term)
|
661
|
+
-- Tokenizer for loose full text search
|
662
|
+
WITH (tokenizer='TokenBigramSplitSymbolAlphaDigit');
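Why a bigram tokenizer gives "loose" matching can be seen with a few lines of Python. This is a simplified sketch under stated assumptions: it lowercases the text, splits it into overlapping 2-character tokens, and ignores token positions, which a real full text search engine would take into account. The function names are hypothetical.

```python
def bigrams(text):
    """Split text into overlapping 2-character tokens."""
    text = text.lower()
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def loose_match(term, query):
    """True when every bigram of the query also appears in the term."""
    term_tokens = set(bigrams(term))
    query_tokens = bigrams(query)
    return bool(query_tokens) and all(t in term_tokens for t in query_tokens)
```

With this tokenization, "milk" becomes "mi", "il", "lk", so a user who has typed only "il" already hits "milk".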
|
663
|
+
|
664
|
+
= How to search\n(('note:検索方法'))
|
665
|
+
|
666
|
+
# coderay sql
|
667
|
+
|
668
|
+
SELECT term FROM terms
|
669
|
+
-- Prefix search against readings
|
670
|
+
WHERE readings &^~ '${INPUT}' OR
|
671
|
+
-- Loose full text search
|
672
|
+
term &@ '${INPUT}'
|
673
|
+
ORDER BY term LIMIT 10; -- Sort
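The shape of this query (prefix search on readings OR loose search on the candidate, sorted, limited to 10) can be mirrored in plain Python over an in-memory list. It is only a sketch: `complete` is a hypothetical name, the loose search is approximated by a case-insensitive substring test, and the romaji/hiragana handling that the examples for `&^~` rely on is not reproduced.

```python
def complete(terms, user_input, limit=10):
    """Return candidate terms for the given input.

    terms: list of (term, readings) pairs, like rows of the terms table.
    """
    needle = user_input.lower()
    if not needle:
        return []
    hits = []
    for term, readings in terms:
        prefix_hit = any(r.lower().startswith(needle) for r in readings)
        loose_hit = needle in term.lower()
        if prefix_hit or loose_hit:
            hits.append(term)
    # Same role as ORDER BY term LIMIT 10.
    return sorted(hits)[:limit]
```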
|
674
|
+
|
675
|
+
= Search example: Candidate\n(('note:検索例:候補'))
|
676
|
+
|
677
|
+
# coderay sql
|
678
|
+
|
679
|
+
-- User inputs "il"
|
680
|
+
SELECT term FROM terms
|
681
|
+
-- Prefix search against readings
|
682
|
+
WHERE readings &^~ 'il' OR
|
683
|
+
-- Loose full text search (Hit)
|
684
|
+
term &@ 'il'
|
685
|
+
ORDER BY term LIMIT 10; -- Sort
|
686
|
+
|
687
|
+
= Search example: Katakana\n(('note:検索例:カタカナ'))
|
688
|
+
|
689
|
+
# coderay sql
|
690
|
+
|
691
|
+
-- User inputs "ギュウ"
|
692
|
+
SELECT term FROM terms
|
693
|
+
-- Prefix search against readings (Hit)
|
694
|
+
WHERE readings &^~ 'ギュウ' OR
|
695
|
+
-- Loose full text search
|
696
|
+
term &@ 'ギュウ'
|
697
|
+
ORDER BY term LIMIT 10; -- Sort
|
698
|
+
|
699
|
+
= Search example: Hiragana\n(('note:検索例:ひらがな'))
|
700
|
+
|
701
|
+
# coderay sql
|
702
|
+
|
703
|
+
-- User inputs "ぎゅう"
|
704
|
+
SELECT term FROM terms
|
705
|
+
-- Prefix search against readings (Hit)
|
706
|
+
WHERE readings &^~ 'ぎゅう' OR
|
707
|
+
-- Loose full text search
|
708
|
+
term &@ 'ぎゅう'
|
709
|
+
ORDER BY term LIMIT 10; -- Sort
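One way to picture why Hiragana input can hit Katakana readings: Hiragana can be mapped to Katakana by a fixed code point offset before the prefix comparison. The Python sketch below shows that mapping; it is an assumption about the general approach, not a description of Groonga's internals, and it does not handle romaji input.

```python
def hiragana_to_katakana(text):
    """Convert Hiragana (U+3041..U+3096) to Katakana by adding 0x60."""
    return "".join(
        chr(ord(c) + 0x60) if "ぁ" <= c <= "ゖ" else c
        for c in text
    )


def reading_prefix_match(readings, user_input):
    """True when any Katakana reading starts with the normalized input."""
    needle = hiragana_to_katakana(user_input)
    return any(reading.startswith(needle) for reading in readings)
```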
|
710
|
+
|
711
|
+
= Search example: Romaji\n(('note:検索例:ローマ字'))
|
712
|
+
|
713
|
+
# coderay sql
|
714
|
+
|
715
|
+
-- User inputs "gyu"
|
716
|
+
SELECT term FROM terms
|
717
|
+
-- Prefix search against readings (Hit)
|
718
|
+
WHERE readings &^~ 'gyu' OR
|
719
|
+
-- Loose full text search
|
720
|
+
term &@ 'gyu'
|
721
|
+
ORDER BY term LIMIT 10; -- Sort
|
722
|
+
|
723
|
+
= Synonym expansion\n(('note:同義語展開'))
|
724
|
+
|
725
|
+
* Synonym\n
|
726
|
+
(('note:同義語'))
|
727
|
+
* Same meaning but different notation\n
|
728
|
+
(('note:同じ意味だが表記が異なる語'))
|
729
|
+
* e.g.: "PostgreSQL" and "PG"\n
|
730
|
+
(('note:例:「PostgreSQL」と「ポスグレ」'))
|
731
|
+
|
732
|
+
= Synonym expansion\n(('note:同義語展開'))
|
733
|
+
|
734
|
+
* Users don't want to care about notation differences\n
|
735
|
+
(('note:ユーザーは気にしたくない'))
|
736
|
+
* Synonym expansion\n
|
737
|
+
(('note:同義語展開'))
|
738
|
+
* OR search with all synonyms\n
|
739
|
+
(('note:同義語すべてでOR検索'))
|
740
|
+
|
741
|
+
= Implementation\n(('note:実装方法'))
|
742
|
+
|
743
|
+
* Create synonym table\n
|
744
|
+
(('note:同義語管理テーブルを作成'))
|
745
|
+
* Expand synonyms in query\n
|
746
|
+
(('note:クエリー内の同義語を展開'))
|
747
|
+
* Search by expanded query\n
|
748
|
+
(('note:展開後のクエリーで検索'))
|
749
|
+
|
750
|
+
(('tag:xx-small'))
|
751
|
+
((<"https://pgroonga.github.io/reference/functions/pgroonga-query-expand.html"|URL:https://pgroonga.github.io/reference/functions/pgroonga-query-expand.html>))
|
752
|
+
|
753
|
+
= Table definition
|
754
|
+
|
755
|
+
# coderay sql
|
756
|
+
CREATE TABLE synonyms (
|
757
|
+
-- Term to be expanded
|
758
|
+
term text,
|
759
|
+
-- Synonym list.
|
760
|
+
-- Including the "term" itself.
|
761
|
+
-- If you don't include the "term",
|
762
|
+
-- the "term" itself becomes unsearchable.
|
763
|
+
terms text[]
|
764
|
+
);
|
765
|
+
|
766
|
+
= Data example
|
767
|
+
|
768
|
+
# coderay sql
|
769
|
+
INSERT INTO synonyms
|
770
|
+
VALUES ('PostgreSQL', -- Expand "PostgreSQL"
|
771
|
+
ARRAY['PostgreSQL', 'PG']),
|
772
|
+
('PG', -- Expand "PG"
|
773
|
+
ARRAY['PG', 'PostgreSQL']);
|
774
|
+
|
775
|
+
= Data management\n(('note:データ管理'))
|
776
|
+
|
777
|
+
* Easy to maintain because it's a normal table\n
|
778
|
+
(('note:普通のテーブルなので管理が楽'))
|
779
|
+
* Easy to insert/delete/update\n
|
780
|
+
(('note:追加・削除・更新が楽'))
|
781
|
+
* Normal backup and replication\n
|
782
|
+
(('note:ダンプ・リストアもレプリケーションもいつも通り'))
|
783
|
+
|
784
|
+
= Index definition
|
785
|
+
|
786
|
+
# coderay sql
|
787
|
+
CREATE INDEX synonym_search ON synonyms
|
788
|
+
USING PGroonga
|
789
|
+
-- ...text_term_search...
|
790
|
+
-- For equal search
|
791
|
+
(term pgroonga_text_term_search_ops_v2);
|
792
|
+
|
793
|
+
= Confirm\n(('note:確認方法'))
|
794
|
+
|
795
|
+
# coderay sql
|
796
|
+
|
797
|
+
SELECT pgroonga_query_expand(
|
798
|
+
'synonyms', -- Table name
|
799
|
+
'term', -- Column name to be expanded
|
800
|
+
'terms', -- Column name for synonyms
|
801
|
+
'PostgreSQL' -- Query
|
802
|
+
);
|
803
|
+
-- '((PostgreSQL) OR (PG))'
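The expansion itself is a simple lookup-and-rewrite step. Here is a minimal Python sketch of the idea, using a dict in place of the synonyms table; `expand_query` is a hypothetical name, and splitting the query on whitespace is a simplification that ignores quoting and operators in the query syntax.

```python
def expand_query(synonyms, query):
    """Replace each word that has synonyms with an OR group.

    synonyms: dict mapping a term to its synonym list (including itself).
    """
    expanded = []
    for word in query.split():
        group = synonyms.get(word)
        if group:
            expanded.append("(" + " OR ".join("(%s)" % s for s in group) + ")")
        else:
            expanded.append(word)
    return " ".join(expanded)
```

Words without an entry pass through unchanged, so only registered terms are widened.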
|
804
|
+
|
805
|
+
= Search\n(('note:検索方法'))
|
806
|
+
|
807
|
+
# coderay sql
|
808
|
+
SELECT title FROM entries
|
809
|
+
WHERE
|
810
|
+
-- title &@~ 'DB ((PostgreSQL) OR (PG))'
|
811
|
+
title &@~
|
812
|
+
pgroonga_query_expand('synonyms',
|
813
|
+
'term',
|
814
|
+
'terms',
|
815
|
+
'DB PostgreSQL');
|
816
|
+
|
817
|
+
= Similar search\n(('note:類似文書検索'))
|
818
|
+
|
819
|
+
* Query is document itself\n
|
820
|
+
(('note:検索クエリーは文書そのもの'))
|
821
|
+
* Not keyword\n
|
822
|
+
(('note:キーワードではない'))
|
823
|
+
* Use case\n
|
824
|
+
(('note:利用例'))
|
825
|
+
* Show related entries\n
|
826
|
+
(('note:関連エントリーの提示に使える'))
|
827
|
+
|
828
|
+
= Implementation\n(('note:実現方法'))
|
829
|
+
|
830
|
+
* Create dedicated index\n
|
831
|
+
(('note:類似検索用のインデックスを作る'))
|
832
|
+
* Use tokenizer for target language\n
|
833
|
+
(('note:対象の言語に合わせた処理で精度向上'))
|
834
|
+
* e.g.: MeCab based tokenizer for Japanese\n
|
835
|
+
(('note:例:日本語ならMeCabベースのトークナイザーを活用'))
|
836
|
+
* Use dedicated operator\n
|
837
|
+
(('note:類似検索用の演算子を使う'))
|
838
|
+
|
839
|
+
= Index definition
|
840
|
+
|
841
|
+
# coderay sql
|
842
|
+
|
843
|
+
CREATE INDEX entries_similar_search
|
844
|
+
ON entries
|
845
|
+
-- Target: Both title and content
|
846
|
+
-- Reason: Title is important
|
847
|
+
USING PGroonga (id, (title || ' ' || content))
|
848
|
+
-- TokenMecab is good for Japanese
|
849
|
+
WITH (tokenizer='TokenMecab');
|
850
|
+
|
851
|
+
= Search
|
852
|
+
|
853
|
+
# coderay sql
|
854
|
+
|
855
|
+
SELECT title,
|
856
|
+
pgroonga_score(entries) AS score
|
857
|
+
FROM entries
|
858
|
+
WHERE
|
859
|
+
-- &@* is operator for similar search.
|
860
|
+
-- Search with existing document.
|
861
|
+
(title || ' ' || content) &@*
|
862
|
+
'...fast search with Groonga!...'
|
863
|
+
ORDER BY score DESC LIMIT 3;
|
864
|
+
|
865
|
+
= Result example\n(('note:結果例'))
|
866
|
+
|
867
|
+
Query:
|
868
|
+
...search with Groonga!...
|
869
|
+
|
870
|
+
Hit example:
|
871
|
+
...search with PGroonga!...
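Similar search ranks documents by how much they have in common with the query document. The Python sketch below uses the number of shared word tokens as the score. That scoring rule and the name `similar` are assumptions for illustration; the actual scoring in Groonga is not described on the slides, and a language-aware tokenizer such as a MeCab based one would replace the simple word split used here.

```python
import re


def tokens(text):
    """Lowercased set of word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def similar(documents, query, limit=3):
    """Return up to `limit` documents sharing the most tokens with the query."""
    query_tokens = tokens(query)
    scored = []
    for doc in documents:
        score = len(query_tokens & tokens(doc))
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]
```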
|
872
|
+
|
873
|
+
= Wrap up: Basic features\n(('note:全文検索システム:基本機能'))
|
874
|
+
|
875
|
+
* Fast FTS + sort\n
|
876
|
+
(('note:高速全文検索+ソート'))
|
877
|
+
* Show texts around keyword\n
|
878
|
+
(('note:キーワード周辺テキスト表示'))
|
879
|
+
* Highlight keyword\n
|
880
|
+
(('note:検索キーワードハイライト'))
|
881
|
+
|
882
|
+
= Wrap up: Adv. features\n(('note:全文検索システム:高度な機能'))
|
883
|
+
|
884
|
+
* Auto complete\n
|
885
|
+
(('note:オートコンプリート'))
|
886
|
+
* Similar search\n
|
887
|
+
(('note:類似文書検索'))
|
888
|
+
* Synonym expansion\n
|
889
|
+
(('note:同義語展開'))
|
890
|
+
|
891
|
+
= FTS system: Next step\n(('note:全文検索システム:次の一歩'))
|
892
|
+
|
893
|
+
* Support structured data\n
|
894
|
+
(('note:構造化データ対応'))
|
895
|
+
* Office document, HTML, ...\n
|
896
|
+
(('note:オフィス文書・HTMLなど'))
|
897
|
+
* Needed features\n
|
898
|
+
(('note:対応に必要な処理'))
|
899
|
+
* Text/metadata extraction\n
|
900
|
+
(('note:テキスト・メタデータ抽出'))
|
901
|
+
* Create screenshot\n
|
902
|
+
(('note:スクリーンショット作成'))
|
903
|
+
|
904
|
+
= Extraction tool\n(('note:抽出ツール'))
|
905
|
+
|
906
|
+
* Apache Tika
|
907
|
+
* Apache Lucene's subproject
|
908
|
+
* Many supported formats\n
|
909
|
+
(('note:対応フォーマットが多い'))
|
910
|
+
* ChupaText
|
911
|
+
* Groonga's subproject
|
912
|
+
* Screenshot support\n
|
913
|
+
(('note:スクリーンショット作成対応'))
|
914
|
+
= ChupaText

  * Supported formats(('note:(対応フォーマット)'))
    * Word/Excel/PowerPoint
    * ODT/ODS/ODP(('note:(OpenDocument)'))
    * PDF/HTML/XML/CSV/...
  * Interface(('note:(インターフェイス)'))
    * HTTP and command line\n
      (('note:HTTPとコマンドライン'))

= Install\n(('note:インストール'))

  * Use Docker or Vagrant\n
    (('note:DockerかVagrantを使うのが楽'))
    * (('tag:xx-small'))
      ((<"https://github.com/ranguba/chupa-text-docker"|URL:https://github.com/ranguba/chupa-text-docker>))
    * (('tag:xx-small'))
      ((<"https://github.com/ranguba/chupa-text-vagrant"|URL:https://github.com/ranguba/chupa-text-vagrant>))

= ChupaText:Docker

  # coderay console
  % GITHUB=https://github.com
  % git clone \
      ${GITHUB}/ranguba/chupa-text-docker.git
  % cd chupa-text-docker
  % docker-compose up --build

= Usage\n(('note:使い方'))

  # coderay console
  % curl \
      --form data=@XXX.pdf \
      http://localhost:20080/extraction.json

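The same request can be issued from a program. The following is a minimal Python sketch, assuming the server accepts one multipart/form-data part named (({data})) at the URL used by the curl example. The helper names (({build_multipart})) and (({extract})) are hypothetical, and the part's content type is an assumption, since curl may send a different one.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, boundary=None):
    """Encode one file as a multipart/form-data body, like `curl --form`."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        # Assumption: a generic content type is acceptable to the server.
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return head + payload + tail, content_type


def extract(path, url="http://localhost:20080/extraction.json"):
    """POST the file at `path` to the extraction URL and decode the JSON."""
    with open(path, "rb") as source:
        body, content_type = build_multipart(
            "data", os.path.basename(path), source.read())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling (({extract("XXX.pdf")})) requires the Docker or Vagrant environment to be running. (({build_multipart})) can be checked without a server.
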
= Result example\n(('note:結果例'))

  # coderay json

  {
    "mime-type": "application/pdf", # MIME type for the original data
    "size": 147159, # Metadata
    ...,
    "texts": [ # Extracted texts
      {
        "mime-type": "text/plain", # MIME type for the extracted data
        ...,
        "creator": "Adobe Illustrator CS3", # Metadata
        "body": "This is sample PDF. ...", # Extracted text
        "screenshot": {
          "mime-type": "image/png", # MIME type for screenshot
          "data": "iVBORw...", # Base64-ed image data
          "encoding": "base64"
        }
      }
    ]
  }

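A result of this shape can be split into text, metadata and screenshot bytes. The following Python sketch is only an illustration: (({sample})) is a hypothetical response that copies the keys shown above and leaves out the elided ones, its screenshot payload is just the 8-byte PNG signature, and (({split_text_entry})) is a hypothetical helper name.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Hypothetical response with the same keys as the example above.
# The screenshot data is a stand-in, not a real image.
sample = {
    "mime-type": "application/pdf",
    "size": 147159,
    "texts": [
        {
            "mime-type": "text/plain",
            "creator": "Adobe Illustrator CS3",
            "body": "This is sample PDF. ...",
            "screenshot": {
                "mime-type": "image/png",
                "data": base64.b64encode(PNG_SIGNATURE).decode("ascii"),
                "encoding": "base64",
            },
        }
    ],
}

# Keys of a text entry that are not treated as metadata in this sketch.
RESERVED_KEYS = {"mime-type", "body", "screenshot"}


def split_text_entry(entry):
    """Return (body, metadata, screenshot_bytes) for one "texts" entry."""
    body = entry.get("body", "")
    metadata = {
        key: value for key, value in entry.items()
        if key not in RESERVED_KEYS
    }
    image = None
    screenshot = entry.get("screenshot")
    if screenshot and screenshot.get("encoding") == "base64":
        image = base64.b64decode(screenshot["data"])
    return body, metadata, image


body, metadata, image = split_text_entry(sample["texts"][0])
```
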
= Web UI

  # image
  # src = images/chupa-text-web-ui-form.png
  # relative_height = 100

= Web UI: Extraction example\n(('note:Web UI:抽出例'))

  # image
  # src = images/chupa-text-web-ui-extract-metadata.png
  # relative_height = 100

= Web UI: Extraction example\n(('note:Web UI:抽出例'))

  # image
  # src = images/chupa-text-web-ui-extract-text-and-screenshot.png
  # relative_height = 100

= ChupaText:Vagrant

  # coderay console
  % GITHUB=https://github.com
  % git clone \
      ${GITHUB}/ranguba/chupa-text-vagrant.git
  % cd chupa-text-vagrant
  % vagrant up

(('tag:center'))
Usage is the same as Docker's\n
(('note:使い方はDocker版と同じ'))

= Use cases(('note:(活用例)'))

  * Extracted text
    * Insert into PGroonga
  * Extracted metadata
    * Insert into PGroonga
    * Use for condition(('note:(絞り込みに活用)'))
  * Created screenshot
    * Show in search result(('note:(検索結果で表示)'))

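These use cases could be wired together roughly as follows. This is a sketch under stated assumptions, not the talk's implementation: the table (({documents})) with columns (({id})), (({body})) (text), (({metadata})) (jsonb) and (({screenshot})) (bytea) is hypothetical, (({body})) is assumed to have a PGroonga index, and the statements use (({%s})) placeholders in the style of a driver such as psycopg2, which is not imported here.

```python
import json

# Hypothetical table: documents(id, body text, metadata jsonb, screenshot bytea).
INSERT_SQL = (
    "INSERT INTO documents (body, metadata, screenshot) "
    "VALUES (%s, %s::jsonb, %s)"
)

# &@~ is PGroonga's full text search operator with query syntax.
# @> is PostgreSQL's jsonb containment operator, used here as the
# metadata condition. The screenshot is returned for the result page.
SEARCH_SQL = (
    "SELECT id, screenshot FROM documents "
    "WHERE body &@~ %s AND metadata @> %s::jsonb"
)


def insert_params(body, metadata, screenshot_png):
    """Parameters for INSERT_SQL built from one extracted text entry."""
    return (body, json.dumps(metadata, sort_keys=True), screenshot_png)


def search_params(keywords, metadata_filter):
    """Parameters for SEARCH_SQL: keywords plus a metadata condition."""
    return (keywords, json.dumps(metadata_filter, sort_keys=True))
```

With a database cursor, the pairs would be passed as (({cursor.execute(INSERT_SQL, insert_params(...))})) and (({cursor.execute(SEARCH_SQL, search_params(...))})).
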
= Wrap up\n(('note:まとめ'))

  * FTS engine via PostgreSQL\n
    (('note:PostgreSQL経由で全文検索エンジン'))
  * Provide decision info\n
    (('note:採用の判断材料を提供'))

= Wrap up\n(('note:まとめ'))

  * Show how to impl. FTS system\n
    (('note:全文検索システム実装例を紹介'))
    * PGroonga
  * PGroonga 1.0.0 and 2\n
    (('note:PGroonga 1.0.0と2'))
    * Painless upgrade

= Wrap up\n(('note:まとめ'))

  * Show how to support structured data\n
    (('note:構造化データの対応方法を紹介'))
    * ChupaText

= Support service\n(('note:サポートサービス紹介'))

  * Install support(('note:(導入支援)'))\n
    (('note:設計支援・性能検証・移行支援・…'))
  * Development support(('note:(開発支援)'))\n
    (('note:サンプルコード提供・問い合わせ対応・…'))
  * Operation support(('note:(運用支援)'))\n
    (('note:障害対応・チューニング支援・…'))

Contact(('note:(問い合わせ先)'))

(('tag:x-small'))
((<"https://www.clear-code.com/contact/?type=groonga"|URL:https://www.clear-code.com/contact/?type=groonga>))
data/theme.rb
ADDED
metadata
ADDED
@@ -0,0 +1,88 @@
--- !ruby/object:Gem::Specification
name: rabbit-slide-kou-pgconf-asia-2017
version: !ruby/object:Gem::Version
  version: 2017.12.5.0
platform: ruby
authors:
- Kouhei Sutou
autorequire:
bindir: bin
cert_chain: []
date: 2017-12-05 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: rabbit
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 2.0.2
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 2.0.2
- !ruby/object:Gem::Dependency
  name: rabbit-theme-groonga
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: '0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: '0'
description: |-
  PGroonga 2.0 has been released with 2 years development since PGroonga 1.0.0. PGroonga 1.0.0 just provides fast full text search with all languages support. It's important because it's a lacked feature in PostgreSQL. PGroonga 2.0 provides more useful features to implement rich full text search system with PostgreSQL. This session shows how to implement rich full text search system with PostgreSQL!

  This talk describes about PGroonga that resolves these problems.
email:
- kou@clear-code.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- ".rabbit"
- README.rd
- Rakefile
- config.yaml
- images/chupa-text-web-ui-extract-metadata.png
- images/chupa-text-web-ui-extract-text-and-screenshot.png
- images/chupa-text-web-ui-form.png
- images/php-document-search-search.png
- images/php-document-search.png
- pdf/pgconf-asia-2017-pgroonga-2.pdf
- pgroonga-2.rab
- theme.rb
homepage: http://slide.rabbit-shocker.org/authors/kou/pgconf-asia-2017/
licenses:
- CC-BY-SA-4.0
- CC-BY-3.0
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 2.5.2.1
signing_key:
specification_version: 4
summary: PGroonga 2 – Make PostgreSQL rich full text search system backend!
test_files: []