bunpa 0.2.0 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +30 -4
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7ea90d306e49f6d71e91451258f5a038b0d2e2c7
+  data.tar.gz: e13a574fff50ab602883e5a80259277de8d82119
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4d3840324ee4a6b878ada12573540c5b21afe0ffdc4b799941627b32a6365ee0f01a517ed51405cd3946b9272861f1050e5d94c892efa1c1eceb84a2d5027d43
+  data.tar.gz: 342169b076fe3d09b415c977b5a5033e172d3ccac0ed01a1fc5c58825b62d85b963025c22d9dcb6afa42a5ac90abcf38dfe024d3c669b4b88625d2c9cbdcedb9
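The digests above can be checked locally once the gem's archive members have been extracted. The following is a minimal sketch, not part of bunpa or RubyGems: `checksum_matches?` is a hypothetical helper name, and the demonstration hashes a throwaway temp file containing `abc` rather than a real `data.tar.gz`.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper (an assumption for illustration): returns true when the
# file at `path` hashes to `expected_hex` under the given digest class.
def checksum_matches?(path, expected_hex, algorithm: Digest::SHA512)
  algorithm.file(path).hexdigest == expected_hex.strip.downcase
end

# Demonstration on a throwaway file instead of a real gem member.
Tempfile.create('member') do |file|
  file.write('abc')
  file.flush
  sha1_of_abc = 'a9993e364706816aba3e25717850c26c9cd0d89d'
  puts checksum_matches?(file.path, sha1_of_abc, algorithm: Digest::SHA1)
end
```

For a real check, pass the path of the extracted `data.tar.gz` or `metadata.gz` together with the matching SHA512 value from `checksums.yaml`.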
data/README.md
CHANGED
@@ -4,7 +4,7 @@ Bunpa
 Bunpa is an extremely simple wrapper around the MeCab Japanese grammar parser. It was designed with two key features in mind:

 1. Simplicity - only returns the text and major part of speech for each component
-2. Completeness -
+2. Completeness - ensures that whitespace and any unknown characters are preserved

 ## Background

@@ -19,7 +19,7 @@ Any components that cannot be identified by either MeCab or Bunpa are marked as

 ## Installation

-From within your
+From within your application's base directory:

 1. Edit your Gemfile and add:

@@ -31,7 +31,7 @@ From within your Rails application's base directory:

 ## Usage

-Bunpa operates as a very simple parser. It returns the components it identifies
+Bunpa operates as a very simple parser. It returns the components it identifies as an Array of Bunpa::Text::Component objects, in the same order as they appear in the document. Each Component object has two accessors - 'text' and 'kind', which return the text value and part of speech of the component respectively.

 Basic usage is as follows:

@@ -42,12 +42,38 @@ require 'bunpa'
 parser = Bunpa::JapaneseTextParser.new

 # Get an enumerable of Bunpa::Text::Components
-components = parser.parse("
+components = parser.parse("A: こんにちは! お元気ですか。\nB: はい、元気です!")

 components.each do |component|
   puts "#{component.text}\t(#{component.kind})"
 end
+```
+
+This would output:

+```
+A (名詞)
+: (名詞)
+  (スペース)
+こんにちは (感動詞)
+! (記号)
+  (スペース)
+お (接頭詞)
+元気 (名詞)
+です (助動詞)
+か (助詞)
+。 (記号)
+
+ (改行)
+B (名詞)
+: (名詞)
+  (スペース)
+は (助詞)
+い (動詞)
+、 (記号)
+元気 (名詞)
+です (助動詞)
+! (記号)
 ```

 For a slightly more detailed example, see the `usage_example.rb` script in the `bin` directory.
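The README's description of `text` and `kind` suggests two simple ways to post-process a parse result: joining the `text` values back into the input, and tallying components by `kind`. The sketch below is illustrative only. `Component` is a plain Struct standing in for `Bunpa::Text::Component` so that it runs without MeCab installed, and `rebuild_text` and `kind_counts` are hypothetical helper names, not part of bunpa. The sample data is copied from the second line of the output shown above.

```ruby
# Stand-in for Bunpa::Text::Component (assumption: only `text` and `kind`
# are needed, as the README describes).
Component = Struct.new(:text, :kind)

# Hypothetical helper: joins component text in order. If whitespace and
# unknown characters are preserved, this should give back the parsed string.
def rebuild_text(components)
  components.map(&:text).join
end

# Hypothetical helper: counts how many components have each part of speech.
def kind_counts(components)
  components.each_with_object(Hash.new(0)) { |c, counts| counts[c.kind] += 1 }
end

# Components taken from the "B:" line of the README's sample output.
sample = [
  Component.new('B', '名詞'),
  Component.new(':', '名詞'),
  Component.new(' ', 'スペース'),
  Component.new('は', '助詞'),
  Component.new('い', '動詞'),
  Component.new('、', '記号'),
  Component.new('元気', '名詞'),
  Component.new('です', '助動詞'),
  Component.new('!', '記号')
]

puts rebuild_text(sample)
kind_counts(sample).each { |kind, count| puts "#{kind}: #{count}" }
```

With the real gem, the Array returned by `parser.parse` would take the place of `sample`, assuming its elements respond to `text` and `kind` as documented.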
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: bunpa
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.3.0
 platform: ruby
 authors:
 - Daniel Carter
@@ -71,7 +71,7 @@ rubyforge_project:
 rubygems_version: 2.3.0
 signing_key:
 specification_version: 4
-summary: bunpa v0.
+summary: bunpa v0.3.0
 test_files:
 - spec/spec_helper.rb
 - spec/parser_spec.rb