cseg 0.1.1 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 39c56cb55b77d2ae524a9b46ff432dddc73dddce
- data.tar.gz: 7fd1638ad503c752d96b3184198a90ba155ea00b
+ metadata.gz: 50e095f936a44f0a92394260ad6f82b6e0b3391d
+ data.tar.gz: ce17fc8ecaa4d4f0b15566ebf5225b2c312d8940
  SHA512:
- metadata.gz: fc8e1aa7f18bb930ed98f215b7e5aa7147cc6d66ba802dd76f1f6be7c6dbd74d7ece2931921ff866f250f16acca81d49e4fa6b16c8e4fe8015ba11404353f201
- data.tar.gz: 03be8396a2c3afa2a4299cd54e120aeb27a453ab54a7f805f254650f24ae17709b772b7f767a6cfab86120651c4c0c61d95b00c8c549865fa67af2dbca9470c3
+ metadata.gz: 284d67a876fa1d382801e1801dd0f2813f39ecf9a5753da7308e026646bb3bf036d24d811c75e69df1234b52b31c8d16c316eea11b2c60d95f85c51e79dc4962
+ data.tar.gz: d336a86d2239e2126360d8f5ee27328892c7659d2822c58d135b736d2f38ebdfc0ad7e0abff37aae23fc7dbba85136d6afdcb628f62070dba7b61259f6f99ac4
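The new digests above can be checked against files unpacked from the downloaded `.gem` archive. The following is a minimal sketch, not part of the gem: the helper name `checksum_matches?` and the `DIGESTS` table are hypothetical, and it assumes `metadata.gz` and `data.tar.gz` have already been extracted to local paths.

```ruby
require "digest/sha1"
require "digest/sha2"

# Hypothetical lookup table for the two algorithms listed in checksums.yaml.
DIGESTS = { "SHA1" => Digest::SHA1, "SHA512" => Digest::SHA512 }.freeze

# Hypothetical helper: true when the file at `path` hashes to `expected_hex`.
def checksum_matches?(path, expected_hex, algorithm = "SHA512")
  digest_class = DIGESTS.fetch(algorithm)
  digest_class.file(path).hexdigest == expected_hex.strip.downcase
end
```

For example, `checksum_matches?("data.tar.gz", "ce17fc8ecaa4d4f0b15566ebf5225b2c312d8940", "SHA1")` would be expected to hold for the 0.1.2 archive, assuming the file was extracted unmodified.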
data/.gitignore CHANGED
@@ -16,3 +16,4 @@ test/tmp
  test/version_tmp
  tmp
  data/
+ .data
data/README.md CHANGED
@@ -1,10 +1,11 @@
- # Kurumi
+ # Cseg(Kurumi)

  Use MIRA to train a large amount of features.

- Segment chinese(both traditional and simplized) sentences into words in high speed and correctly.
+ Segment chinese(both traditional and simplified) sentences into words in high speed and correctly.

  take care the name of the gem is different from the repo name!
+
  ## Installation

  Add this line to your application's Gemfile:
@@ -23,7 +24,7 @@ you need to install [CRF++](http://crfpp.googlecode.com/svn/trunk/doc/index.html

  On github the dictionary file was deleted since it is quite large, though you can get all from rubygems.

- ## recall and Precision
+ ## Recall and Precision

  Tested on seghanbakeoff pku test set
 
@@ -33,15 +34,14 @@ Recall: 92.86%

  ## Usage

- ```
- The default is Simplified Chinese
- require "cseg"
- a=Kurumi.segment("屌丝是一种自我讽刺。")
- =>["屌丝", "是", "一", "种", "自我", "讽刺", "。"]
- Use parameter "tr" to specify Traditional Chinese
- a=Kurumi.segment("台妹真是正點。","tr")
- =>["台妹", "真", "是", "正點", "。"]
+ ```ruby
+ #The default is Simplified Chinese
+ require "cseg"
+ Kurumi.segment("屌丝是一种自我讽刺。")
+ #=>["屌丝", "是", "一", "种", "自我", "讽刺", "。"]
+ #Use parameter "tr" to specify Traditional Chinese
+ Kurumi.segment("台妹真是正點。","tr")
+ #=>["台妹", "真", "是", "正點", "。"]

  ```

- ## Contributing
data/cseg.gemspec CHANGED
@@ -16,7 +16,7 @@ Gem::Specification.new do |gem|
  "LICENSE.txt",
  "README.md",
  "Gemfile",
- "data/pkumodle.data",
+ "data/pku_training.data",
  "data/as_training_less.data",
  "lib/cseg/version.rb",
  "lib/cseg.rb",
data/lib/cseg.rb CHANGED
@@ -2,7 +2,7 @@
  require "cseg/version"
  class Kurumi
  # since crf++ can only read from file
- @modle_sp=File.expand_path("../../data/as_training.data", __FILE__)
+ @modle_sp=File.expand_path("../../data/pku_training.data", __FILE__)
  @modle_tr=File.expand_path("../../data/as_training_less.data", __FILE__)
  def self.segment(str, mode="sp")
  @result=Array.new
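The hunk above only shows the two model paths and the `mode` default, not how `segment` picks between them. One plausible mapping is sketched below. This is an assumption for illustration, not the gem's actual code: `MODEL_FILES`, `model_path_for`, and the `data_dir` parameter are hypothetical names, and only the file names and the `"sp"`/`"tr"` mode strings come from the diff.

```ruby
# Hypothetical mode-to-model table; file names taken from the 0.1.2 diff.
MODEL_FILES = {
  "sp" => "pku_training.data",     # Simplified Chinese (default mode)
  "tr" => "as_training_less.data"  # Traditional Chinese
}.freeze

# Hypothetical helper: resolves the model file for a mode, rejecting unknown modes.
def model_path_for(mode = "sp", data_dir = "data")
  file = MODEL_FILES.fetch(mode) do
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
  File.join(data_dir, file)
end
```

Raising on an unknown mode is a design choice of this sketch; the diff does not show what the gem does in that case.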
data/lib/cseg/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Cseg
- VERSION = "0.1.1"
+ VERSION = "0.1.2"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: cseg
  version: !ruby/object:Gem::Version
- version: 0.1.1
+ version: 0.1.2
  platform: ruby
  authors:
  - gyorou
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-09-08 00:00:00.000000000 Z
+ date: 2014-11-21 00:00:00.000000000 Z
  dependencies: []
  description: '"a chinese segmentation tool using CRF"'
  email:
@@ -24,7 +24,7 @@ files:
  - Rakefile
  - cseg.gemspec
  - data/as_training_less.data
- - data/pkumodle.data
+ - data/pku_training.data
  - lib/cseg.rb
  - lib/cseg/version.rb
  homepage: ''