cseg 0.1.1 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 39c56cb55b77d2ae524a9b46ff432dddc73dddce
- data.tar.gz: 7fd1638ad503c752d96b3184198a90ba155ea00b
+ metadata.gz: 50e095f936a44f0a92394260ad6f82b6e0b3391d
+ data.tar.gz: ce17fc8ecaa4d4f0b15566ebf5225b2c312d8940
  SHA512:
- metadata.gz: fc8e1aa7f18bb930ed98f215b7e5aa7147cc6d66ba802dd76f1f6be7c6dbd74d7ece2931921ff866f250f16acca81d49e4fa6b16c8e4fe8015ba11404353f201
- data.tar.gz: 03be8396a2c3afa2a4299cd54e120aeb27a453ab54a7f805f254650f24ae17709b772b7f767a6cfab86120651c4c0c61d95b00c8c549865fa67af2dbca9470c3
+ metadata.gz: 284d67a876fa1d382801e1801dd0f2813f39ecf9a5753da7308e026646bb3bf036d24d811c75e69df1234b52b31c8d16c316eea11b2c60d95f85c51e79dc4962
+ data.tar.gz: d336a86d2239e2126360d8f5ee27328892c7659d2822c58d135b736d2f38ebdfc0ad7e0abff37aae23fc7dbba85136d6afdcb628f62070dba7b61259f6f99ac4
data/.gitignore CHANGED
@@ -16,3 +16,4 @@ test/tmp
  test/version_tmp
  tmp
  data/
+ .data
data/README.md CHANGED
@@ -1,10 +1,11 @@
- # Kurumi
+ # Cseg(Kurumi)
 
  Use MIRA to train a large amount of features.
 
- Segment chinese(both traditional and simplized) sentences into words in high speed and correctly.
+ Segment chinese(both traditional and simplified) sentences into words in high speed and correctly.
 
  take care the name of the gem is different from the repo name!
+
  ## Installation
 
  Add this line to your application's Gemfile:
@@ -23,7 +24,7 @@ you need to install [CRF++](http://crfpp.googlecode.com/svn/trunk/doc/index.html
 
  On github the dictionary file was deleted since it is quite large, though you can get all from rubygems.
 
- ## recall and Precision
+ ## Recall and Precision
 
  Tested on seghanbakeoff pku test set
 
@@ -33,15 +34,14 @@ Recall: 92.86%
 
  ## Usage
 
- ```
- The default is Simplified Chinese
- require "cseg"
- a=Kurumi.segment("屌丝是一种自我讽刺。")
- =>["屌丝", "是", "一", "种", "自我", "讽刺", "。"]
- Use parameter "tr" to specify Traditional Chinese
- a=Kurumi.segment("台妹真是正點。","tr")
- =>["台妹", "真", "是", "正點", "。"]
+ ```ruby
+ #The default is Simplified Chinese
+ require "cseg"
+ Kurumi.segment("屌丝是一种自我讽刺。")
+ #=>["屌丝", "是", "一", "种", "自我", "讽刺", "。"]
+ #Use parameter "tr" to specify Traditional Chinese
+ Kurumi.segment("台妹真是正點。","tr")
+ #=>["台妹", "真", "是", "正點", "。"]
 
  ```
 
- ## Contributing
data/cseg.gemspec CHANGED
@@ -16,7 +16,7 @@ Gem::Specification.new do |gem|
  "LICENSE.txt",
  "README.md",
  "Gemfile",
- "data/pkumodle.data",
+ "data/pku_training.data",
  "data/as_training_less.data",
  "lib/cseg/version.rb",
  "lib/cseg.rb",
data/lib/cseg.rb CHANGED
@@ -2,7 +2,7 @@
  require "cseg/version"
  class Kurumi
  # since crf++ can only read from file
- @modle_sp=File.expand_path("../../data/as_training.data", __FILE__)
+ @modle_sp=File.expand_path("../../data/pku_training.data", __FILE__)
  @modle_tr=File.expand_path("../../data/as_training_less.data", __FILE__)
  def self.segment(str, mode="sp")
  @result=Array.new
data/lib/cseg/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Cseg
- VERSION = "0.1.1"
+ VERSION = "0.1.2"
  end
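Note on the `File.expand_path` change in the `Kurumi` class above: in 0.1.1 the simplified-Chinese model path pointed at `data/as_training.data`, while the gemspec file list shipped `data/pkumodle.data`; in 0.1.2 both name `data/pku_training.data`. The sketch below only illustrates how such a relative path resolves. The install location `/opt/gems/cseg-0.1.2` and the helper `model_path` are assumptions made up for the example and are not part of the gem.

```ruby
# Hypothetical install location, used only for illustration.
lib_file = "/opt/gems/cseg-0.1.2/lib/cseg.rb"

# File.expand_path treats its second argument as a directory, so the first
# ".." steps out of "cseg.rb" and the second steps out of "lib".
# model_path is a hypothetical helper, not a method of the gem.
def model_path(name, lib_file)
  File.expand_path("../../data/#{name}", lib_file)
end

simplified  = model_path("pku_training.data", lib_file)
traditional = model_path("as_training_less.data", lib_file)

puts simplified   # /opt/gems/cseg-0.1.2/data/pku_training.data
puts traditional  # /opt/gems/cseg-0.1.2/data/as_training_less.data
```

Both resolved paths land in the gem's top-level `data` directory, which is where the 0.1.2 file list places the two model files.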
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: cseg
  version: !ruby/object:Gem::Version
- version: 0.1.1
+ version: 0.1.2
  platform: ruby
  authors:
  - gyorou
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-09-08 00:00:00.000000000 Z
+ date: 2014-11-21 00:00:00.000000000 Z
  dependencies: []
  description: '"a chinese segmentation tool using CRF"'
  email:
@@ -24,7 +24,7 @@ files:
  - Rakefile
  - cseg.gemspec
  - data/as_training_less.data
- - data/pkumodle.data
+ - data/pku_training.data
  - lib/cseg.rb
  - lib/cseg/version.rb
  homepage: ''