
rmmseg 0.1.4 → 0.1.5


Files changed (7)

data/History.txt CHANGED

@@ -1,3 +1,7 @@
+=== 0.1.5 / 2008-03-03
+* Bug fix: Ferret Token is not Duck-Typing. We need to construct Ferret token instead of reuse RMMSeg Token.
 === 0.1.4 / 2008-03-02
 * Let user store their customized word to Dictionary after loaded.

data/README.txt CHANGED

@@ -10,7 +10,7 @@ algorithms. Two algorithms are available for using:
 * simple algorithm that uses only forward maximum matching.
 * complex algorithm that uses three-word chunk maximum matching and 3
-  aditonal rules to solve ambiguities.
+  additonal rules to solve ambiguities.
 For more information about the algorithm, please refer to the
 following essays:

data/TODO.txt CHANGED

@@ -1,4 +1,5 @@
 === TODO
+* Add mock test for RMMSeg::Ferret.
 * Avoid Memory Leak
 * Improve Performance

data/lib/rmmseg/ferret.rb CHANGED

@@ -39,7 +39,11 @@ module RMMSeg
       # Get next token
       def next
-        @algor.next_token
+        tok = @algor.next_token
+        if tok
+          tok = ::Ferret::Analysis::Token.new(tok.text, tok.start, tok.end)
+        end
+        tok
       end
       # Get the text being tokenized
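
The hunk above stops returning the segmenter's own token and builds a new token of the type the indexer expects. Below is a minimal sketch of that conversion in isolation, assuming stand-in classes: `SegToken`, `IndexToken`, `StubAlgorithm` and `ConvertingTokenizer` are hypothetical names that do not exist in rmmseg or Ferret. `IndexToken` only plays the role of `::Ferret::Analysis::Token`, and its three constructor arguments mirror the call shown in the diff.

```ruby
# Hypothetical stand-in for the segmenter's token (offsets are byte indexes).
class SegToken
  attr_reader :text, :start, :end

  def initialize(text, start_pos, end_pos)
    @text = text
    @start = start_pos
    @end = end_pos
  end
end

# Hypothetical stand-in for the indexer's token class. It is a different
# class on purpose: the consumer is assumed to require this exact type.
class IndexToken
  attr_reader :text, :start, :end

  def initialize(text, start_pos, end_pos)
    @text = text
    @start = start_pos
    @end = end_pos
  end
end

# Hypothetical algorithm object that hands out pre-built tokens, then nil.
class StubAlgorithm
  def initialize(tokens)
    @tokens = tokens.dup
  end

  def next_token
    @tokens.shift
  end
end

# Wraps an algorithm and converts each token before returning it,
# following the same shape as the patched `next` method.
class ConvertingTokenizer
  def initialize(algor)
    @algor = algor
  end

  def next
    tok = @algor.next_token
    return nil unless tok # end of stream: pass nil through unchanged

    IndexToken.new(tok.text, tok.start, tok.end)
  end
end

algor = StubAlgorithm.new([SegToken.new("中文", 0, 6), SegToken.new("分词", 6, 12)])
tokenizer = ConvertingTokenizer.new(algor)
while (tok = tokenizer.next)
  puts "#{tok.class}: #{tok.text} [#{tok.start}, #{tok.end})"
end
```

Converting at the boundary keeps the segmenter's token free of indexer-specific fields, which is consistent with the removal of `pos_inc` and `Comparable` from `RMMSeg::Token` in this release.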

data/lib/rmmseg/token.rb CHANGED

@@ -18,9 +18,6 @@ module RMMSeg
     # token. This is *byte* index instead of character.
     attr_accessor :end
-    # See Ferret document for Token.
-    attr_accessor :pos_inc
     # +text+ is the ref to the whole text. In other words:
     # +text[start_pos...end_pos]+ should be the string held by this
     # token.
@@ -28,23 +25,7 @@ module RMMSeg
       @text = text
       @start = start_pos
       @end = end_pos
-      @pos_inc = 1
-    end
-    def <=> other
-      if @start > other.start
-        return 1
-      elsif @start < other.start
-        return -1
-      elsif @end > other.end
-        return 1
-      elsif @end < other.end
-        return -1
-      else
-        return @text <=> other.text
-      end
     end
-    include Comparable
     def to_s
       @text.dup

data/lib/rmmseg.rb CHANGED

@@ -6,7 +6,7 @@ require 'rmmseg/simple_algorithm'
 require 'rmmseg/complex_algorithm'
 module RMMSeg
-  VERSION = '0.1.4'
+  VERSION = '0.1.5'
   # Segment +text+ using the algorithm configured.
   def segment(text)

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: rmmseg
 version: !ruby/object:Gem::Version
-  version: 0.1.4
+  version: 0.1.5
 platform: ruby
 authors:
 - pluskid
@@ -9,11 +9,11 @@ autorequire:
 bindir: bin
 cert_chain: []
-date: 2008-03-02 00:00:00 +00:00
+date: 2008-03-04 00:00:00 +00:00
 default_executable:
 dependencies: []
-description: "RMMSeg is an implementation of MMSEG Chinese word segmentation algorithm. It is based on two variants of maximum matching algorithms. Two algorithms are available for using:   * simple algorithm that uses only forward maximum matching. * complex algorithm that uses three-word chunk maximum matching and 3 aditonal rules to solve ambiguities.  For more information about the algorithm, please refer to the following essays:  * http://technology.chtsai.org/mmseg/ * http://pluskid.lifegoo.com/?p=261"
+description: "RMMSeg is an implementation of MMSEG Chinese word segmentation algorithm. It is based on two variants of maximum matching algorithms. Two algorithms are available for using:   * simple algorithm that uses only forward maximum matching. * complex algorithm that uses three-word chunk maximum matching and 3 additonal rules to solve ambiguities.  For more information about the algorithm, please refer to the following essays:  * http://technology.chtsai.org/mmseg/ * http://pluskid.lifegoo.com/?p=261"
 email: pluskid@gmail.com
 executables:
 - rmmseg