auto-correct 0.1.0.pre0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 764ae2dccd3a0b65e9e5b071a3568620ade98d52
4
- data.tar.gz: 01a7ea3ed26327875a22530df5722af6a5253f14
2
+ SHA256:
3
+ metadata.gz: 89d9754d3ecd0a18d8ef8ee8f245f2ff0b217873ae7b09bb0e4759759a297878
4
+ data.tar.gz: f34f2c046802a275447e0f602442406aeafd7adee6e27fd6647405ac34e81499
5
5
  SHA512:
6
- metadata.gz: 8253513e4b424534a08ef122023b2d7c03c1f1fe518f4028bd0658e87884eee5bba95e9f002b0dd915c0f57bf08e8fc7f06c6710c15cd3a1020ec8f55a0ef494
7
- data.tar.gz: 00fb083fcf99ddfbd779a57ad0a847ffc41e7dbff551ada550a35ba6a8118d50290af6c880897e97c549f2bb0425a56b54a44de7b14b356114990014863205cf
6
+ metadata.gz: fba0aafa39062de04a459fadbdb23c3893b7e69dd09c343ec3e16ba41e5d776e3538475ecd6dd2335d9bd0de24bf15d58ad3c0490f87a97f2731fa169cc6dd01
7
+ data.tar.gz: 72c1c012a63f9ebfb8289b7ce2e630d79aa47085720b8a98816b0763318d24460b22633fc0d8ffc1b012864bf660d69f4b9647fb10390d1643ddb8cd94e0b0a8
data/README.md CHANGED
@@ -1,62 +1,61 @@
1
1
  # auto-correct
2
2
 
3
+ Automatically add spaces between Chinese and English words.
4
+
3
5
  Automatically corrects poorly written mixed Chinese and English text, and fixes incorrect capitalization of proper nouns.
4
6
 
5
- Before
7
+ [![Gem Version](https://badge.fury.io/rb/auto-correct.svg)](https://rubygems.org/gems/auto-correct) [![Build
8
+ Status](https://api.travis-ci.org/huacnlee/auto-correct.svg?branch=master&.svg)](http://travis-ci.org/huacnlee/auto-correct) [![Code Climate](https://codeclimate.com/github/huacnlee/auto-correct/badges/gpa.svg)](https://codeclimate.com/github/huacnlee/auto-correct)
6
9
 
7
- ```
8
- [经验之谈]转行做ruby程序员的8个月, mysql 经验
9
- ```
10
10
 
11
- After
11
+ ## Other implementations
12
12
 
13
- ```
14
- [经验之谈] 转行做 Ruby 程序员的 8 个月, MySQL 经验
15
- ```
13
+ - [auto-correct](https://github.com/huacnlee/auto-correct) - Ruby
14
+ - [go-auto-correct](https://github.com/huacnlee/go-auto-correct) - Go
16
15
 
17
- [![Gem Version](https://badge.fury.io/rb/auto-space.png)](https://rubygems.org/gems/auto-space) [![Build
18
- Status](https://secure.travis-ci.org/huacnlee/auto-space.png?branch=master&.png)](http://travis-ci.org/huacnlee/auto-space)
16
+ ## Features
19
17
 
20
- ## 使用说明
18
+ - Automatically add spaces between Chinese and English words.
19
+ - HTML content support.
21
20
 
22
- ```irb
23
- irb> require 'auto-correct'
24
- true
21
+ ## Usage
25
22
 
26
- irb> "关于SSH连接的Permission denied(publickey).".auto_space!
27
- 关于 SSH 连接的 Permission denied (publickey).
23
+ `AutoCorrect.format` method for plain text.
28
24
 
29
- irb> "怎样追踪一个repo的新feature 和进展呢?".auto_space!
30
- 怎样追踪一个 repo 的新 feature 和进展呢?
25
+ ```ruby
26
+ AutoCorrect.format("关于ssh连接的Permission denied(publickey).")
27
+ # => "关于 SSH 连接的 Permission denied (publickey)."
31
28
 
32
- irb> "vps上sessions不生效,但在本地的环境是ok的,why?".auto_space!
33
- vps sessions 不生效,但在本地的环境是 OK 的,why?
29
+ AutoCorrect.format("怎样追踪一个repo的新feature 和进展呢?")
30
+ # => "怎样追踪一个 repo 的新 feature 和进展呢?"
34
31
 
35
- irb> "bootstrap control-group对齐问题".auto_space!
36
- bootstrap control-group 对齐问
37
- ```
32
+ AutoCorrect.format("vps上sessions不生效,但在本地的环境是ok的,why?")
33
+ # => "VPS 上 sessions 不生效,但在本地的环境是 OK 的,why?"
38
34
 
39
- ## 性能
35
+ AutoCorrect.format("bootstrap control-group对齐问题")
36
+ # => "Bootstrap control-group 对齐问题"
37
+ ```
40
38
 
41
- 详见 Rakefile
39
+ `AutoCorrect.format_html` method for HTML content.
42
40
 
41
+ ```ruby
42
+ AutoCorrect.format_html("<div><p>长桥LongBridge App下载</p><p>最新版本1.0</p></div>")
43
+ # => "<div><p>长桥 LongBridge App 下载</p><p>最新版本 1.0</p></div>"
43
44
  ```
44
- $ rake benchmark
45
- user system total real
46
- 100 times 0.000000 0.000000 0.000000 ( 0.002223)
47
- 1000 times 0.030000 0.000000 0.030000 ( 0.024711)
48
- 10000 times 0.230000 0.000000 0.230000 ( 0.240850)
49
- ```
50
45
 
51
- ## TODO
46
+ ## Benchmark
47
+
48
+ TODO
52
49
 
53
- * 'Foo'的"Bar" -> 'Foo' 的 "Bar"
54
- * 什么,时候 -> 什么, 时候 -> 什么,时候
55
50
 
56
- ## 应用案例
51
+ ## Use cases
57
52
 
58
53
  * [Ruby China](http://ruby-china.org) - titles across the whole site are currently converted automatically.
59
54
 
60
- ## 参考内容
55
+ ## Links
61
56
 
62
57
  * [Chinese Copywriting Guidelines](https://github.com/sparanoid/chinese-copywriting-guidelines)
58
+
59
+ ## License
60
+
61
+ This project is released under the MIT license.
@@ -1,40 +1,11 @@
1
- # coding: utf-8
2
- require "auto-correct/dicts"
1
+ require "auto-correct/strategery"
2
+ require "auto-correct/base"
3
+ require "auto-correct/format"
4
+ require "auto-correct/html"
5
+ require "auto-correct/string"
6
+ require "auto-correct/version"
3
7
 
4
- class String
5
- def auto_space!
6
- self.gsub! /((?![年月日号])\p{Han})([a-zA-Z0-9+$@#\[\(\/‘“])/u do
7
- "#$1 #$2"
8
- end
9
-
10
- self.gsub! /([a-zA-Z0-9+$’”\]\)@#!\/]|[\d[年月日]]{2,})((?![年月日号])\p{Han})/u do
11
- "#$1 #$2"
12
- end
13
-
14
- # Fix () [] near the English and number
15
- self.gsub! /([a-zA-Z0-9]+)([\[\(‘“])/u do
16
- "#$1 #$2"
17
- end
18
-
19
- self.gsub! /([\)\]’”])([a-zA-Z0-9]+)/u do
20
- "#$1 #$2"
21
- end
22
-
23
- self
24
- end
25
-
26
- def auto_correct!
27
- self.auto_space!
28
-
29
- self.gsub! /([\d\p{Han}]|\s|^)([a-zA-Z\d\-\_\.]+)([\d\p{Han}]|\s|$)/u do
30
- key = "#$2".downcase
31
- if AutoCorrect::DICTS.has_key?(key)
32
- ["#$1",AutoCorrect::DICTS[key],"#$3"].join("")
33
- else
34
- "#$1#$2#$3"
35
- end
36
- end
37
-
38
- self
39
- end
8
+ class AutoCorrect
40
9
  end
10
+
11
+ String.send :include, AutoCorrect::String
@@ -0,0 +1,13 @@
1
+ class AutoCorrect
2
+ @@strategies = []
3
+
4
+ class << self
5
+ def rule(one, other, space: false, reverse: false)
6
+ @@strategies << AutoCorrect::Strategery.new(one, other, space: space, reverse: reverse)
7
+ end
8
+
9
+ def strategies
10
+ @@strategies
11
+ end
12
+ end
13
+ end
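
Note on the hunk above: it keeps a class-level list of strategies that `rule` appends to and `format` later walks in order. A minimal sketch of that registry idea follows, assuming a hypothetical `SpacingSketch` module that stores plain regexes instead of `AutoCorrect::Strategery` objects and only ever adds a space. It is an illustration, not the gem's code.

```ruby
# Hypothetical stand-in for the rule registry; not the gem's own classes.
module SpacingSketch
  RULES = []

  # Register a pair of pattern fragments. With reverse: true the
  # other-then-one order is registered as well.
  def self.rule(one, other, reverse: false)
    RULES << /(#{one})(#{other})/u
    RULES << /(#{other})(#{one})/u if reverse
  end

  # Same fragments as the EnglishLetter rule in the diff.
  rule '\p{Han}', '[0-9a-zA-Z]', reverse: true

  # Apply every registered pattern in order, inserting one space
  # between the two captured sides.
  def self.format(str)
    RULES.reduce(str) { |out, re| out.gsub(re) { "#{$1} #{$2}" } }
  end
end

puts SpacingSketch.format("关于SSH连接")
```

Because each pattern is applied to the output of the previous one, registration order matters once rules start to overlap.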
@@ -0,0 +1,36 @@
1
+ class AutoCorrect
2
+ # rubocop:disable Style/StringLiterals
3
+ # EnglishLetter
4
+ rule '\p{Han}', '[0-9a-zA-Z]', space: true, reverse: true
5
+
6
+ # SpecialSymbol
7
+ rule '\p{Han}', '[\|+$@#]', space: true, reverse: true
8
+ rule '\p{Han}', '[\[\(‘“]', space: true
9
+ rule '[’”\]\)!%]', '\p{Han}', space: true
10
+ rule '[”\]\)!]', '[a-zA-Z0-9]+', space: true
11
+
12
+ # FullwidthPunctuation
13
+ rule '[\w\p{Han}]', '[,。!?:;」》】”’]', reverse: true
14
+ rule '[‘“【「《]', '[\w\p{Han}]', reverse: true
15
+
16
+ class << self
17
+ FULLDATE_RE = /[\s]{0,}\d+[\s]{0,}年[\s]{0,}\d+[\s]{0,}月[\s]{0,}\d+[\s]{0,}[日号][\s]{0,}/u
18
+
19
+ def format(str)
20
+ out = str
21
+ self.strategies.each do |s|
22
+ out = s.format(out)
23
+ end
24
+ out = remove_full_date_spacing(out)
25
+ out
26
+ end
27
+
28
+ private
29
+
30
+ def remove_full_date_spacing(str)
31
+ str.gsub(FULLDATE_RE) do |m|
32
+ m.gsub(/\s+/, "")
33
+ end
34
+ end
35
+ end
36
+ end
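
Note on the hunk above: after the spacing rules run, a full date such as `2020年1月9日` would have picked up spaces around its digits, and `remove_full_date_spacing` strips them again. The sketch below isolates that step. It uses `\s*` where the diff writes `[\s]{0,}` (the two are equivalent), and the sample sentence is made up for illustration.

```ruby
# Matches year/month/day dates with optional whitespace around each part,
# including whitespace immediately before and after the date.
FULLDATE_RE = /\s*\d+\s*年\s*\d+\s*月\s*\d+\s*[日号]\s*/u

# Remove every whitespace run inside each matched date.
def remove_full_date_spacing(str)
  str.gsub(FULLDATE_RE) { |m| m.gsub(/\s+/, "") }
end

puts remove_full_date_spacing("发布于 2020 年 1 月 9 日 的版本")
```

Since the pattern also consumes whitespace on either side of the date, the spaces separating the date from the neighbouring text are removed too.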
@@ -0,0 +1,14 @@
1
+ require "nokogiri"
2
+
3
+ class AutoCorrect
4
+ class << self
5
+ def format_html(html)
6
+ doc = Nokogiri::HTML(html)
7
+ doc.traverse do |node|
8
+ next unless node.node_type == Nokogiri::XML::Node::TEXT_NODE
9
+ node.content = AutoCorrect.format(node.content)
10
+ end
11
+ doc.css("body").inner_html
12
+ end
13
+ end
14
+ end
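
Note on the hunk above: it parses the input with Nokogiri and rewrites only text nodes, so tag names and attribute values are left alone. The sketch below illustrates that "text only" idea with the standard library alone. It swaps the parser for a deliberately naive regex that treats anything between `>` and `<` as text, and `add_spacing` is a hypothetical one-rule stand-in for `AutoCorrect.format`. It would mishandle scripts, comments and malformed markup, which is presumably why the gem uses a real parser.

```ruby
# Hypothetical helper covering only the Han/alphanumeric rule.
def add_spacing(text)
  text.gsub(/(\p{Han})([0-9a-zA-Z])/u) { "#{$1} #{$2}" }
      .gsub(/([0-9a-zA-Z])(\p{Han})/u) { "#{$1} #{$2}" }
end

# Naive stand-in for a parser: format only the runs of characters
# that sit between a closing ">" and the next "<".
def format_html_sketch(html)
  html.gsub(/>([^<>]+)</) { ">#{add_spacing($1)}<" }
end

puts format_html_sketch("<div><p>长桥LongBridge App下载</p><p>最新版本1.0</p></div>")
puts format_html_sketch('<a title="中文abc">中文abc</a>')
```

In the second call the `title` attribute keeps its original value because it never appears between `>` and `<`.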
@@ -0,0 +1,43 @@
1
+ class AutoCorrect
2
+ class Strategery
3
+ attr_reader :space, :reverse
4
+ attr_reader :add_space_rules, :remove_space_rules
5
+
6
+ def initialize(one, other, space: false, reverse: false)
7
+ @space = space
8
+ @reverse = reverse
9
+
10
+ @add_space_rules = [
11
+ /(#{one})(#{other})/u,
12
+ /(#{other})(#{one})/u
13
+ ]
14
+
15
+ @remove_space_rules = [
16
+ /(#{one})\s+(#{other})/u,
17
+ /(#{other})\s+(#{one})/u
18
+ ]
19
+ end
20
+
21
+ def format(str)
22
+ self.space ? add_space(str) : remove_space(str)
23
+ end
24
+
25
+ def add_space(str)
26
+ r0, r1 = add_space_rules
27
+ str = str.gsub(r0) { "#$1 #$2" }
28
+ if self.reverse
29
+ str = str.gsub(r1) { "#$1 #$2" }
30
+ end
31
+ str
32
+ end
33
+
34
+ def remove_space(str)
35
+ r0, r1 = remove_space_rules
36
+ str = str.gsub(r0) { "#$1 #$2" }
37
+ if self.reverse
38
+ str = str.gsub(r1) { "#$1 #$2" }
39
+ end
40
+ str
41
+ end
42
+ end
43
+ end
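
Note on the hunk above: `remove_space` uses the same `"#$1 #$2"` replacement as `add_space`, so a whitespace run between the two sides is collapsed to a single space rather than dropped. Whether that is intended is not clear from the diff. The sketch below compares the replacement as written with an assumed alternative that joins the captures directly. The method names and the sample string are hypothetical.

```ruby
# Pattern fragments copied from the FullwidthPunctuation rule in the diff.
ONE   = '[\w\p{Han}]'
OTHER = '[,。!?:;」》】”’]'

# Same shape as the first entry of remove_space_rules.
REMOVE_RE = /(#{ONE})\s+(#{OTHER})/u

# Replacement as written in the diff: whitespace run becomes one space.
def collapse_as_written(str)
  str.gsub(REMOVE_RE) { "#{$1} #{$2}" }
end

# Assumed alternative (not in the diff): join the captures with no space.
def collapse_joined(str)
  str.gsub(REMOVE_RE) { "#{$1}#{$2}" }
end

sample = "你好  ,世界"
puts collapse_as_written(sample)
puts collapse_joined(sample)
```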
@@ -0,0 +1,13 @@
1
+ class AutoCorrect
2
+ module String
3
+ def auto_space!
4
+ ActiveSupport::Deprecation.warn("String.auto_space! is deprecated and will be removed in auto-correct 1.0, please use AutoCorrect.format instead.")
5
+ self.sub!(self, AutoCorrect.format(self))
6
+ end
7
+
8
+ def auto_correct!
9
+ ActiveSupport::Deprecation.warn("String.auto_correct! is deprecated and will be removed in auto-correct 1.0, please use AutoCorrect.format instead.")
10
+ self.sub!(self, AutoCorrect.format(self))
11
+ end
12
+ end
13
+ end
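
Note on the hunk above: the deprecated methods call `ActiveSupport::Deprecation` while the metadata hunk in this diff replaces the `activesupport` dependency with `nokogiri`, so they appear to rely on the host application loading ActiveSupport. They also update the receiver with `sub!(self, ...)`, where the replacement string is subject to backslash-sequence interpretation. The sketch below shows one possible shape for such a shim using only core Ruby: `Kernel#warn` for the notice and `String#replace` for the in-place update. The module name, method name and message are hypothetical, and the formatting step is reduced to a single rule.

```ruby
# Hypothetical module; mirrors the shape of AutoCorrect::String only loosely.
module SpacingShim
  def auto_space_sketch!
    # Kernel#warn writes to stderr and needs no extra dependency.
    warn "auto_space_sketch! is deprecated (illustrative message)"
    # String#replace swaps the receiver's contents in place and returns self.
    replace(gsub(/(\p{Han})([0-9a-zA-Z])/u) { "#{$1} #{$2}" })
  end
end

title = "中文abc".dup        # dup so the string is not frozen
title.extend(SpacingShim)    # extend one object instead of patching String
title.auto_space_sketch!
puts title
```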
@@ -0,0 +1,3 @@
1
+ class AutoCorrect
2
+ VERSION = "0.2.0"
3
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: auto-correct
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0.pre0
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Luikore
@@ -9,23 +9,24 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2014-07-15 00:00:00.000000000 Z
12
+ date: 2020-01-09 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
- name: activesupport
15
+ name: nokogiri
16
16
  requirement: !ruby/object:Gem::Requirement
17
17
  requirements:
18
- - - ">"
18
+ - - ">="
19
19
  - !ruby/object:Gem::Version
20
- version: 3.0.0
20
+ version: '1.4'
21
21
  type: :runtime
22
22
  prerelease: false
23
23
  version_requirements: !ruby/object:Gem::Requirement
24
24
  requirements:
25
- - - ">"
25
+ - - ">="
26
26
  - !ruby/object:Gem::Version
27
- version: 3.0.0
28
- description: "自动给中文英文之间加入合理的空格"
27
+ version: '1.4'
28
+ description: Automatically add whitespace between Chinese and half-width characters
29
+ (alphabetical letters, numerical digits and symbols).
29
30
  email:
30
31
  - usurffx@gmail.com
31
32
  - huacnlee@gmail.com
@@ -35,7 +36,12 @@ extra_rdoc_files: []
35
36
  files:
36
37
  - README.md
37
38
  - lib/auto-correct.rb
38
- - lib/auto-correct/dicts.rb
39
+ - lib/auto-correct/base.rb
40
+ - lib/auto-correct/format.rb
41
+ - lib/auto-correct/html.rb
42
+ - lib/auto-correct/strategery.rb
43
+ - lib/auto-correct/string.rb
44
+ - lib/auto-correct/version.rb
39
45
  homepage: https://github.com/huacnlee/auto-correct
40
46
  licenses: []
41
47
  metadata: {}
@@ -50,14 +56,13 @@ required_ruby_version: !ruby/object:Gem::Requirement
50
56
  version: '0'
51
57
  required_rubygems_version: !ruby/object:Gem::Requirement
52
58
  requirements:
53
- - - ">"
59
+ - - ">="
54
60
  - !ruby/object:Gem::Version
55
- version: 1.3.1
61
+ version: '0'
56
62
  requirements: []
57
- rubyforge_project:
58
- rubygems_version: 2.2.2
63
+ rubygems_version: 3.0.3
59
64
  signing_key:
60
65
  specification_version: 4
61
- summary: "自动给中文英文之间加入合理的空格"
66
+ summary: Automatically add whitespace between Chinese and half-width characters
67
+ (alphabetical letters, numerical digits and symbols).
62
68
  test_files: []
63
- has_rdoc:
@@ -1,103 +0,0 @@
1
- module AutoCorrect
2
- DICTS = {
3
- # Ruby
4
- "ruby" => "Ruby",
5
- "rails" => "Rails",
6
- "rubygems" => "RubyGems",
7
- "ror" => "Ruby on Rails",
8
- "rubyconf" => "RubyConf",
9
- "railsconf" => "RailsConf",
10
- "rubytuesday" => "Ruby Tuesday",
11
- "jruby" => "JRuby",
12
- "mruby" => "mRuby",
13
- "rvm" => "RVM",
14
- "rbenv" => "rbenv",
15
- "yard" => "YARD",
16
- "rdoc" => "RDoc",
17
- "rspec" => "RSpec",
18
- "minitest" => "MiniTest",
19
- "coffeescript" => "CoffeeScript",
20
- "scss" => "SCSS",
21
- "sass" => "Sass",
22
- "sidekiq" => "Sidekiq",
23
- "railscasts" => "RailsCasts",
24
- "execjs" => "ExecJS",
25
-
26
- # Python
27
-
28
- # Node.js
29
- "nodejs" => "Node.js",
30
-
31
- # Go
32
-
33
- # Cocoa
34
- "reactivecocoa" => "ReactiveCocoa",
35
-
36
- # Programming
37
- "ssh" => "SSH",
38
- "css" => "CSS",
39
- "html" => "HTML",
40
- "javascript" => "JavaScript",
41
- "js" => "JS",
42
- "png" => "PNG",
43
- "dsl" => "DSL",
44
- "tdd" => "TDD",
45
- "bdd" => "BDD",
46
-
47
- # Sites
48
- "github" => "GitHub",
49
- "gist" => "Gist",
50
- "ruby_china" => "Ruby China",
51
- "ruby-china" => "Ruby China",
52
- "rubychina" => "Ruby China",
53
- "v2ex" => "V2EX",
54
- "heroku" => "Heroku",
55
- "stackoverflow" => "Stack Overflow",
56
- "stackexchange" => "StackExchange",
57
-
58
-
59
- # Databases
60
- "mysql" => "MySQL",
61
- "postgresql" => "PostgreSQL",
62
- "sqlite" => "SQLite",
63
- "mongodb" => "MongoDB",
64
- "rethinkdb" => "RethinkDB",
65
- "elasticsearch" => "Elasticsearch",
66
- "sphinx" => "Sphinx",
67
-
68
- # OpenSource Projects
69
- "gitlab" => "GitLab",
70
- "gitlabci" => "GitLab CI",
71
- "fontawsome" => "FontAwsome",
72
- "bootstrap" => "Bootstrap",
73
- "less" => "Less",
74
- "jquery" => "jQuery",
75
- "requirejs" => "RequireJS",
76
- "underscore" => "Underscore",
77
- "backbone" => "Backbone",
78
- "seajs" => "SeaJS",
79
- "imagemagick" => "ImageMagick",
80
-
81
- # Tools
82
- "vim" => "VIM",
83
- "emacs" => "Emacs",
84
- "textmate" => "TextMate",
85
- "sublime" => "Sublime",
86
- "rubymine" => "RubyMine",
87
- "sequelpro" => "Sequel Pro",
88
- "virtualbox" => "VirtualBox",
89
- "safari" => "Safari",
90
- "chrome" => "Chrome",
91
- "ie" => "IE",
92
-
93
- # Misc
94
- "ios" => "iOS",
95
- "iphone" => "iPhone",
96
- "android" => "Android",
97
- "osx" => "OS X",
98
- "mac" => "Mac",
99
- "api" => "API",
100
- "wi-fi" => "Wi-Fi",
101
- "wifi" => "Wi-Fi"
102
- }
103
- end