baidu_ocr 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 4095080bad105c62e7e68123d07e9d56953330ff
4
+ data.tar.gz: 6a1b9f9ca4ad27e87ff5baabe31293e002814bda
5
+ SHA512:
6
+ metadata.gz: 3daac9c054a45045714ea665bfbd65b2ffaed4a50c3a90aaf70a565dd88cf045ebe661468a88838d88e944c4761f977809a6e4e43830410f07e9f2ad96696a3e
7
+ data.tar.gz: 4a42bce2a7d7ef1d71c6a7b2775fa0767e6299dcbca587a1009ba1005b4d2d7c4759d55d5978524b4a956b3cef69999f2bb3c9d779617b4f7df9910291768c0c
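The digests in checksums.yaml can be compared against a downloaded archive. A minimal sketch of that comparison, assuming the archive member is available as a local file; `checksums_match?` is a hypothetical helper and not part of the gem, and the demo computes its own expected digests on a throwaway file rather than using the values listed above:

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper (not part of baidu_ocr): true when both digests of the
# file at +path+ equal the expected hex strings.
def checksums_match?(path, sha1:, sha512:)
  Digest::SHA1.file(path).hexdigest == sha1 &&
    Digest::SHA512.file(path).hexdigest == sha512
end

# Demo on a temporary file; the expected values are computed here, they are
# not the checksums of the real gem archive.
Tempfile.create('checksum-demo') do |f|
  f.write('example payload')
  f.flush
  puts checksums_match?(f.path,
                        sha1: Digest::SHA1.hexdigest('example payload'),
                        sha512: Digest::SHA512.hexdigest('example payload'))
end
```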
data/.gitignore ADDED
@@ -0,0 +1,9 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --format documentation
2
+ --color
data/.travis.yml ADDED
@@ -0,0 +1,3 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.2.0
data/Gemfile ADDED
@@ -0,0 +1,5 @@
1
+ source 'https://ruby.taobao.org'
2
+
3
+ # Specify your gem's dependencies in baidu_ocr.gemspec
4
+ gemspec :name => 'baidu_ocr'
5
+
data/README.md ADDED
@@ -0,0 +1,113 @@
1
+ # BaiduOcr
2
+
3
+ Baidu OCR text recognition [API](http://apistore.baidu.com/apiworks/servicedetail/146.html) for Ruby
4
+
5
+ ## Installation
6
+
7
+ Add this line to your application's Gemfile:
8
+
9
+ ```ruby
10
+ gem 'baidu_ocr', :git => 'https://github.com/rudyboy/baidu_ocr'
11
+ ```
12
+
13
+ And then execute:
14
+
15
+ $ bundle
16
+
17
+
18
+ ## Usage
19
+
20
+ ``` ruby
21
+ require 'baidu_ocr'
22
+
23
+ # local image && imagetype = 1
24
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
25
+ image: 'bit.jpg',
26
+ imagetype: 1)
27
+ puts BaiduOcr.recognize
28
+
29
+ # local image && imagetype = 2
30
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
31
+ image: 'bit.jpg',
32
+ imagetype: 2)
33
+ puts BaiduOcr.recognize
34
+
35
+ # image from web && imagetype = 1
36
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
37
+ image: 'https://raw.githubusercontent.com/rudyboy/baidu_ocr/master/examples/bit.jpg',
38
+ imagetype: 1)
39
+ puts BaiduOcr.recognize
40
+
41
+ # image from web && imagetype = 2
42
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
43
+ image: 'https://raw.githubusercontent.com/rudyboy/baidu_ocr/master/examples/bit.jpg',
44
+ imagetype: 2)
45
+ puts BaiduOcr.recognize
46
+
47
+ ```
48
+
49
+ ## Result comparison
50
+
51
+ >1. Test result (screenshot from Baidu Baike; a cleaner source image gives better results):
52
+
53
+ ![Test image](https://github.com/rudyboy/baidu_ocr/blob/master/examples/bit.jpg?raw=true)
54
+
55
+ ```
56
+ 北京理工大学(R声gh汕吧y)是中华人民共和国工业和信息化部直属的―所以理工科为主干,
57
+ 工、理,管、文协调发展的全国重点大学,是国家21工程”、吗舒工程”首批重点建设高校,
58
+ 是中俄工科大学联盟”成员,入选“m计,划、“211计划,“卓越工程师教育培养计划`
59
+ “国家建设高水平大学公派研究生项目”,为中管副部级高校,设有研究生院,
60
+
61
+ 学校前身北京工业学院发源于19和年在延安成立的延安自然科学院,
62
+ 是中国共产党创办的第一所理工科大学;1988年,学校:更名为北京理工大学,
63
+
64
+ 截止加13年12月31日.该校有全日制在校生28674人.其中本科生14774人.硕士生为59人.
65
+ 博士生323人.1|学校拥有中,关村校区良乡校区、西山实验区珠海校区和秦皇岛分校;
66
+ 设有19个专业学院和教育研究院,基础教育学院、继续教育学院高等职业技术学院及珠海学院
67
+
68
+ M12年,学校首次进入QS世界大学排名“亚洲大学1∞强”和世界大学W强”,
69
+ 在入选的9所中国高校中名列第B位(并,列
70
+ ```
71
+
72
+ >2. The accuracy of this test result is quite high:
72
+
73
+ ![Test image](https://github.com/rudyboy/baidu_ocr/blob/master/examples/test.jpg?raw=true)
75
+
76
+ ```
77
+ 真金不怕火炼,百炼才能成钢,
78
+ 烈火无情的撕咬着钢的肌肤,而他,却一声不吭,他终,
79
+ 究是沉默的,可在这死一般寂静的黑夜里,黎明上的到
80
+ 来,毕竟是无法抗拒的,不是在沉默中爆发,就是在先沉默:
81
+ 中灭亡,他选择了坚持,因为他有一副铿铿锵锵的骨骼,他,
82
+ 有一条不灭的灵魂!所以,他战胜了黑夜,无情上的火屈服
83
+ 在他的脚下,他迎来了光明,他挺直的腰板,闪烁上的光;
84
+ 明,向世界宣告他的成功,
85
+ 风雨如罄上的时代,法西斯侵占了俄国的领土,他们妄:
86
+ 想征服全世界!他想让人们拜倒在他们脚下,可是,他们想,
87
+ 错了,人们不可能永远充实当傀儡!会有站起来的那一天:
88
+ 的人们盼望着一声春雷,一声响彻大地的春雷,一个革命,
89
+ 新的开始,
90
+ 保尔就生长在俄国与法西斯战争及其艰苦上的环境:
91
+ 中,他的一生是与挫折,困难做斗争的一生,他儿时被迫成:
92
+ 了童工,尝尽了人间的冷暖,是什么让他一次次硬挺下,
93
+ 去?是对,是钢铁般上的毅力,是烈火焚烧若等闲的信
94
+ 念!柯察金的青年,是冲破了拂晓的先沉默,是黎明上的到
95
+ 来,俄国人民崛起,为法西斯即来的灭亡敲响了丧钟。
96
+ ```
97
+ ## Contributing
98
+
99
+ 1. Fork it ( https://github.com/rudyboy/baidu_ocr/fork )
100
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
101
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
102
+ 4. Push to the branch (`git push origin my-new-feature`)
103
+ 5. Create a new Pull Request
104
+
105
+ ## LICENSE
106
+
107
+ The MIT License (MIT) Copyright (c) 2015 rudyboy
108
+
109
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
110
+
111
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
112
+
113
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile ADDED
@@ -0,0 +1 @@
1
+ require "bundler/gem_tasks"
data/baidu_ocr.gemspec ADDED
@@ -0,0 +1,32 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'baidu_ocr/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "baidu_ocr"
8
+ spec.version = BaiduOcr::VERSION
9
+ spec.authors = ["rudyboy"]
10
+ spec.email = ["useyes91@gmail.com"]
11
+
12
+ spec.summary = %q{add rspec and base image for ocr.}
13
+ spec.description = %q{Baidu OCR text recognition API for Ruby gems
14
+ http://apistore.baidu.com/apiworks/servicedetail/146.html}
15
+ spec.homepage = "https://github.com/rudyboy/baidu_ocr.git"
16
+ spec.license = "MIT"
17
+
18
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
19
+ spec.bindir = "exe"
20
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
21
+ spec.require_paths = ["lib"]
22
+
23
+ if spec.respond_to?(:metadata)
24
+ spec.metadata['allowed_push_host'] = "https://rubygems.org"
25
+ end
26
+
27
+ spec.add_development_dependency "bundler", "~> 1.9"
28
+ spec.add_development_dependency "rake", "~> 10.0"
29
+
30
+ spec.add_runtime_dependency "rest-client", "~> 1.7"
31
+ spec.add_runtime_dependency "json", "~> 1.8"
32
+ end
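The `~>` operator used in the dependency declarations is RubyGems' pessimistic version constraint. A small sketch of what it admits, using RubyGems' own `Gem::Requirement`; the `allowed?` helper and the version numbers passed to it are illustrative assumptions:

```ruby
require 'rubygems'

# Hypothetical helper: does +version+ satisfy +constraint+?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?('~> 1.7', '1.8.0')  # within 1.x, at or above 1.7
puts allowed?('~> 1.7', '2.0.0')  # next major version is excluded
puts allowed?('~> 1.7', '1.6.9')  # below the lower bound
```

With two segments, `~> 1.7` is equivalent to `>= 1.7` together with `< 2.0`.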
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "baidu_ocr"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
data/bin/setup ADDED
@@ -0,0 +1,7 @@
1
+ #!/bin/bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+
5
+ bundle install
6
+
7
+ # Do any other automated setup that you need to do here
data/examples/bit.jpg ADDED
Binary file
data/examples/examples.rb ADDED
@@ -0,0 +1,33 @@
1
+ require_relative '../lib/baidu_ocr'
2
+
3
+ # local image && imagetype = 1
4
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
5
+ image: 'bit.jpg',
6
+ imagetype: 1)
7
+ puts BaiduOcr.recognize
8
+
9
+ puts '-' * 30
10
+
11
+ # local image && imagetype = 2
12
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
13
+ image: 'bit.jpg',
14
+ imagetype: 2)
15
+ puts BaiduOcr.recognize
16
+
17
+ puts '-' * 30
18
+
19
+ # image from web && imagetype = 1
20
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
21
+ image: 'https://raw.githubusercontent.com/rudyboy/baidu_ocr/master/examples/bit.jpg',
22
+ imagetype: 1)
23
+ puts BaiduOcr.recognize
24
+
25
+ puts '-' * 30
26
+
27
+ # image from web && imagetype = 2
28
+ BaiduOcr.init_baidu_ocr(apikey: 'your api',
29
+ image: 'https://raw.githubusercontent.com/rudyboy/baidu_ocr/master/examples/bit.jpg',
30
+ imagetype: 2)
31
+ puts BaiduOcr.recognize
32
+
33
+
data/examples/test.jpg ADDED
Binary file
data/lib/baidu_ocr/encode_image.rb ADDED
@@ -0,0 +1,21 @@
1
+ require 'base64'
2
+
3
+ module BaiduOcr
4
+ module EncodeImage
5
+
6
+ class << self
7
+
8
+ def encode(image)
9
+ begin
10
+ puts "loading file from #{image}"
11
+ stream = open(image, 'rb') { |f| f.read }
12
+ rescue Errno::ENOENT => e
13
+ raise NotFound, e.message
14
+ end
15
+ # Ruby's Base64.encode64 inserts line breaks, which does not meet the API's requirements
16
+ Base64.strict_encode64 stream
17
+ end
18
+ end
19
+
20
+ end
21
+ end
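The choice of `Base64.strict_encode64` over `Base64.encode64` comes down to line breaks in the output. A short sketch of the difference, using a placeholder string in place of real image bytes:

```ruby
require 'base64'

data = 'x' * 100 # stand-in for image bytes, long enough to trigger wrapping

wrapped = Base64.encode64(data)        # inserts "\n" every 60 encoded characters
strict  = Base64.strict_encode64(data) # single line, no line feeds

puts wrapped.include?("\n")
puts strict.include?("\n")
puts wrapped.delete("\n") == strict
```

Apart from the inserted line feeds, both methods produce the same encoded text.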
data/lib/baidu_ocr/exceptions.rb ADDED
@@ -0,0 +1,17 @@
1
+ module BaiduOcr
2
+
3
+ class Exception < RuntimeError; end
4
+
5
+ class BlankArgsError < Exception
6
+ def initialize(blank_key)
7
+ super("Your args can not be blank. Please provide the #{blank_key}.")
8
+ end
9
+ end
10
+
11
+ class NotFound < Exception
12
+ def initialize(msg)
13
+ super("#{msg}")
14
+ end
15
+ end
16
+
17
+ end
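Because both error classes share one base class, a caller can rescue the whole family in a single clause. A self-contained sketch of that pattern: `DemoOcr` is a hypothetical stand-in that mirrors the hierarchy above (with the base class named `Error` here to avoid shadowing Ruby's built-in `Exception`), and `describe_failure` is an illustrative helper:

```ruby
# Hypothetical stand-in for the gem's error hierarchy.
module DemoOcr
  class Error < RuntimeError; end

  class BlankArgsError < Error
    def initialize(blank_key)
      super("Your args can not be blank. Please provide the #{blank_key}.")
    end
  end

  class NotFound < Error; end
end

# Runs the block and turns any DemoOcr error into a short description.
def describe_failure
  yield
  'ok'
rescue DemoOcr::Error => e
  "#{e.class.name.split('::').last}: #{e.message}"
end

puts describe_failure { raise DemoOcr::BlankArgsError, :apikey }
puts describe_failure { raise DemoOcr::NotFound, 'no such file' }
```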
data/lib/baidu_ocr/file_read.rb ADDED
@@ -0,0 +1,32 @@
1
+ require 'open-uri'
2
+ require 'tempfile'
3
+ require 'base64'
4
+
5
+ module BaiduOcr
6
+ module FileRead
7
+
8
+ class << self
9
+
10
+ def read(image)
11
+ # file_contents = open('local-file.jpg') { |f| f.read }
12
+ # web_contents = open('http://www.xxx.com.jpg') {|f| f.read }
13
+ tmpfile = Tempfile.new(["ocr", ".jpg"])
14
+ begin
15
+ # open-uri does not preserve the image extension (e.g. '.jpg')
16
+ # Baidu determines the file type from the file name, so for now a Tempfile is used to set a custom file name
17
+ # http://stackoverflow.com/questions/9940633/is-it-possible-to-have-open-uri-maintain-the-extension
18
+ puts "loading file from #{image}"
19
+ stream = open(image, 'rb') { |f| f.read }
20
+ rescue Errno::ENOENT => e
21
+ raise NotFound, e.message
22
+ end
23
+ tmpfile.write(stream)
24
+ file = open(tmpfile)
25
+ tmpfile.unlink
26
+
27
+ file
28
+ end
29
+ end
30
+
31
+ end
32
+ end
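The array form of `Tempfile.new` is what lets the temporary file keep a chosen extension. A minimal sketch of that technique on its own, assuming nothing about the gem beyond the `["ocr", ".jpg"]` naming; `with_named_tempfile` is a hypothetical helper and the three bytes are a placeholder, not a valid image:

```ruby
require 'tempfile'

# Hypothetical helper: writes +bytes+ to a temp file whose name ends in +ext+,
# yields the rewound file, and removes it afterwards.
def with_named_tempfile(bytes, ext)
  tmp = Tempfile.new(['ocr', ext]) # [basename prefix, suffix]
  tmp.binmode
  tmp.write(bytes)
  tmp.flush  # make sure the bytes are on disk before anyone reads the path
  tmp.rewind
  yield tmp
ensure
  tmp.close! if tmp # close and unlink
end

with_named_tempfile("\xFF\xD8\xFF".b, '.jpg') do |f|
  puts File.extname(f.path)
  puts f.read.bytesize
end
```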
data/lib/baidu_ocr/request.rb ADDED
@@ -0,0 +1,29 @@
1
+ require 'rest-client'
2
+ require 'json'
3
+
4
+ module BaiduOcr
5
+ module Request
6
+
7
+ class << self
8
+
9
+ API_URL = "http://apis.baidu.com/apistore/idlocr/ocr"
10
+
11
+ def send_request(opt = {})
12
+ headers = { apiKey: opt.delete(:apikey) }
13
+ params = opt
14
+ response = RestClient.post API_URL, params, headers
15
+ if response.code == 200
16
+ body = JSON.parse response.body
17
+ return words(body['retData']) if body['errNum'].to_i == 0
18
+ puts "Recognition failed: #{body['errMsg']}"
19
+ end
20
+ end
21
+
22
+ def words(text)
23
+ text.collect{|k| k['word']}.join("\n")
24
+ end
25
+
26
+ end
27
+
28
+ end
29
+ end
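The response handling reads three fields: `errNum`, `errMsg` and `retData`, where each `retData` entry carries a `word`. A sketch of that parsing step in isolation, with no network call; `extract_words` is a hypothetical helper and both payloads are invented samples shaped after the fields the code reads, not captured API output:

```ruby
require 'json'

# Hypothetical helper: returns the recognized lines joined by "\n",
# or nil when the service reports an error.
def extract_words(raw_body)
  body = JSON.parse(raw_body)
  return nil unless body['errNum'].to_i.zero?

  body['retData'].map { |entry| entry['word'] }.join("\n")
end

# Invented sample payloads (assumptions, not real responses).
success = '{"errNum":"0","errMsg":"success","retData":[{"word":"hello"},{"word":"world"}]}'
failure = '{"errNum":"300202","errMsg":"missing apikey","retData":[]}'

puts extract_words(success)
puts extract_words(failure).inspect
```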
data/lib/baidu_ocr/settings.rb ADDED
@@ -0,0 +1,34 @@
1
+ module BaiduOcr
2
+ module Settings
3
+
4
+ class << self
5
+ DEFAULT_OPTIONS = {
6
+ :fromdevice => 'pc',
7
+ :clientip => '10.10.10.0',
8
+ :detecttype => 'LocateRecognize',
9
+ :languagetype => "CHN_ENG",
10
+ :imagetype => "1",
11
+ :image => "",
12
+ :apikey => "your api key"
13
+ }
14
+
15
+ attr_reader :settings
16
+
17
+ REQUIRED_OPTION_KEYS = [:languagetype, :imagetype, :image, :apikey]
18
+
19
+ def set_baidu_ocr(opts = {})
20
+ @settings = DEFAULT_OPTIONS.merge(opts)
21
+
22
+ REQUIRED_OPTION_KEYS.each do |opt|
23
+ raise BlankArgsError, opt if @settings[opt].to_s.strip.empty?
24
+ end
25
+ end
26
+
27
+ def update_image(image)
28
+ @settings[:image] = image
29
+ end
30
+
31
+ end
32
+
33
+ end
34
+ end
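`Hash#merge` returns a new hash, while `Hash#merge!` modifies the receiver in place, which matters when the receiver is a shared defaults constant. A small sketch of the distinction; `DEFAULTS` and `build_settings` are hypothetical names with a reduced set of keys:

```ruby
# Hypothetical, reduced defaults; frozen so accidental mutation raises.
DEFAULTS = { imagetype: '1', image: '' }.freeze

# Builds a fresh settings hash for each call without touching DEFAULTS.
def build_settings(opts = {})
  DEFAULTS.merge(opts)
end

first = build_settings(image: 'a.jpg')
second = build_settings

puts first[:image]           # value supplied by the caller
puts second[:image].empty?   # the earlier call did not leak into the defaults
puts DEFAULTS[:image].empty?
```

Freezing the constant is an optional design choice: with it, an in-place `merge!` on `DEFAULTS` raises instead of silently changing the defaults for later calls.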
data/lib/baidu_ocr/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module BaiduOcr
2
+ VERSION = "0.1.0"
3
+ end
data/lib/baidu_ocr.rb ADDED
@@ -0,0 +1,32 @@
1
+ $:.unshift(File.dirname(__FILE__) + '/../lib')
2
+
3
+ require "baidu_ocr/version"
4
+ require "baidu_ocr/settings"
5
+ require "baidu_ocr/exceptions"
6
+ require "baidu_ocr/file_read"
7
+ require "baidu_ocr/request"
8
+ require "baidu_ocr/encode_image"
9
+
10
+ module BaiduOcr
11
+ class << self
12
+
13
+ def init_baidu_ocr(opt = {})
14
+ BaiduOcr::Settings.set_baidu_ocr opt
15
+ image = case opt[:imagetype].to_i
16
+ when 1
17
+ BaiduOcr::EncodeImage.encode opt[:image]
18
+ when 2
19
+ BaiduOcr::FileRead.read opt[:image]
20
+ else
21
+ raise BlankArgsError, "imagetype"
22
+ end
23
+
24
+ BaiduOcr::Settings.update_image(image)
25
+ end
26
+
27
+ def recognize
28
+ BaiduOcr::Request.send_request(BaiduOcr::Settings.settings)
29
+ end
30
+
31
+ end
32
+ end
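Ruby's `case` compares with `===`, so an Integer branch such as `when 1` does not match the String `"1"`. A sketch of a dispatch that accepts both forms by normalizing first; `image_mode` and the `:base64` / `:file_upload` labels are hypothetical names for the two branches, not identifiers from the gem:

```ruby
# Hypothetical dispatcher: accepts 1, 2, "1" or "2" and names the handling mode.
def image_mode(imagetype)
  case imagetype.to_s.strip
  when '1' then :base64       # image sent as a Base64 string
  when '2' then :file_upload  # image sent as a file
  else raise ArgumentError, "unsupported imagetype: #{imagetype.inspect}"
  end
end

puts 1 === '1'        # Integer#=== does not match a String
puts image_mode(1)
puts image_mode('2')
```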
metadata ADDED
@@ -0,0 +1,122 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: baidu_ocr
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - rudyboy
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2015-06-09 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.9'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.9'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rest-client
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '1.7'
48
+ type: :runtime
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '1.7'
55
+ - !ruby/object:Gem::Dependency
56
+ name: json
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '1.8'
62
+ type: :runtime
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '1.8'
69
+ description: |-
70
+ Baidu OCR text recognition API for Ruby gems
71
+ http://apistore.baidu.com/apiworks/servicedetail/146.html
72
+ email:
73
+ - useyes91@gmail.com
74
+ executables: []
75
+ extensions: []
76
+ extra_rdoc_files: []
77
+ files:
78
+ - ".gitignore"
79
+ - ".rspec"
80
+ - ".travis.yml"
81
+ - Gemfile
82
+ - README.md
83
+ - Rakefile
84
+ - baidu_ocr.gemspec
85
+ - bin/console
86
+ - bin/setup
87
+ - examples/bit.jpg
88
+ - examples/examples.rb
89
+ - examples/test.jpg
90
+ - lib/baidu_ocr.rb
91
+ - lib/baidu_ocr/encode_image.rb
92
+ - lib/baidu_ocr/exceptions.rb
93
+ - lib/baidu_ocr/file_read.rb
94
+ - lib/baidu_ocr/request.rb
95
+ - lib/baidu_ocr/settings.rb
96
+ - lib/baidu_ocr/version.rb
97
+ homepage: https://github.com/rudyboy/baidu_ocr.git
98
+ licenses:
99
+ - MIT
100
+ metadata:
101
+ allowed_push_host: https://rubygems.org
102
+ post_install_message:
103
+ rdoc_options: []
104
+ require_paths:
105
+ - lib
106
+ required_ruby_version: !ruby/object:Gem::Requirement
107
+ requirements:
108
+ - - ">="
109
+ - !ruby/object:Gem::Version
110
+ version: '0'
111
+ required_rubygems_version: !ruby/object:Gem::Requirement
112
+ requirements:
113
+ - - ">="
114
+ - !ruby/object:Gem::Version
115
+ version: '0'
116
+ requirements: []
117
+ rubyforge_project:
118
+ rubygems_version: 2.4.6
119
+ signing_key:
120
+ specification_version: 4
121
+ summary: add rspec and base image for ocr.
122
+ test_files: []