chinese_vocab 0.8.0 → 0.8.1
- data/ChangeLog.md +9 -0
- data/README.md +37 -27
- data/lib/chinese/version.rb +1 -1
- metadata +11 -10
data/ChangeLog.md
ADDED
data/README.md
CHANGED
@@ -8,20 +8,20 @@
|
|
8
8
|
|
9
9
|
`Chinese::Vocab` addresses all of the above requirements by downloading sentences for each word and selecting the __minimum required number of Chinese sentences__ (and English translations) to __represent all words__.
|
10
10
|
|
11
|
-
You can then export the sentences as well as additional tags provided by `Chinese::Vocab` to Anki.
|
11
|
+
You can then export the sentences as well as additional tags provided by `Chinese::Vocab` to [Anki](http://ankisrs.net/).
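The selection step described above can be read as a covering problem: pick sentences until every word appears in at least one of them. The following is a minimal sketch of a greedy approach, assuming plain substring matching. `greedy_cover` is a hypothetical helper, not part of the `Chinese::Vocab` API, and the README does not say which algorithm the gem actually uses. A greedy pass gives a small cover, not a guaranteed minimum.

```ruby
# Hypothetical sketch, not the gem's implementation.
# Repeatedly picks the sentence that covers the most still-uncovered words.
def greedy_cover(words, sentences)
  uncovered = words.uniq
  chosen = []
  until uncovered.empty?
    best = sentences.max_by { |s| uncovered.count { |w| s.include?(w) } }
    # Stop if no sentence covers any remaining word.
    break if best.nil? || uncovered.none? { |w| best.include?(w) }
    chosen << best
    uncovered.reject! { |w| best.include?(w) }
  end
  chosen
end

words = ["我", "他", "喝"]
sentences = ["我喝茶。", "他喝水。", "我和他喝酒。"]
p greedy_cover(words, sentences)
# => ["我和他喝酒。"]
```

Words that no sentence contains are simply left uncovered, which loosely mirrors the idea of a "not found" list.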
|
12
12
|
|
13
13
|
## Features
|
14
14
|
|
15
15
|
* Downloads sentences for each word in a Chinese vocabulary list and selects the __minimum required number of sentences__ to represent all words.
|
16
16
|
* With the option key `:compact` set to `true` on initialization, all single character words that also appear in at least one multi character word are removed. The reason behind this option is to __remove redundancy in meaning__ and focus on learning distinct words. Example: (["看", "看书"] => [看书])
|
17
|
-
* Adds additional __tags__ to every sentence that can be used in
|
17
|
+
* Adds additional __tags__ to every sentence that can be used in [Anki](http://ankisrs.net/):
|
18
18
|
* __Pinyin__: By default the pinyin representation is added to each sentence. Example: "除了这张大钞以外,我没有其他零票了。" => "chú le zhè zhāng dà chāo yĭ wài ,wŏ méi yŏu qí tā líng piào le 。"
|
19
19
|
* __Number of target words__: The number of words from the vocabulary that are covered by a sentence. Example: "除了这张大钞以外,我没有其他零票了。" => "3_words"
|
20
20
|
* __List of target words__: A list of the words from the vocabulary that are covered by a sentence. Example: "除了这张大钞以外,我没有其他零票了。" => "[我, 他, 除了 以外]"
|
21
|
-
* Export data to csv for easy import from
|
21
|
+
* Export data to csv for easy import from [Anki](http://ankisrs.net/).
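The `:compact` behaviour from the feature list can be illustrated in a few lines of plain Ruby. This is a sketch under the assumption that "appears in" means substring containment. `compact_words` is a hypothetical name used only for illustration and is not a method of the gem.

```ruby
# Hypothetical helper, not part of the chinese_vocab API.
# Drops single-character words that occur inside any multi-character word.
def compact_words(words)
  multi = words.select { |w| w.size > 1 }
  words.reject { |w| w.size == 1 && multi.any? { |m| m.include?(w) } }
end

p compact_words(["看", "看书"])
# => ["看书"]
```

Single-character words that appear in no longer word are kept, so the list only loses entries whose meaning is already represented elsewhere.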
|
22
22
|
|
23
23
|
|
24
|
-
## Real World Example (
|
24
|
+
## Real World Example (Using the Traditional HSK Word List)
|
25
25
|
|
26
26
|
```` ruby
|
27
27
|
# Import words from source.
|
@@ -29,27 +29,52 @@ You can then export the sentences as well as additional tags provided by `Chines
|
|
29
29
|
# Second argument: column number of word column (counting starts at 1)
|
30
30
|
words = Chinese::Vocab.parse_words('../old_hsk_level_8828_chars_1_word_edited.csv', 4)
|
31
31
|
# Sample output:
|
32
|
-
words.take(6)
|
32
|
+
p words.take(6)
|
33
33
|
# => ["啊", "啊", "矮", "爱", "爱人", "安静"]
|
34
34
|
|
35
|
-
|
36
35
|
# Initialize an object.
|
37
36
|
# First argument: word list as an array of strings.
|
38
37
|
# Options:
|
39
38
|
# :compact (defaults to false)
|
40
39
|
anki = Chinese::Vocab.new(words, :compact => true)
|
41
40
|
|
42
|
-
# List all words
|
43
|
-
p anki.words.take(6)
|
44
|
-
# => ["啊", "啊", "矮", "爱", "爱人", "安静"]
|
45
|
-
p anki.words.size
|
46
|
-
# => 7251
|
47
|
-
|
48
41
|
# Options:
|
49
42
|
# :source (defaults to :nciku)
|
50
43
|
# :size (defaults to :short)
|
51
44
|
# :with_pinyin (defaults to true)
|
52
45
|
anki.min_sentences(:thread_count => 10)
|
46
|
+
# Sample output:
|
47
|
+
# [{:word=>"吧", :chinese=>"放心吧,他做事向来把牢。",
|
48
|
+
# :pinyin=>"fàng xīn ba ,tā zuò shì xiàng lái bă láo 。",
|
49
|
+
# :english=>"Take it easy. You can always count on him."},
|
50
|
+
# {:word=>"喝", :chinese=>"喝酒挂红的人一般都很能喝。",
|
51
|
+
# :pinyin=>"hē jiŭ guà hóng de rén yī bān dōu hĕn néng hē 。",
|
52
|
+
# :english=>"Those whose face turn red after drinking are normally heavy drinkers."}]
|
53
|
+
|
54
|
+
# Save data to csv.
|
55
|
+
# First parameter: path to file
|
56
|
+
# Options:
|
57
|
+
# Any supported option of Ruby's CSV libary
|
58
|
+
anki.to_csv('in_the_wild_test.csv')
|
59
|
+
# Sample output (2 sentences/lines out of 4511):
|
60
|
+
|
61
|
+
# 只要我们有信心,就会战胜困难。,zhī yào wŏ men yŏu xìn xīn ,jiù huì zhàn shèng kùn nán 。,
|
62
|
+
# "As long as we have confidence, we can overcome difficulties.",
|
63
|
+
# 5_words,"[信心, 只要, 困难, 我们, 战胜]"
|
64
|
+
# 至于他什么时候回来,我不知道。,zhì yú tā shén mo shí hòu huí lái ,wŏ bù zhī dào 。,
|
65
|
+
# "As to what time he's due back, I'm just not sure.",
|
66
|
+
# 5_words,"[什么, 回来, 时候, 知道, 至于]"
|
67
|
+
````
|
68
|
+
|
69
|
+
#### Additional methods
|
70
|
+
|
71
|
+
```` ruby
|
72
|
+
# List all words
|
73
|
+
p anki.words.take(6)
|
74
|
+
# => ["啊", "啊", "矮", "爱", "爱人", "安静"]
|
75
|
+
|
76
|
+
p anki.words.size
|
77
|
+
# => 7251
|
53
78
|
|
54
79
|
p anki.stored_sentences.take(2)
|
55
80
|
# [{:word=>"吧", :chinese=>"放心吧,他做事向来把牢。",
|
@@ -66,21 +91,6 @@ p anki.not_found
|
|
66
91
|
# Number of unique characters in the selected sentences
|
67
92
|
p anki.sentences_unique_chars.size
|
68
93
|
# => 3290
|
69
|
-
|
70
|
-
# Save data to csv.
|
71
|
-
# First parameter: path to file
|
72
|
-
# Options:
|
73
|
-
# Any supported option of Ruby's CSV libary
|
74
|
-
anki.to_csv('in_the_wild_test.csv')
|
75
|
-
# Sample output (2 sentences/lines out of 4511):
|
76
|
-
|
77
|
-
# 舞台上正在上演的是吕剧。,wŭ tái shàng zhèng zài shàng yăn de shì lǚ jù 。,
|
78
|
-
# What is being performed on the stage is Lv opera (a local opera of Shandong Province).
|
79
|
-
# ,2_words,"[正在, 舞台]"
|
80
|
-
# 古代官员上朝都要穿朝靴。,gŭ dài guān yuán shàng cháo dōu yào chuān cháo xuē 。,
|
81
|
-
# "In ancient times, all courtiers had to wear special boots to enter the court.",
|
82
|
-
# 2_words,"[古代, 官员]"
|
83
|
-
|
84
94
|
````
|
85
95
|
|
86
96
|
## Documentation
|
data/lib/chinese/version.rb
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: chinese_vocab
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.8.0
|
4
|
+
version: 0.8.1
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -13,7 +13,7 @@ date: 2012-04-13 00:00:00.000000000Z
|
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: with_validations
|
16
|
-
requirement: &
|
16
|
+
requirement: &21585540 !ruby/object:Gem::Requirement
|
17
17
|
none: false
|
18
18
|
requirements:
|
19
19
|
- - ! '>='
|
@@ -21,10 +21,10 @@ dependencies:
|
|
21
21
|
version: '0'
|
22
22
|
type: :runtime
|
23
23
|
prerelease: false
|
24
|
-
version_requirements: *
|
24
|
+
version_requirements: *21585540
|
25
25
|
- !ruby/object:Gem::Dependency
|
26
26
|
name: nokogiri
|
27
|
-
requirement: &
|
27
|
+
requirement: &21585060 !ruby/object:Gem::Requirement
|
28
28
|
none: false
|
29
29
|
requirements:
|
30
30
|
- - ! '>='
|
@@ -32,10 +32,10 @@ dependencies:
|
|
32
32
|
version: '0'
|
33
33
|
type: :runtime
|
34
34
|
prerelease: false
|
35
|
-
version_requirements: *
|
35
|
+
version_requirements: *21585060
|
36
36
|
- !ruby/object:Gem::Dependency
|
37
37
|
name: string_to_pinyin
|
38
|
-
requirement: &
|
38
|
+
requirement: &21584620 !ruby/object:Gem::Requirement
|
39
39
|
none: false
|
40
40
|
requirements:
|
41
41
|
- - ! '>='
|
@@ -43,10 +43,10 @@ dependencies:
|
|
43
43
|
version: '0'
|
44
44
|
type: :runtime
|
45
45
|
prerelease: false
|
46
|
-
version_requirements: *
|
46
|
+
version_requirements: *21584620
|
47
47
|
- !ruby/object:Gem::Dependency
|
48
48
|
name: rspec
|
49
|
-
requirement: &
|
49
|
+
requirement: &21584200 !ruby/object:Gem::Requirement
|
50
50
|
none: false
|
51
51
|
requirements:
|
52
52
|
- - ! '>='
|
@@ -54,8 +54,8 @@ dependencies:
|
|
54
54
|
version: '0'
|
55
55
|
type: :development
|
56
56
|
prerelease: false
|
57
|
-
version_requirements: *
|
58
|
-
description: ! '===
|
57
|
+
version_requirements: *21584200
|
58
|
+
description: ! '===
|
59
59
|
|
60
60
|
This gem is meant to make live easier for any Chinese language student who:
|
61
61
|
|
@@ -88,6 +88,7 @@ files:
|
|
88
88
|
- lib/chinese/modules/helper_methods.rb
|
89
89
|
- lib/chinese/vocab.rb
|
90
90
|
- lib/chinese/scraper.rb
|
91
|
+
- ChangeLog.md
|
91
92
|
- README.md
|
92
93
|
- Rakefile
|
93
94
|
- LICENSE
|