kusari 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 1bd23a95575a6e870596ea70b119829c01cbeaf0
4
+ data.tar.gz: e52a64cf85a4993763df1121d9cd0b7adb53d6c2
5
+ SHA512:
6
+ metadata.gz: 8ead23f301fa3745b1dc25e19b11f7d225595868cd365b1c067c17ae7dc393502cbaed79dcd63ae121ba15a352dbd0971c1ae3b55a34570258ef168ca589f0db
7
+ data.tar.gz: 7afdbd2f11989ddcdde5ae67d8bcd3d59156e7fe1a4965c870827c699c6e9a6ef3d8b67965604eb5e6c13e24f9c1e9b077a9d01a6ca8a78c66e9b8b80e7fec79
data/.gitignore ADDED
@@ -0,0 +1,9 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --format documentation
2
+ --color
data/.travis.yml ADDED
@@ -0,0 +1,4 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.2.3
4
+ before_install: gem install bundler -v 1.10.6
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,13 @@
1
+ # Contributor Code of Conduct
2
+
3
+ As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.
4
+
5
+ We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.
6
+
7
+ Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.
8
+
9
+ Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.
10
+
11
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.
12
+
13
+ This Code of Conduct is adapted from the [Contributor Covenant](http://contributor-covenant.org), version 1.0.0, available at [http://contributor-covenant.org/version/1/0/0/](http://contributor-covenant.org/version/1/0/0/)
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in kusari.gemspec
4
+ gemspec
data/README.md ADDED
@@ -0,0 +1,61 @@
1
+ # :link: Kusari
2
+
3
+ Japanese random sentence generator based on Markov chain.
4
+
5
+ ## Installation
6
+
7
+ Add this line to your application's Gemfile:
8
+
9
+ ```ruby
10
+ gem 'kusari'
11
+ ```
12
+
13
+ And then execute:
14
+
15
+ $ bundle
16
+
17
+ Or install it yourself as:
18
+
19
+ $ gem install kusari
20
+
21
+ ## Usage
22
+
23
+ First, load the gem and create a new generator instance:
24
+
25
+ ```ruby
26
+ require 'kusari'
27
+ generator = Kusari::Generator.new
28
+ # by default, the above statement is the same as:
29
+ # generator = Kusari::Generator.new(3, "./ipadic")
30
+ ```
31
+
32
+ The first argument, `3`, is the N of the N-gram model used to build the table of tokenized words. You can pass a different N, but `generate` in this version reads each table entry at fixed indices 0 to 2, so values smaller than 3 are not supported. The second argument, `./ipadic`, is the path to the [IPA dictionary](http://taku910.github.io/mecab/#download), which the generator uses to parse Japanese strings.
33
+
34
+ Next, add strings (the reference sentences for the Markov chain) with `add_string`:
35
+
36
+ ```ruby
37
+ generator.add_string("ネロとパトラッシュは、この世で二人きりでした。")
38
+ generator.add_string("彼らは、実の兄弟よりも仲のよい大の親友でした。")
39
+ generator.add_string("ネロは、アルデンネ生まれの少年でした。")
40
+ ```
41
+
42
+ Finally, generate a random sentence:
43
+
44
+ ```ruby
45
+ sentence = generator.generate(140)
46
+ p sentence
47
+ # => "ネロは、アルデンネ生まれの兄弟よりも仲のよい大の少年でした。"
48
+ ```
49
+
50
+ Here, the argument to `generate` sets the length limit for the generated sentence; for example, `generator.generate(140)` creates a sentence that can be posted on Twitter.
51
+
52
+ ## Development
53
+
54
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
55
+
56
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
57
+
58
+ ## Contributing
59
+
60
+ Bug reports and pull requests are welcome on GitHub at https://github.com/takuti/kusari. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
61
+
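The Usage section of the README describes three steps: create a generator, add reference sentences, and generate a sentence up to a length limit. The following is a minimal, self-contained sketch of that flow, not the gem's actual code. It assumes the sentences are already split into words (so neither `igo-ruby` nor the IPA dictionary is needed), fixes N at 3, and uses the hypothetical helper names `build_table` and `generate_sentence`.

```ruby
# Sketch only: sentences arrive pre-tokenized, standing in for the tagger.
# build_table and generate_sentence are illustrative names, not Kusari's API.
def build_table(sentences, head: "[HEAD]", tail: "[TAIL]")
  sentences
    .select { |words| words.size >= 2 } # need at least 4 tokens with markers
    .flat_map { |words| [head, *words, tail].each_cons(3).to_a }
end

def generate_sentence(table, limit, rng: Random.new, head: "[HEAD]", tail: "[TAIL]")
  # Start from a randomly chosen window that begins a sentence.
  current = table.select { |window| window[0] == head }.sample(random: rng)
  return "" if current.nil? # assumption: empty table yields an empty string

  sentence = current[1] + current[2]
  loop do
    # Windows whose first word matches the last word appended so far.
    candidates = table.select { |window| window[0] == current[2] }
    break if candidates.empty?

    current = candidates.sample(random: rng)
    if current[2] == tail
      sentence += current[1]
      break
    end

    addition = current[1] + current[2]
    break if sentence.length + addition.length > limit
    sentence += addition
  end
  sentence
end

sentences = [
  %w[ネロ は 少年 でし た],
  %w[彼ら は 親友 でし た]
]
rng = Random.new(42)
sentence = generate_sentence(build_table(sentences), 140, rng: rng)
puts sentence
```

Passing a seeded `Random` makes a run repeatable, although which sentence comes out still depends on the seed.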
data/Rakefile ADDED
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "kusari"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
data/bin/setup ADDED
@@ -0,0 +1,7 @@
1
+ #!/bin/bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+
5
+ bundle install
6
+
7
+ # Do any other automated setup that you need to do here
data/kusari.gemspec ADDED
@@ -0,0 +1,26 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'kusari/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "kusari"
8
+ spec.version = Kusari::VERSION
9
+ spec.license = "MIT"
10
+ spec.authors = ["takuti"]
11
+ spec.email = ["k.takuti@gmail.com"]
12
+
13
+ spec.summary = %q{Japanese random sentence generator based on Markov chain.}
14
+ spec.homepage = "https://github.com/takuti/kusari"
15
+
16
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
17
+ spec.bindir = "exe"
18
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
19
+ spec.require_paths = ["lib"]
20
+
21
+ spec.add_dependency "igo-ruby", "~> 0.1.5"
22
+
23
+ spec.add_development_dependency "bundler", "~> 1.10"
24
+ spec.add_development_dependency "rake", "~> 10.0"
25
+ spec.add_development_dependency "rspec"
26
+ end
data/lib/kusari.rb ADDED
@@ -0,0 +1,18 @@
1
+ require "kusari/markov_sentence_generator"
2
+ require "kusari/version"
3
+
4
+ module Kusari
5
+ class Generator
6
+ def initialize(gram=3, ipadic_path="./ipadic")
7
+ @generator = MarkovSentenceGenerator.new(gram, ipadic_path)
8
+ end
9
+
10
+ def add_string(string)
11
+ @generator.add(string)
12
+ end
13
+
14
+ def generate(limit)
15
+ @generator.generate(limit)
16
+ end
17
+ end
18
+ end
data/lib/kusari/markov_sentence_generator.rb ADDED
@@ -0,0 +1,80 @@
1
+ # coding: utf-8
2
+
3
+ require "igo-ruby"
4
+
5
+ class MarkovSentenceGenerator
6
+ HEAD = "[HEAD]"
7
+ TAIL = "[TAIL]"
8
+
9
+ def initialize(gram=3, ipadic_path="./ipadic")
10
+ @gram = gram
11
+
12
+ # Japanese tokenizer
13
+ @tagger = Igo::Tagger.new(ipadic_path)
14
+
15
+ # save arrays of tokenized words based on the N-gram model
16
+ @markov_table = Array.new
17
+ end
18
+
19
+ def tokenize(string)
20
+ tokens = Array.new
21
+ tokens << HEAD
22
+ tokens += @tagger.wakati(string)
23
+ tokens << TAIL
24
+ end
25
+
26
+ def add(string)
27
+ tokens = tokenize(string)
28
+
29
+ # if there are at least 4 tokens, we can create both of HEAD-started and TAIL-ended array of words
30
+ return if tokens.size < 4
31
+
32
+ # update markov_table
33
+ i = 0
34
+ loop do
35
+ @markov_table << tokens[i..(i+@gram-1)]
36
+ break if tokens[i+@gram-1] == TAIL
37
+ i += 1
38
+ end
39
+ end
40
+
41
+ def generate(limit)
42
+ # select all HEAD-started arrays
43
+ head_arrays = Array.new
44
+ @markov_table.each do |markov_array|
45
+ if markov_array[0] == HEAD
46
+ head_arrays << markov_array
47
+ end
48
+ end
49
+
50
+ # sample one HEAD-started array and create initial sentence based on that
51
+ sampled_array = head_arrays.sample
52
+ sentence = sampled_array[1] + sampled_array[2]
53
+
54
+ # start Markov chain until getting the TAIL flag
55
+ loop do
56
+ # select all arrays which can chain their head word to current tail of the sentence
57
+ chain_arrays = Array.new
58
+ @markov_table.each do |markov_array|
59
+ if markov_array[0] == sampled_array[2]
60
+ chain_arrays << markov_array
61
+ end
62
+ end
63
+
64
+ # finish here if we cannot continue to chain
65
+ break if chain_arrays.length == 0
66
+
67
+ # grow current sentence and check if it has the TAIL flag
68
+ sampled_array = chain_arrays.sample
69
+ if sampled_array[2] == TAIL
70
+ sentence += sampled_array[1]
71
+ break
72
+ else
73
+ concat_string = sampled_array[1] + sampled_array[2]
74
+ break if sentence.length + concat_string.length > limit
75
+ sentence += concat_string
76
+ end
77
+ end
78
+ sentence
79
+ end
80
+ end
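The `add` method above slides a window of `@gram` tokens over the tokenized string and stops once a window ends in `TAIL`. As a hedged illustration of the same table-building step, the sketch below uses `Array#each_cons` instead of the manual index loop. The method name `ngram_windows` is hypothetical, and `words` stands in for the output of `@tagger.wakati(string)`, so no dictionary is required to run it.

```ruby
# Sketch only: ngram_windows is an illustrative name, not part of the gem.
# `words` is an already tokenized sentence (a stand-in for the tagger output).
def ngram_windows(words, gram = 3, head: "[HEAD]", tail: "[TAIL]")
  tokens = [head, *words, tail]

  # Same guard as the original: skip input with fewer than 4 tokens.
  return [] if tokens.size < 4

  # each_cons yields every run of `gram` consecutive tokens; it yields
  # nothing when `gram` exceeds the token count, so the result is empty.
  tokens.each_cons(gram).to_a
end

table = ngram_windows(%w[ネロ は 少年 でし た])
table.each { |window| p window }
```

For this seven-token input with N = 3, the sketch produces five windows, the first starting with the head marker and the last ending with the tail marker, which matches what the index loop appends to `@markov_table`.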
data/lib/kusari/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module Kusari
2
+ VERSION = "0.1.0"
3
+ end
metadata ADDED
@@ -0,0 +1,113 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: kusari
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - takuti
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2015-12-09 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: igo-ruby
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: 0.1.5
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: 0.1.5
27
+ - !ruby/object:Gem::Dependency
28
+ name: bundler
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '1.10'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '1.10'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rake
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '10.0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '10.0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rspec
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ description:
70
+ email:
71
+ - k.takuti@gmail.com
72
+ executables: []
73
+ extensions: []
74
+ extra_rdoc_files: []
75
+ files:
76
+ - ".gitignore"
77
+ - ".rspec"
78
+ - ".travis.yml"
79
+ - CODE_OF_CONDUCT.md
80
+ - Gemfile
81
+ - README.md
82
+ - Rakefile
83
+ - bin/console
84
+ - bin/setup
85
+ - kusari.gemspec
86
+ - lib/kusari.rb
87
+ - lib/kusari/markov_sentence_generator.rb
88
+ - lib/kusari/version.rb
89
+ homepage: https://github.com/takuti/kusari
90
+ licenses:
91
+ - MIT
92
+ metadata: {}
93
+ post_install_message:
94
+ rdoc_options: []
95
+ require_paths:
96
+ - lib
97
+ required_ruby_version: !ruby/object:Gem::Requirement
98
+ requirements:
99
+ - - ">="
100
+ - !ruby/object:Gem::Version
101
+ version: '0'
102
+ required_rubygems_version: !ruby/object:Gem::Requirement
103
+ requirements:
104
+ - - ">="
105
+ - !ruby/object:Gem::Version
106
+ version: '0'
107
+ requirements: []
108
+ rubyforge_project:
109
+ rubygems_version: 2.4.5.1
110
+ signing_key:
111
+ specification_version: 4
112
+ summary: Japanese random sentence generator based on Markov chain.
113
+ test_files: []