ipsumizer 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: '084cbabf13cf81e3d22e97c28eeab3d7f7f333a3'
4
+ data.tar.gz: e297906ad93aa5b31392801cbe43d089c2cfd28d
5
+ SHA512:
6
+ metadata.gz: f38a89bc38a7fe5c7ede90f706ca58a9ca4f05e5ad965427783283e9293c26ca390603c43767f7c54549ef384fa38a5cecbe9bdee01d17f9efae598ffa8408de
7
+ data.tar.gz: 5a0fb3fb60e1ac1c0b7c88d7b5f9c1f6342976b174153164aee39ea61f4faa74764e2e8f1c502d56af4fb4299e83802fc380b9a5f859d99ad5fa78cd1aa086c1
@@ -0,0 +1,9 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
@@ -0,0 +1,4 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.3.0
4
+ before_install: gem install bundler -v 1.10.6
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in ipsumizer.gemspec
4
+ gemspec
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2017 dfhoughton
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,141 @@
1
+ # Ipsumizer
2
+
3
+ `Ipsumizer` generates random sentences based on the sample text it is initialized with.
4
+
5
+ ## Synopsis
6
+
7
+ ```ruby
8
+ require 'ipsumizer'
9
+
10
+ # sample text -- the Aeneid (only first lines shown)
11
+ text = <<-END
12
+ Arma virumque cano, Troiae qui primus ab oris
13
+ Italiam, fato profugus, Laviniaque venit
14
+ litora, multum ille et terris iactatus et alto
15
+ vi superum saevae memorem Iunonis ob iram;
16
+ multa quoque et bello passus, dum conderet urbem,
17
+ inferretque deos Latio, genus unde Latinum,
18
+ Albanique patres, atque altae moenia Romae.
19
+
20
+ Musa, mihi causas memora, quo numine laeso,
21
+ quidve dolens, regina deum tot volvere casus
22
+ insignem pietate virum, tot adire labores
23
+ impulerit. Tantaene animis caelestibus irae?
24
+
25
+ ...
26
+
27
+ END
28
+
29
+ ip = Ipsumizer.new text
30
+
31
+ ip.speak # "Illi ingere glaebae; faciat tempora magno miserae lustri non data dum excutitur, dona Cupido Tyrius artit, et prodimus urbine Byrsam, neque, Troesque corpora mortalis, foret Troiaque qui taliam rabili portuna viros inhumati unda dextra, donis?"
32
+ ip.speak # "'Rex erat pectore pontus, nostrata pelago pectore tot captas epulis Achates et patres, haere duce reconderat, ut supplexu Aeneas, Tyria fluctus antiquentem tum excidio Libyae: sic ore cadis agminem."
33
+ ip.speak # "Adsit lacrimisque matrem arrectisque ruunt et alas et placidam Iunonis?"
34
+
35
+ ip = Ipsumizer.new text, prefix: 2
36
+
37
+ ip.speak # "Hissalia ma cubem gerefix, Iuripsilis ocubvolut ciantensii, iturue perbata gerecurore Iulcelacriscenthomasumin lo.]"
38
+
39
+ ```
40
+
41
+ ## Description
42
+
43
+ `Ipsumizer` builds a "character language model" based on the sample *sentences* you give it. This means it discovers the probability
44
+ that a particular letter follows particular sequences of preceding letters within a sentence. "Letters" in this case include the
45
+ beginning of the sentence and the end. Given this information it can build a new sentence like so:
46
+
47
+ 1. Pick the starting character based on the frequency of starting letters in the sample sentences
48
+ 2. Pick the next character based on the frequency of second characters given the first
49
+ 3. Pick the third character based on the frequency of third characters given the preceding two
50
+ 4. etc.
51
+
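The counting behind these steps can be sketched roughly as follows. This is a minimal illustration, not the gem's code: `transition_counts` is a hypothetical helper, `prefix` is the maximum number of preceding characters to remember, `''` stands for the start of a sentence and `nil` for its end.

```ruby
# Hypothetical helper (not part of the gem's API): for every context of at
# most `prefix` preceding characters, count which character comes next.
def transition_counts(sentences, prefix)
  counts = Hash.new { |h, k| h[k] = Hash.new(0) }
  sentences.each do |s|
    key = ''                                   # '' marks the start of a sentence
    (s.chars + [nil]).each do |nxt|            # nil marks the end of a sentence
      counts[key][nxt] += 1
      break if nxt.nil?
      key = (key + nxt).chars.last(prefix).join # keep only the last `prefix` chars
    end
  end
  counts
end

counts = transition_counts(%w[aba abc], 1)
counts.each { |context, nexts| puts "#{context.inspect} => #{nexts.inspect}" }
```

Dividing each count by the total for its context gives the probabilities the description refers to.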
52
+ Once it reaches its "prefix" limit, it trims the preceding sequence to this length. The longer the sequence, the less creative
53
+ `Ipsumizer` will be and the more the generated text will resemble the sample. At some length it will stop generating any
54
+ novel sentences and will simply return some random sentence from the sample it ingested.
55
+
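To see why a long prefix stops producing novel sentences, here is a small self-contained sketch. It assumes a simplified model (a hash from context to the list of observed next characters) rather than the gem's internals, and `build_model` and `generate_sentence` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for the gem's model: context => observed next
# characters. Repeats in the list preserve frequencies; nil marks the end.
def build_model(sentences, prefix)
  model = Hash.new { |h, k| h[k] = [] }
  sentences.each do |s|
    key = ''
    (s.chars + [nil]).each do |nxt|
      model[key] << nxt
      break if nxt.nil?
      key = (key + nxt).chars.last(prefix).join
    end
  end
  model
end

# Walk the model from the empty context until the end marker is drawn.
def generate_sentence(model, prefix, rng = Random.new)
  out = ''
  key = ''
  loop do
    options = model.fetch(key)
    nxt = options[rng.rand(options.length)]
    break if nxt.nil?
    out += nxt
    key = (key + nxt).chars.last(prefix).join
  end
  out
end

sample = ['arma virumque cano', 'musa mihi causas memora']

# A prefix longer than any sentence leaves no room for recombination.
long_model = build_model(sample, 50)
3.times { puts generate_sentence(long_model, 50) }

# A short prefix lets contexts from different places blend together.
puts generate_sentence(build_model(sample, 2), 2)
```

With the long prefix every context is a unique sentence beginning, so each draw simply replays one of the sample sentences.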
56
+ ### Sentencing
57
+
58
+ If you give `Ipsumizer` an array as the first argument to its initializer it will assume these are the sample sentences.
59
+ Alternatively, if you give it a `String` it will use a regular expression to split this string into sample sentences.
60
+ As is generally the case with regular expression parsing, this won't always be as sophisticated as you might like. You can
61
+ either do the sentencing yourself beforehand or provide your own sentencing regexp, like so:
62
+
63
+ ```ruby
64
+ ip = Ipsumizer.new text, sentencer: /([.?!])/
65
+ ```
66
+
67
+ Note the capturing group around the expression. `Ipsumizer` assumes sentencing expressions capture their separators. It
68
+ will therefore look for these separators in the `split` output and glue them back onto their sentences.
69
+
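The effect of the capturing group can be seen with plain `String#split`, independent of `Ipsumizer`. The sample text below is made up for illustration.

```ruby
text = 'Arma virumque cano. Quo numine laeso? Tantaene irae!'

# Without a capturing group the separators are discarded.
without_group = text.split(/[.?!]/)

# With a capturing group, split returns bodies and separators alternately.
pieces = text.split(/([.?!])/)

# Pair each body with the separator that follows it and rejoin.
sentences = pieces.each_slice(2).map(&:join).map(&:strip)

p without_group
p pieces
p sentences
```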
70
+ ### Normalization
71
+
72
+ The only normalization `Ipsumizer` provides by default is the stripping of leading and trailing whitespace and the conversion of all
73
+ internal whitespace into ' '. If you want to remove macrons, for example, you need to do it to the text yourself.
74
+
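One possible way to remove macrons before handing text to `Ipsumizer` is to decompose the characters and drop the combining macron. This is only a sketch: `strip_macrons` is a hypothetical helper, not something the gem provides, and it relies on `String#unicode_normalize` from the Ruby standard library.

```ruby
# Hypothetical helper: decompose accented characters (NFD), delete the
# combining macron U+0304, then recompose whatever is left (NFC).
def strip_macrons(text)
  text.unicode_normalize(:nfd).delete("\u0304").unicode_normalize(:nfc)
end

with_macrons = "\u0100rma virumque can\u014D" # "Arma virumque cano" with macrons on A and o
puts strip_macrons(with_macrons)
```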
75
+ ## Methods
76
+
77
+ ### `initialize(text, sentencer: DEFAULT_SENTENCER, prefix: DEFAULT_PREFIX)`
78
+
79
+ The `text` parameter is either a string or an array of strings. In the former case, the string will be
80
+ split into sentences using the sentencer pattern. The `prefix` parameter is a non-negative integer. The bigger
81
+ the prefix, the more faithful generated sentences will be to the original.
82
+
83
+ ### `speak`
84
+
85
+ Generate a random sentence.
86
+
87
+ ### `sentence(text)`
88
+
89
+ Split the `text` parameter into sentences using the `Ipsumizer`'s sentence boundary pattern.
90
+
91
+ ### `sentencer`
92
+
93
+ Accessor for the sentencer pattern.
94
+
95
+ ### `prefix`
96
+
97
+ Accessor for the prefix length.
98
+
99
+ ## Defaults and Constants
100
+
101
+ ```ruby
102
+ DEFAULT_SENTENCER = Regexp.new %r{([.!?][.!?\p{Final_Punctuation}\p{Close_Punctuation}"\s]*)}
103
+
104
+ DEFAULT_PREFIX = 4
105
+ ```
106
+
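As a rough illustration of what the default pattern treats as a sentence boundary, the following applies the same regular expression with plain `String#split`, without loading the gem. The sample text is made up, and the pairing and stripping steps are a simplified approximation of what `Ipsumizer` does.

```ruby
sentencer = %r{([.!?][.!?\p{Final_Punctuation}\p{Close_Punctuation}"\s]*)}

text = 'He said "Go!" Then he left. (Quietly.) Why?'

# Closing quotes, closing brackets and whitespace after the terminal
# punctuation are captured as part of the separator.
pieces = text.split(sentencer)
sentences = pieces.each_slice(2).map(&:join).map(&:strip)

p pieces
p sentences
```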
107
+ ## Installation
108
+
109
+ Add this line to your application's Gemfile:
110
+
111
+ ```ruby
112
+ gem 'ipsumizer'
113
+ ```
114
+
115
+ And then execute:
116
+
117
+ $ bundle
118
+
119
+ Or install it yourself as:
120
+
121
+ $ gem install ipsumizer
122
+
123
+ ## Usage
124
+
125
+ See the Synopsis and Methods sections above.
126
+
127
+ ## Development
128
+
129
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
130
+
131
+ To install this gem onto your local machine, run `bundle exec rake install`.
132
+
133
+ ## Contributing
134
+
135
+ Bug reports and pull requests are welcome on GitHub at https://github.com/dfhoughton/ipsumizer.
136
+
137
+
138
+ ## License
139
+
140
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
141
+
@@ -0,0 +1,10 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+
4
+ Rake::TestTask.new(:test) do |t|
5
+ t.libs << "test"
6
+ t.libs << "lib"
7
+ t.test_files = FileList['test/**/*_test.rb']
8
+ end
9
+
10
+ task :default => :test
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "ipsumizer"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
@@ -0,0 +1,7 @@
1
+ #!/bin/bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+
5
+ bundle install
6
+
7
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,26 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'ipsumizer/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "ipsumizer"
8
+ spec.version = Ipsumizer::VERSION
9
+ spec.authors = ["dfhoughton"]
10
+ spec.email = ["dfhoughton@gmail.com"]
11
+
12
+ spec.summary = 'Generate lorem ipsum text from a text sample.'
13
+ spec.description = spec.summary
14
+ spec.homepage = "https://github.com/dfhoughton/ipsumizer"
15
+ spec.license = "MIT"
16
+
17
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
18
+ spec.bindir = "exe"
19
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
20
+ spec.require_paths = ["lib"]
21
+
22
+ spec.add_development_dependency "bundler", "~> 1.10"
23
+ spec.add_development_dependency "rake", "~> 10.0"
24
+ spec.add_development_dependency "minitest", "~> 5"
25
+ spec.add_development_dependency "byebug", "~> 0"
26
+ end
@@ -0,0 +1,99 @@
1
+ require "ipsumizer/version"
2
+
3
+ class Ipsumizer
4
+ attr_reader :prefix, :sentencer
5
+
6
+ DEFAULT_SENTENCER = Regexp.new %r{([.!?][.!?\p{Final_Punctuation}\p{Close_Punctuation}"\s]*)}
7
+
8
+ DEFAULT_PREFIX = 4
9
+
10
+ def initialize( sentences, prefix: DEFAULT_PREFIX, sentencer: DEFAULT_SENTENCER )
11
+ @prefix = prefix.to_i
12
+ fail "prefix length must be non-negative: #{prefix}" unless @prefix >= 0
13
+ @sentencer = sentencer
14
+ @transitions = {}
15
+ if sentences.is_a? String
16
+ sentences = sentence sentences
17
+ end
18
+ sentences = sentences.map{ |s| s.strip.gsub(/\s/, ' ') }.select{ |s| s =~ /\S/ }
19
+ fail "no sentences" unless sentences.any?
20
+ sentences.each do |s|
21
+ key = ''
22
+ i = 0
23
+ (0..s.length).each do |i|
24
+ nxt = if i == s.length
25
+ nil
26
+ else
27
+ s[i]
28
+ end
29
+ counts = @transitions[key] ||= {}
30
+ counts[nxt] = counts[nxt].to_i + 1
31
+ if nxt
32
+ key += nxt
33
+ if key.length > prefix
34
+ key = key[1..-1]
35
+ end
36
+ end
37
+ end
38
+ end
39
+ @transitions.each do |pfx, counts|
40
+ total = counts.values.reduce(:+)
41
+ probabilities = counts.values.map{ |n| n.to_r / total }
42
+ @transitions[pfx] = AliasTable.new( counts.keys, probabilities )
43
+ end
44
+ end
45
+
46
+ # split a text into sentences, where sentences are separated by one of '.', '!', and '?', optionally followed by
47
+ # double quotes or closing brackets and so forth
48
+ def sentence(text)
49
+ text.split(sentencer).each_slice(2).map{ |*bits| bits.join }.select{ |s| s =~ /\S/ }
50
+ end
51
+
52
+ # make a random sentence
53
+ def speak
54
+ pfx = s = ''
55
+ loop do
56
+ nxt = @transitions[pfx]&.generate
57
+ break unless nxt
58
+ s += nxt
59
+ pfx += nxt
60
+ if pfx.length > prefix
61
+ pfx = pfx[1..-1]
62
+ end
63
+ end
64
+ s
65
+ end
66
+
67
+ # copied in here and fixed because the gem was in a broken state
68
+ class AliasTable
69
+ def initialize(x_set, p_value)
70
+ @p_primary = p_value.map(&:to_r)
71
+ @x = x_set.clone.freeze
72
+ @alias = Array.new(@x.length)
73
+ parity = Rational(1, @x.length)
74
+ group = @p_primary.each_index.group_by { |i| @p_primary[i] <=> parity }
75
+ parity_set = group.fetch(0, [])
76
+ parity_set.each { |i| @p_primary[i] = Rational(1) }
77
+ deficit_set = group.fetch(-1, [])
78
+ surplus_set = group.fetch(1, [])
79
+ until deficit_set.empty?
80
+ deficit = deficit_set.pop
81
+ surplus = surplus_set.pop
82
+ @p_primary[surplus] -= parity - @p_primary[deficit]
83
+ @p_primary[deficit] /= parity
84
+ @alias[deficit] = @x[surplus]
85
+ if @p_primary[surplus] == parity
86
+ @p_primary[surplus] = Rational(1)
87
+ else
88
+ (@p_primary[surplus] < parity ? deficit_set : surplus_set) << surplus
89
+ end
90
+ end
91
+ end
92
+
93
+ def generate
94
+ column = rand(@x.length)
95
+ rand <= @p_primary[column] ? @x[column] : @alias[column]
96
+ end
97
+ end
98
+
99
+ end
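The nested `AliasTable` class uses the alias method for weighted sampling: each outcome gets a column that returns either the outcome itself or a single "alias" outcome. The sketch below is a textbook-style variant written for illustration, not the class above; `alias_columns` and `implied_distribution` are hypothetical names. It uses exact `Rational` arithmetic to check that the columns reproduce the input probabilities.

```ruby
# Build alias-method columns. Column i yields outcomes[i] with probability
# primary[i] and aliases[i] otherwise; columns are chosen uniformly.
def alias_columns(outcomes, probabilities)
  n = outcomes.length
  scaled = probabilities.map { |p| p.to_r * n } # average column height is 1
  aliases = Array.new(n)
  small, large = scaled.each_index.partition { |i| scaled[i] < 1 }
  until small.empty? || large.empty?
    s = small.pop
    l = large.pop
    aliases[s] = outcomes[l]
    scaled[l] -= 1 - scaled[s]                  # l donates what s lacks
    (scaled[l] < 1 ? small : large) << l
  end
  [scaled, aliases]
end

# Recover the distribution the columns actually encode.
def implied_distribution(outcomes, primary, aliases)
  n = outcomes.length
  dist = Hash.new(Rational(0))
  outcomes.each_index do |i|
    dist[outcomes[i]] += primary[i] / n
    dist[aliases[i]] += (1 - primary[i]) / n unless primary[i] == 1
  end
  dist
end

outcomes = %w[a b c]
probabilities = [Rational(1, 2), Rational(1, 3), Rational(1, 6)]
primary, aliases = alias_columns(outcomes, probabilities)
dist = implied_distribution(outcomes, primary, aliases)
p dist
```

Sampling then needs only one uniform column pick and one comparison, which is why the technique suits a generator that draws one character at a time.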
@@ -0,0 +1,3 @@
1
+ class Ipsumizer
2
+ VERSION = "0.1.0"
3
+ end
metadata ADDED
@@ -0,0 +1,111 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: ipsumizer
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - dfhoughton
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2017-12-11 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.10'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.10'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: minitest
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '5'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '5'
55
+ - !ruby/object:Gem::Dependency
56
+ name: byebug
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ description: Generate lorem ipsum text from a text sample.
70
+ email:
71
+ - dfhoughton@gmail.com
72
+ executables: []
73
+ extensions: []
74
+ extra_rdoc_files: []
75
+ files:
76
+ - ".gitignore"
77
+ - ".travis.yml"
78
+ - Gemfile
79
+ - LICENSE.txt
80
+ - README.md
81
+ - Rakefile
82
+ - bin/console
83
+ - bin/setup
84
+ - ipsumizer.gemspec
85
+ - lib/ipsumizer.rb
86
+ - lib/ipsumizer/version.rb
87
+ homepage: https://github.com/dfhoughton/ipsumizer
88
+ licenses:
89
+ - MIT
90
+ metadata: {}
91
+ post_install_message:
92
+ rdoc_options: []
93
+ require_paths:
94
+ - lib
95
+ required_ruby_version: !ruby/object:Gem::Requirement
96
+ requirements:
97
+ - - ">="
98
+ - !ruby/object:Gem::Version
99
+ version: '0'
100
+ required_rubygems_version: !ruby/object:Gem::Requirement
101
+ requirements:
102
+ - - ">="
103
+ - !ruby/object:Gem::Version
104
+ version: '0'
105
+ requirements: []
106
+ rubyforge_project:
107
+ rubygems_version: 2.6.13
108
+ signing_key:
109
+ specification_version: 4
110
+ summary: Generate lorem ipsum text from a text sample.
111
+ test_files: []