ipsumizer 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +9 -0
- data/.travis.yml +4 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +21 -0
- data/README.md +141 -0
- data/Rakefile +10 -0
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/ipsumizer.gemspec +26 -0
- data/lib/ipsumizer.rb +99 -0
- data/lib/ipsumizer/version.rb +3 -0
- metadata +111 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA1:
  metadata.gz: '084cbabf13cf81e3d22e97c28eeab3d7f7f333a3'
  data.tar.gz: e297906ad93aa5b31392801cbe43d089c2cfd28d
SHA512:
  metadata.gz: f38a89bc38a7fe5c7ede90f706ca58a9ca4f05e5ad965427783283e9293c26ca390603c43767f7c54549ef384fa38a5cecbe9bdee01d17f9efae598ffa8408de
  data.tar.gz: 5a0fb3fb60e1ac1c0b7c88d7b5f9c1f6342976b174153164aee39ea61f4faa74764e2e8f1c502d56af4fb4299e83802fc380b9a5f859d99ad5fa78cd1aa086c1
data/.gitignore
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2017 dfhoughton

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,141 @@
# Ipsumizer

`Ipsumizer` generates random sentences based on the sample text it is initialized with.

## Synopsis

```ruby
require 'ipsumizer'

# sample text -- the Aeneid (only first lines shown)
text = <<-END
Arma virumque cano, Troiae qui primus ab oris
Italiam, fato profugus, Laviniaque venit
litora, multum ille et terris iactatus et alto
vi superum saevae memorem Iunonis ob iram;
multa quoque et bello passus, dum conderet urbem,
inferretque deos Latio, genus unde Latinum,
Albanique patres, atque altae moenia Romae.

Musa, mihi causas memora, quo numine laeso,
quidve dolens, regina deum tot volvere casus
insignem pietate virum, tot adire labores
impulerit. Tantaene animis caelestibus irae?

...

END

ip = Ipsumizer.new text

ip.speak # "Illi ingere glaebae; faciat tempora magno miserae lustri non data dum excutitur, dona Cupido Tyrius artit, et prodimus urbine Byrsam, neque, Troesque corpora mortalis, foret Troiaque qui taliam rabili portuna viros inhumati unda dextra, donis?"
ip.speak # "'Rex erat pectore pontus, nostrata pelago pectore tot captas epulis Achates et patres, haere duce reconderat, ut supplexu Aeneas, Tyria fluctus antiquentem tum excidio Libyae: sic ore cadis agminem."
ip.speak # "Adsit lacrimisque matrem arrectisque ruunt et alas et placidam Iunonis?"

ip = Ipsumizer.new text, prefix: 2

ip.speak # "Hissalia ma cubem gerefix, Iuripsilis ocubvolut ciantensii, iturue perbata gerecurore Iulcelacriscenthomasumin lo.]"

```

## Description

`Ipsumizer` builds a "character language model" based on the sample *sentences* you give it. This means it discovers the probability
that a particular letter follows particular sequences of preceding letters within a sentence. "Letters" in this case include the
beginning of the sentence and the end. Given this information it can build a new sentence like so:

1. Pick the starting character based on the frequency of starting letters in the sample sentences
2. Pick the next character based on the frequency of second characters given the first
3. Pick the third character based on the frequency of third characters given the preceding two
4. etc.

Once it reaches its "prefix" limit, it trims the preceding sequence to this length. The longer the sequence, the less creative
`Ipsumizer` will be and the more the generated text will resemble the sample. At some length it will stop generating any
novel sentences and will simply return some random sentence from the sample it ingested.
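
The procedure described above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `trim`, `build_counts`, `weighted_pick`, and `generate` are hypothetical names that are not part of the `Ipsumizer` API, `nil` stands in for the end of a sentence, and a simple linear scan over the counts is used for sampling where the gem itself uses an alias table.

```ruby
# Hypothetical helpers for illustration; not part of the Ipsumizer API.

# Trim a context to its last `prefix` characters.
def trim(context, prefix)
  context.chars.last(prefix).join
end

# For every context of up to `prefix` characters, count how often each
# next character follows it. nil marks the end of a sentence.
def build_counts(sentences, prefix)
  counts = Hash.new { |h, k| h[k] = Hash.new(0) }
  sentences.each do |s|
    key = ''
    (s.chars + [nil]).each do |nxt|
      counts[key][nxt] += 1
      key = trim(key + nxt, prefix) if nxt
    end
  end
  counts
end

# Pick one key of `freqs` with probability proportional to its count
# (a linear scan, swapped in here for the gem's alias table).
def weighted_pick(freqs, rng)
  target = rng.rand(freqs.values.sum)
  freqs.each do |item, n|
    return item if target < n
    target -= n
  end
end

# Walk the model from the empty context until the end marker is drawn.
def generate(counts, prefix, rng = Random.new)
  out = ''
  key = ''
  loop do
    nxt = weighted_pick(counts.fetch(key), rng)
    break if nxt.nil?
    out += nxt
    key = trim(out, prefix)
  end
  out
end
```

With a single sample sentence and a prefix at least as long as that sentence, every context has exactly one continuation, so `generate` can only reproduce the sample. This is the degenerate case the paragraph above describes.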

### Sentencing

If you give `Ipsumizer` an array as the first argument to its initializer it will assume these are the sample sentences.
Alternatively, if you give it a `String` it will use a regular expression to split this string into sample sentences.
As is generally the case with regular expression parsing, this won't always be as sophisticated as you might like. You can
either do the sentencing yourself beforehand or provide your own sentencing regexp, like so:

```ruby
ip = Ipsumizer.new text, sentencer: /([.?!])/
```

Note the capturing group around the expression. `Ipsumizer` assumes sentencing expressions capture their separators. It
will therefore look for these in the `split` output and glue them back onto their sentences.
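
To see why the capturing group matters, here is a small standalone sketch using only core `String#split` and `Enumerable#each_slice`. The sample `text` and the variable names are assumptions chosen for illustration, and the final `strip` is only there to tidy the output.

```ruby
text = 'Arma virumque cano. Tantaene animis caelestibus irae? Musa, mihi causas memora!'

# Without a capturing group the separators are discarded.
bare = text.split(/[.?!]/)

# With a capturing group, split keeps each separator as its own element,
# so the result alternates between sentence bodies and separators.
pieces = text.split(/([.?!])/)

# Pair each body with the separator that follows it and join them again.
sentences = pieces.each_slice(2).map(&:join).map(&:strip)
# sentences == ["Arma virumque cano.",
#               "Tantaene animis caelestibus irae?",
#               "Musa, mihi causas memora!"]
```

In `bare` the terminal punctuation is gone. In `sentences` it has been glued back on, so each sample sentence still ends with its own punctuation.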

### Normalization

The only normalization `Ipsumizer` provides by default is the stripping of whitespace and the conversion of all
internal whitespace into ' '. If you want to remove macrons, for example, you need to do it to the text yourself.
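
One possible way to do the macron removal mentioned above, before handing the text to `Ipsumizer`, is sketched here. `strip_macrons` is a hypothetical helper, not something the gem provides. It relies on core `String#unicode_normalize` and assumes the text is in a Unicode encoding such as UTF-8.

```ruby
# Hypothetical helper, not part of Ipsumizer.
def strip_macrons(text)
  # NFD decomposes a letter such as U+014D (o with macron) into "o" plus
  # the combining macron U+0304. Delete that mark, then recompose to NFC.
  text.unicode_normalize(:nfd).delete("\u0304").unicode_normalize(:nfc)
end

strip_macrons("can\u014D, Tr\u014Diae qu\u012B pr\u012Bmus")
# returns "cano, Troiae qui primus"
```

Only the macron is removed. Other diacritics survive the round trip because the string is recomposed afterwards.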

## Methods

### `initialize(text, sentencer: DEFAULT_SENTENCER, prefix: DEFAULT_PREFIX)`

The `text` parameter is either a string or an array of strings. In the former case, the string will be
split into sentences using the sentencer pattern. The `prefix` parameter is a positive integer. The bigger
the prefix, the more faithful generated sentences will be to the original.

### `speak`

Generate a random sentence.

### `sentence(text)`

Split the `text` parameter into sentences using the `Ipsumizer`'s sentence boundary pattern.

### `sentencer`

Accessor for the sentencer pattern.

### `prefix`

Accessor for the prefix length.

## Defaults and Constants

```ruby
DEFAULT_SENTENCER = Regexp.new %r{([.!?][.!?\p{Final_Punctuation}\p{Close_Punctuation}"\s]*)}

DEFAULT_PREFIX = 4
```

## Installation

Add this line to your application's Gemfile:

```ruby
gem 'ipsumizer'
```

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install ipsumizer

## Usage

TODO: Write usage instructions here

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run `bundle exec rake install`.

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/dfhoughton/ipsumizer.


## License

The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).

data/Rakefile
ADDED
data/bin/console
ADDED
@@ -0,0 +1,14 @@
#!/usr/bin/env ruby

require "bundler/setup"
require "ipsumizer"

# You can add fixtures and/or initialization code here to make experimenting
# with your gem easier. You can also use a different console, if you like.

# (If you use this, don't forget to add pry to your Gemfile!)
# require "pry"
# Pry.start

require "irb"
IRB.start
data/bin/setup
ADDED
data/ipsumizer.gemspec
ADDED
@@ -0,0 +1,26 @@
# coding: utf-8
lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'ipsumizer/version'

Gem::Specification.new do |spec|
  spec.name = "ipsumizer"
  spec.version = Ipsumizer::VERSION
  spec.authors = ["dfhoughton"]
  spec.email = ["dfhoughton@gmail.com"]

  spec.summary = 'Generate lorem ipsum text from a text sample.'
  spec.description = spec.summary
  spec.homepage = "https://github.com/dfhoughton/ipsumizer"
  spec.license = "MIT"

  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
  spec.bindir = "exe"
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

  spec.add_development_dependency "bundler", "~> 1.10"
  spec.add_development_dependency "rake", "~> 10.0"
  spec.add_development_dependency "minitest", "~> 5"
  spec.add_development_dependency "byebug", "~> 0"
end
data/lib/ipsumizer.rb
ADDED
@@ -0,0 +1,99 @@
require "ipsumizer/version"

class Ipsumizer
  attr_reader :prefix, :sentencer

  DEFAULT_SENTENCER = Regexp.new %r{([.!?][.!?\p{Final_Punctuation}\p{Close_Punctuation}"\s]*)}

  DEFAULT_PREFIX = 4

  def initialize( sentences, prefix: DEFAULT_PREFIX, sentencer: DEFAULT_SENTENCER )
    @prefix = prefix.to_i
    fail "prefix length must be positive: #{prefix}" unless @prefix > 0
    @sentencer = sentencer
    @transitions = {}
    if sentences.is_a? String
      sentences = sentence sentences
    end
    sentences = sentences.map{ |s| s.strip.gsub(/\s/, ' ') }.select{ |s| s =~ /\S/ }
    fail "no sentences" unless sentences.any?
    sentences.each do |s|
      key = ''
      i = 0
      (0..s.length).each do |i|
        nxt = if i == s.length
          nil
        else
          s[i]
        end
        counts = @transitions[key] ||= {}
        counts[nxt] = counts[nxt].to_i + 1
        if nxt
          key += nxt
          if key.length > @prefix
            key = key[1..-1]
          end
        end
      end
    end
    @transitions.each do |pfx, counts|
      total = counts.values.reduce(:+)
      probabilities = counts.values.map{ |n| n.to_r / total }
      @transitions[pfx] = AliasTable.new( counts.keys, probabilities )
    end
  end

  # split a text into sentences, where sentences are separated by one of '.', '!', and '?', optionally followed by
  # double quotes or closing brackets and so forth
  def sentence(text)
    text.split(sentencer).each_slice(2).map{ |*bits| bits.join }.select{ |s| s =~ /\S/ }
  end

  # make a random sentence
  def speak
    pfx = s = ''
    loop do
      nxt = @transitions[pfx]&.generate
      break unless nxt
      s += nxt
      pfx += nxt
      if pfx.length > prefix
        pfx = pfx[1..-1]
      end
    end
    s
  end

  # copied in here and fixed because the gem was in a broken state
  class AliasTable
    def initialize(x_set, p_value)
      @p_primary = p_value.map(&:to_r)
      @x = x_set.clone.freeze
      @alias = Array.new(@x.length)
      parity = Rational(1, @x.length)
      group = @p_primary.each_index.group_by { |i| @p_primary[i] <=> parity }
      parity_set = group.fetch(0, [])
      parity_set.each { |i| @p_primary[i] = Rational(1) }
      deficit_set = group.fetch(-1, [])
      surplus_set = group.fetch(1, [])
      until deficit_set.empty?
        deficit = deficit_set.pop
        surplus = surplus_set.pop
        @p_primary[surplus] -= parity - @p_primary[deficit]
        @p_primary[deficit] /= parity
        @alias[deficit] = @x[surplus]
        if @p_primary[surplus] == parity
          @p_primary[surplus] = Rational(1)
        else
          (@p_primary[surplus] < parity ? deficit_set : surplus_set) << surplus
        end
      end
    end

    def generate
      column = rand(@x.length)
      rand <= @p_primary[column] ? @x[column] : @alias[column]
    end
  end

end
metadata
ADDED
@@ -0,0 +1,111 @@
--- !ruby/object:Gem::Specification
name: ipsumizer
version: !ruby/object:Gem::Version
  version: 0.1.0
platform: ruby
authors:
- dfhoughton
autorequire:
bindir: exe
cert_chain: []
date: 2017-12-11 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: bundler
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.10'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.10'
- !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
- !ruby/object:Gem::Dependency
  name: minitest
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '5'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '5'
- !ruby/object:Gem::Dependency
  name: byebug
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '0'
description: Generate lorem ipsum text from a text sample.
email:
- dfhoughton@gmail.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- ".gitignore"
- ".travis.yml"
- Gemfile
- LICENSE.txt
- README.md
- Rakefile
- bin/console
- bin/setup
- ipsumizer.gemspec
- lib/ipsumizer.rb
- lib/ipsumizer/version.rb
homepage: https://github.com/dfhoughton/ipsumizer
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 2.6.13
signing_key:
specification_version: 4
summary: Generate lorem ipsum text from a text sample.
test_files: []