magic_cloud 0.0.2

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: b7ed426c54152c66311804b851d214ff6973a37d
4
+ data.tar.gz: 45354e6f6d66aa759bb14ddf19c273b3f03a9687
5
+ SHA512:
6
+ metadata.gz: 85a631a786268da87479796fa4b61e40e0ed043da562662717f92c120484321a362d37baf231eece4790b629cf5a0c76ed1aebc7dd4825d7ecf675353dea0d21
7
+ data.tar.gz: 3e497cb23c8068ab2ac1761e6298c6e6ffa649b0776df1fe113224c682e841e7c1653800f49cf3a9dfaf130fe07e6921212d2fa1285acfe0111d88e73a50e786
data/LICENSE.txt ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2014-15 Victor 'Zverok' Shepelev
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,142 @@
1
+ MagicCloud - simple pretty word cloud for Ruby
2
+ ==============================================
3
+
4
+ **MagicCloud** is a simple, pure-Ruby library for making pretty
5
+ [Wordle](http://www.wordle.net/)-like clouds. It uses RMagick as its graphics
6
+ backend.
7
+
8
+ Usage
9
+ -----
10
+
11
+ ```ruby
12
+ words = [
13
+ ['test', 50],
14
+ ['me', 40],
15
+ ['tenderly', 30],
16
+ # ....
17
+ ]
18
+ cloud = MagicCloud::Cloud.new(words, rotate: :free, scale: :log)
19
+ ```
20
+
21
+ Or from the command line:
22
+
23
+ ```
24
+ ./bin/magic_cloud --textfile samples/cat-in-the-hat.txt -f test.png --rotate free --scale log
25
+ ```
26
+
27
+ Resulting in:
28
+
29
+ <img src="https://raw.github.com/zverok/magic_cloud/master/samples/cat.png" alt="Sample word cloud"/>
30
+
31
+ Installation
32
+ ------------
33
+
34
+ ```
35
+ gem install magic_cloud
36
+ ```
37
+
38
+ RMagick is a requirement, and it needs compilation, so you may expect
39
+ problems in non-compiler-friendly environments (such as Windows).
40
+
41
+ Origins
42
+ -------
43
+
44
+ At first, it was a straightforward port of [d3.layout.cloud.js](https://github.com/jasondavies/d3-cloud)
45
+ by Jason Davies, which, I assume, is an implementation of the Wordle algorithm.
46
+
47
+ Then there was a major refactoring, to make the code conform to Ruby
48
+ standards (and be understandable to poor dumb me).
49
+
50
+ Then the collision algorithm was rewritten from scratch.
51
+
52
+ And now we are here.
53
+
54
+ References:
55
+ * https://github.com/jasondavies/d3-cloud
56
+ * http://stackoverflow.com/questions/342687/algorithm-to-implement-a-word-cloud-like-wordle
57
+ * http://static.mrfeinberg.com/bv_ch03.pdf
58
+
59
+ Performance
60
+ -----------
61
+
62
+ It's reasonable for me. On my small Thinkpad E330, a 50-word cloud
63
+ image of size 700×500 is typically generated in <3 sec. It's not that cool,
64
+ yet not long enough for you to fall asleep. The time of cloud making depends
65
+ on word count, image size (it's faster to find a place for all words
66
+ on a larger image) and the rotation algorithm used (vertical+horizontal words
67
+ only is significantly faster - and, in my opinion, better looking - than
68
+ a "cool" free-rotated-words cloud).
69
+
70
+ The major performance eater is perfect collision detection, which a Wordle-like
71
+ cloud needs. MagicCloud for now uses a really dumb algorithm with some
72
+ not-so-dumb optimizations. You can look into
73
+ `lib/magic_cloud/collision_board.rb` - everything that can be optimized is
74
+ there, especially in the `CollisionBoard#collides?` method.
75
+
76
+ I assume, for example, that a naive rewrite of the code there as a C
77
+ extension could help significantly.
78
+
79
+ Another possible way is adding some smart tricks that eliminate as many
80
+ pixel-by-pixel comparisons as possible (some already implemented are the
81
+ criss-cross intersection check and memoization of the last crossed sprite).
82
+
83
+ Memory effectiveness
84
+ --------------------
85
+
86
+ Basically: it's not.
87
+
88
+ Plain Ruby arrays are used to represent collision bitmasks (each array
89
+ member stands for 1 bit), so, for example, a 700×500 pixel cloud will require
90
+ a collision board of size `700*500` (i.e. 350k array items for the board alone, and
91
+ slightly fewer for all sprites).
92
+
93
+ It would be wise to use some packing (considering each Ruby Fixnum can
94
+ represent not one, but a whole 32 bits). Unfortunately, all bit array
95
+ libraries I've tried cause a major slowdown of cloud computation.
96
+ With, say, 50 words we'll have literally millions of
97
+ `bitmask#[]` and `bitmask#[]=` operations, so even methods
98
+ like `Fixnum#&` and `Fixnum#|` (typically used for bit array representation)
99
+ cause significant overhead.
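To make the trade-off above concrete, here is a minimal sketch comparing a one-item-per-bit array with a packed variant that stores 32 bits per Integer. The class names `PlainBits` and `PackedBits` are hypothetical and do not exist in MagicCloud; the sketch only illustrates why packed access costs more operations per read and write.

```ruby
# Hypothetical illustration only; neither class is part of MagicCloud.

# One array item per bit: wasteful, but each access is a single index operation.
class PlainBits
  def initialize(size)
    @bits = [0] * size
  end

  def [](i)
    @bits[i] != 0
  end

  def set(i)
    @bits[i] = 1
  end

  def items
    @bits.size
  end
end

# 32 bits per Integer: far fewer items, but every access needs
# a division, a modulo, a shift and a bitwise operation.
class PackedBits
  WORD = 32

  def initialize(size)
    @words = [0] * ((size + WORD - 1) / WORD)
  end

  def [](i)
    ((@words[i / WORD] >> (i % WORD)) & 1) == 1
  end

  def set(i)
    @words[i / WORD] |= (1 << (i % WORD))
  end

  def items
    @words.size
  end
end

plain  = PlainBits.new(700 * 500)
packed = PackedBits.new(700 * 500)
[0, 31, 32, 349_999].each { |i| plain.set(i); packed.set(i) }

puts plain.items   # 350000
puts packed.items  # 10938
```

The packed board needs about 32 times fewer array items, at the price of extra arithmetic on each of the millions of lookups mentioned above.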
100
+
101
+ Configuration
102
+ -------------
103
+
104
+ ```ruby
105
+ cloud = MagicCloud::Cloud.new(words, palette: palette, rotate: rotate)
106
+ ```
107
+
108
+ * `:palette` (default is `:category20`):
109
+ * `:category10`, `:category20`, ... - from [d3](https://github.com/mbostock/d3/wiki/Ordinal-Scales#categorical-colors)
110
+ * `[array, of, colors]` - each color should be a hex color, or any other RMagick color string (see "Color names" at http://www.imagemagick.org/RMagick/doc/imusage.html)
111
+ * any lambda, accepting `(word, index)` and returning color string
112
+ * any object, responding to `color(word, index)` - so, you can make color
113
+ depend on tag text, not only on its number in tags list
114
+ * `:rotate` - rotation algorithm:
115
+ * `:square` (only horizontal and vertical words) - it's the default
116
+ * `:none` - all words are horizontal (looks boooring)
117
+ * `:free` - any word rotation angle, looks cool, but not very readable
118
+ and slower to layout
119
+ * `[array, of, angles]` - each possible angle should be a number in 0..360
120
+ * any lambda, accepting `(word, index)` and returning 0..360
121
+ * any object, responding to `rotate(word, index)` and returning 0..360
122
+ * `:scale` - how word sizes would be scaled to fit into (FONT_MIN..FONT_MAX) range:
123
+ * `:no` - no scaling, all word sizes are treated as is;
124
+ * `:linear` - linear scaling;
125
+ * `:log` - logarithmic scaling (default);
126
+ * `:sqrt` - square root scaling;
127
+ * `:font_family` (Impact is default).
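As a hedged illustration of the lambda and object forms listed above, the sketch below defines a palette lambda and a rotator object. The class name `TopWordsHorizontal` and the colors are made up for this example; only the `(word, index)` calling convention comes from the option list.

```ruby
# A palette lambda: receives (word, index) and returns a color string.
# Here longer words get a darker grey; the threshold is arbitrary.
palette = ->(word, _index) { word.size > 5 ? '#333333' : '#999999' }

# A rotator object: anything responding to rotate(word, index).
# Hypothetical class: the first three words stay horizontal,
# all others are drawn vertically.
class TopWordsHorizontal
  def rotate(_word, index)
    index < 3 ? 0 : 90
  end
end

rotator = TopWordsHorizontal.new

puts palette.call('tenderly', 2)    # #333333
puts rotator.rotate('tenderly', 5)  # 90

# With the gem loaded, these would be passed as:
#   MagicCloud::Cloud.new(words, palette: palette, rotate: rotator)
```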
128
+
129
+ Current state
130
+ -------------
131
+
132
+ This library is extracted from a real-life project. It should be
133
+ pretty stable (apart from bugs introduced during extraction and gemification).
134
+
135
+ What it really lacks for now, is thorough (or any) testing, and
136
+ some more configuration options.
137
+
138
+ Also, while core algorithms (collision_board.rb, layouter.rb) are pretty
139
+ accurately written and documented, "wrapping code" (options parsing and
140
+ so on) is a bit more chaotic - it's subject to refactoring and cleanup.
141
+
142
+ All feedback, usage examples, bug reports and feature requests are appreciated!
data/bin/magic_cloud ADDED
@@ -0,0 +1,86 @@
1
+ #!/usr/bin/env ruby
2
+ #encoding: utf-8
3
+ require 'rubygems'
4
+ require 'slop'
5
+
6
+ $:.unshift 'lib'
7
+ require 'magic_cloud'
8
+
9
+ opts = Slop.parse(help: true, strict: true) do
10
+ on :width=, "Cloud width, pixels (default 960)", as: Integer, default: 960
11
+ on :height=, "Cloud height, pixels (default 600)", as: Integer, default: 600
12
+
13
+ on :words=,
14
+ "List of words with weights, comma-separated, weight after colon, like 'cat:40,dog:30,eat:20'",
15
+ as: Array
16
+ on :textfile=, "Path to file with some text, magic_cloud will calculate frequencies of words in this text and draw the cloud of them"
17
+ on :maxwords=, "Max words to make a cloud", as: Integer, default: 100
18
+
19
+ on :palette=, "Palette name, see README for details"
20
+ on :rotate=, "Rotation algo, see README for details"
21
+ on :scale=, "Scaling algo, see README for details"
22
+ on :font_family=, "Font family (Impact by default)"
23
+
24
+ on :f, :file=, "Output file path", required: true
25
+
26
+ on :stats, "Output debug log and stats (different operations count while looking for places for words)"
27
+ on :profile, "Run profiler"
28
+ end
29
+
30
+ include MagicCloud
31
+
32
+ Debug.logger.level = Logger::INFO if opts.stats?
33
+
34
+ words = case
35
+ when opts[:words]
36
+ opts[:words].
37
+ map{|w| w.split(':')}.
38
+ map{|w,c| [w, c.to_i]}.
39
+ each{|w,c| c.zero? and fail(ArgumentError, "Count for word #{w} not defined")}
40
+ when opts[:textfile]
41
+ WORD_SEPARATORS = /[\s\u3031-\u3035\u309b\u309c\u30a0\u30fc\uff70]+/
42
+ STOPWORDS = /^(i|me|my|myself|we|us|our|ours|ourselves|you|your|yours|yourself|yourselves|he|him|his|himself|she|her|hers|herself|it|its|itself|they|them|their|theirs|themselves|what|which|who|whom|whose|this|that|these|those|am|is|are|was|were|be|been|being|have|has|had|having|do|does|did|doing|will|would|should|can|could|ought|i'm|you're|he's|she's|it's|we're|they're|i've|you've|we've|they've|i'd|you'd|he'd|she'd|we'd|they'd|i'll|you'll|he'll|she'll|we'll|they'll|isn't|aren't|wasn't|weren't|hasn't|haven't|hadn't|doesn't|don't|didn't|won't|wouldn't|shan't|shouldn't|can't|cannot|couldn't|mustn't|let's|that's|who's|what's|here's|there's|when's|where's|why's|how's|a|an|the|and|but|if|or|because|as|until|while|of|at|by|for|with|about|against|between|into|through|during|before|after|above|below|to|from|up|upon|down|in|out|on|off|over|under|again|further|then|once|here|there|when|where|why|how|all|any|both|each|few|more|most|other|some|such|no|nor|not|only|own|same|so|than|too|very|say|says|said|shall)$/
43
+
44
+ File.read(opts[:textfile]).
45
+ split(WORD_SEPARATORS).
46
+ map{|word| word.gsub(/[[:punct:]]/, '')}.
47
+ map(&:downcase).
48
+ reject{|word| word =~ STOPWORDS}.
49
+ group_by{|w| w}.
50
+ map{|word, group| [word, group.size]}
51
+ else
52
+ fail ArgumentError, "You should provide either --words or --textfile option, where can I take words for cloud?.."
53
+ end
54
+
55
+ words = words.sort_by(&:last).reverse.first(opts[:maxwords])
56
+
57
+ options = {
58
+ palette: opts[:palette] && opts[:palette].to_sym,
59
+ rotate: opts[:rotate] && opts[:rotate].to_sym,
60
+ scale: opts[:scale] && opts[:scale].to_sym,
61
+ font_family: opts[:font_family]
62
+ }.reject{|k, v| v.nil?}
63
+
64
+ if opts.profile?
65
+ require 'ruby-prof'
66
+ RubyProf.start
67
+ end
68
+
69
+ start = Time.now
70
+
71
+ cloud = Cloud.new(words, options)
72
+ img = cloud.draw(opts[:width], opts[:height])
73
+
74
+ if opts.profile?
75
+ result = RubyProf.stop
76
+ require 'fileutils'
77
+ FileUtils.mkdir_p 'profile'
78
+ RubyProf::GraphHtmlPrinter.new(result).
79
+ print(File.open('profile/result.html', 'w'))
80
+ end
81
+
82
+ img.write(opts[:file])
83
+
84
+ p Debug.stats if opts.stats?
85
+
86
+ puts "Ready in %.2f seconds" % (Time.now - start)
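The `--textfile` branch of the script above boils down to a small word-frequency pipeline. The following standalone sketch reproduces it on an inline string with a deliberately shortened stopword list. The `empty?` check is an assumption added here (tokens made only of punctuation would otherwise be counted as empty words); it is not in the original script.

```ruby
# Shortened stopword list, for illustration only.
STOPWORDS = /^(a|an|the|and|in|on|is|it)$/

text = "The cat in the hat. The Cat sat on it!"

counts = text
  .split(/\s+/)
  .map { |word| word.gsub(/[[:punct:]]/, '').downcase }
  .reject { |word| word.empty? || word =~ STOPWORDS } # empty? check is an added assumption
  .group_by { |word| word }
  .map { |word, group| [word, group.size] }
  .sort_by(&:last)
  .reverse

p counts.first  # ["cat", 2]
```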
@@ -0,0 +1,7 @@
1
+ # encoding: utf-8
2
+
3
+ # Wordle-like word cloud main module
4
+ module MagicCloud
5
+ end
6
+
7
+ require_relative 'magic_cloud/cloud'
@@ -0,0 +1,33 @@
1
+ # encoding: utf-8
2
+ module MagicCloud
3
+ # Dead simple 2-dimensional "bit matrix", storing 1s and 0s.
4
+ # Not memory effective at all, but the fastest pure-Ruby solution
5
+ # I've tried.
6
+ class BitMatrix
7
+ def initialize(width, height)
8
+ @width, @height = width, height
9
+ @bits = [0] * height*width
10
+ end
11
+
12
+ attr_reader :bits, :width, :height
13
+
14
+ def put(x, y, px = 1)
15
+ x < width or fail("#{x} outside matrix: #{width}")
16
+ y < height or fail("#{y} outside matrix: #{height}")
17
+
18
+ bits[y*@width + x] = 1 unless px == 0 # It's faster with unless
19
+ end
20
+
21
+ # returns true/false
22
+ # FIXME: maybe #put should also accept true/false
23
+ def at(x, y)
24
+ bits[y*@width + x] != 0 # faster than .zero?
25
+ end
26
+
27
+ def dump
28
+ (0...height).map{|y|
29
+ (0...width).map{|x| at(x, y) ? ' ' : 'x'}.join
30
+ }.join("\n")
31
+ end
32
+ end
33
+ end
@@ -0,0 +1,93 @@
1
+ # encoding: utf-8
2
+ require 'RMagick'
3
+
4
+ module MagicCloud
5
+ # Thin wrapper around RMagick, encapsulating ALL the real drawing.
6
+ # As it's the only class that "knows" about the underlying graphics library,
7
+ # it should be possible to replace it with another canvas with same
8
+ # interface, not using RMagick.
9
+ class Canvas
10
+ def initialize(w, h, back = 'transparent')
11
+ @width, @height = w, h
12
+ @internal = Magick::Image.new(w, h){|i| i.background_color = back}
13
+ end
14
+
15
+ attr_reader :internal, :width, :height
16
+
17
+ RADIANS = Math::PI / 180
18
+
19
+ def draw_text(text, options = {})
20
+ draw = Magick::Draw.new # FIXME: is it necessary every time?
21
+
22
+ x = options.fetch(:x, 0)
23
+ y = options.fetch(:y, 0)
24
+ rotate = options.fetch(:rotate, 0)
25
+
26
+ set_text_options(draw, options)
27
+
28
+ rect = _measure_text(draw, text, rotate)
29
+
30
+ draw.
31
+ translate(x + rect.width/2, y + rect.height/2).
32
+ rotate(rotate).
33
+ translate(0, rect.height/8). # RMagick text_align seems really weird
34
+ text(0, 0, text).
35
+ draw(@internal)
36
+
37
+ rect
38
+ end
39
+
40
+ def measure_text(text, options)
41
+ draw = Magick::Draw.new
42
+ set_text_options(draw, options)
43
+ _measure_text(draw, text, options.fetch(:rotate, 0))
44
+ end
45
+
46
+ def pixels(x, y, w, h)
47
+ @internal.export_pixels(x, y, w, h, 'RGBA')
48
+ end
49
+
50
+ # rubocop:disable TrivialAccessors
51
+ def render
52
+ @internal
53
+ end
54
+ # rubocop:enable TrivialAccessors
55
+
56
+ private
57
+
58
+ def set_text_options(draw, options)
59
+ draw.font_family = options[:font_family]
60
+ draw.font_weight = Magick::NormalWeight
61
+ draw.font_style = Magick::NormalStyle
62
+
63
+ draw.pointsize = options[:font_size]
64
+ draw.fill_color(options[:color])
65
+ draw.gravity(Magick::CenterGravity)
66
+ draw.text_align(Magick::CenterAlign)
67
+ end
68
+
69
+ def _measure_text(draw, text, rotate)
70
+ metrics = draw.get_type_metrics('"' + text + 'm"')
71
+ w, h = rotated_metrics(metrics.width, metrics.height, rotate)
72
+
73
+ Rect.new(0, 0, w, h)
74
+ end
75
+
76
+ def rotated_metrics(w, h, degrees)
77
+ radians = degrees * Math::PI / 180
78
+
79
+ # FIXME: not too clear, just straightforward from d3.cloud
80
+ sr = Math.sin(radians)
81
+ cr = Math.cos(radians)
82
+ wcr = w * cr
83
+ wsr = w * sr
84
+ hcr = h * cr
85
+ hsr = h * sr
86
+
87
+ w = [(wcr + hsr).abs, (wcr - hsr).abs].max.to_i
88
+ h = [(wsr + hcr).abs, (wsr - hcr).abs].max.to_i
89
+
90
+ [w, h]
91
+ end
92
+ end
93
+ end
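The bounding-box arithmetic in `rotated_metrics` above can be tried in isolation. This is a standalone copy of the same formula under the hypothetical name `rotated_size` (the original method is private to `Canvas`), applied to a 100×20 rectangle.

```ruby
# Standalone copy of the rotated bounding-box arithmetic; the name
# rotated_size is hypothetical and not part of the library.
def rotated_size(w, h, degrees)
  radians = degrees * Math::PI / 180
  sr = Math.sin(radians)
  cr = Math.cos(radians)

  [
    [(w * cr + h * sr).abs, (w * cr - h * sr).abs].max.to_i,
    [(w * sr + h * cr).abs, (w * sr - h * cr).abs].max.to_i
  ]
end

p rotated_size(100, 20, 0)   # [100, 20]
p rotated_size(100, 20, 90)  # [20, 100]
p rotated_size(100, 20, 45)  # [84, 84]
```

Taking the maximum over both sign combinations keeps the box large enough for rotation in either direction, and `to_i` truncates the result to whole pixels.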
@@ -0,0 +1,137 @@
1
+ # encoding: utf-8
2
+ require_relative './rect'
3
+ require_relative './canvas'
4
+ require_relative './palettes'
5
+
6
+ require_relative './word'
7
+
8
+ require_relative './layouter'
9
+ require_relative './spriter'
10
+
11
+ require_relative './debug'
12
+
13
+ module MagicCloud
14
+ # Main word-cloud class. Takes words with sizes, returns image
15
+ class Cloud
16
+ def initialize(words, options = {})
17
+ @words = words.sort_by(&:last).reverse
18
+ @options = options
19
+ @scaler = make_scaler(words, options[:scale] || :log)
20
+ @rotator = make_rotator(options[:rotate] || :square)
21
+ @palette = make_palette(options[:palette] || :default)
22
+ end
23
+
24
+ DEFAULT_FAMILY = 'Impact'
25
+
26
+ def draw(width, height)
27
+ # FIXME: do it in init, so that specs would be happy
28
+ shapes = @words.each_with_index.map{|(word, size), i|
29
+ Word.new(
30
+ word,
31
+ font_family: @options[:font_family] || DEFAULT_FAMILY,
32
+ font_size: scaler.call(word, size, i),
33
+ color: palette.call(word, i),
34
+ rotate: rotator.call(word, i)
35
+ )
36
+ }
37
+
38
+ Debug.reset!
39
+
40
+ spriter = Spriter.new
41
+ spriter.make_sprites!(shapes)
42
+
43
+ layouter = Layouter.new(width, height)
44
+ visible = layouter.layout!(shapes)
45
+
46
+ canvas = Canvas.new(width, height, 'white')
47
+ visible.each{|sh| sh.draw(canvas)}
48
+
49
+ canvas.render
50
+ end
51
+
52
+ private
53
+
54
+ attr_reader :palette, :rotator, :scaler
55
+
56
+ # rubocop:disable Metrics/MethodLength, Metrics/CyclomaticComplexity,Metrics/AbcSize
57
+ def make_palette(source)
58
+ case source
59
+ when :default
60
+ make_const_palette(:category20)
61
+ when Symbol
62
+ make_const_palette(source)
63
+ when Array
64
+ ->(_, index){source[index % source.size]}
65
+ when Proc
66
+ source
67
+ when ->(s){s.respond_to?(:color)}
68
+ ->(word, index){source.color(word, index)}
69
+ else
70
+ fail ArgumentError, "Unknown palette: #{source.inspect}"
71
+ end
72
+ end
73
+
74
+ def make_const_palette(sym)
75
+ palette = PALETTES[sym] or
76
+ fail(ArgumentError, "Unknown palette: #{sym.inspect}")
77
+
78
+ ->(_, index){palette[index % palette.size]}
79
+ end
80
+
81
+ def make_rotator(source)
82
+ case source
83
+ when :none
84
+ ->(*){0}
85
+ when :square
86
+ ->(*){
87
+ (rand * 2).to_i * 90
88
+ }
89
+ when :free
90
+ ->(*){
91
+ (((rand * 6) - 3) * 30).round
92
+ }
93
+ when Array
94
+ ->(*){
95
+ source.sample
96
+ }
97
+ when Proc
98
+ source
99
+ when ->(s){s.respond_to?(:rotate)}
100
+ ->(word, index){source.rotate(word, index)}
101
+ else
102
+ fail ArgumentError, "Unknown rotation algo: #{source.inspect}"
103
+ end
104
+ end
105
+
106
+ # FIXME: should be options too
107
+ FONT_MIN = 10
108
+ FONT_MAX = 100
109
+
110
+ def make_scaler(words, algo)
111
+ norm =
112
+ case algo
113
+ when :no
114
+ # no normalization, treat tag weights as font size
115
+ return ->(_word, size, _index){size}
116
+ when :linear
117
+ ->(x){x}
118
+ when :log
119
+ ->(x){Math.log(x) / Math.log(10)}
120
+ when :sqrt
121
+ ->(x){Math.sqrt(x)}
122
+ else
123
+ fail ArgumentError, "Unknown scaling algo: #{algo.inspect}"
124
+ end
125
+
126
+ smin = norm.call(words.map(&:last).min)
127
+ smax = norm.call(words.map(&:last).max)
128
+ koeff = (FONT_MAX - FONT_MIN).to_f / (smax - smin)
129
+
130
+ ->(_word, size, _index){
131
+ ssize = norm.call(size)
132
+ ((ssize - smin).to_f * koeff + FONT_MIN).to_i
133
+ }
134
+ end
135
+ # rubocop:enable Metrics/MethodLength, Metrics/CyclomaticComplexity,Metrics/AbcSize
136
+ end
137
+ end
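The scaling done by `make_scaler` above maps normalized weights linearly onto the FONT_MIN..FONT_MAX range. Below is a minimal standalone sketch of that arithmetic; `build_scaler` is a hypothetical helper, not part of the library. The guard for equal weights is an assumption added here: without it, the division appears to produce a zero denominator when all weights are the same.

```ruby
FONT_MIN = 10
FONT_MAX = 100

# Hypothetical helper mirroring the arithmetic of make_scaler.
# norm is the normalization lambda (identity, sqrt, log, ...).
def build_scaler(weights, norm)
  smin = norm.call(weights.min)
  smax = norm.call(weights.max)

  # Assumption, not in the original: when all weights normalize to the
  # same value, give every word the maximum size instead of dividing by zero.
  return ->(_size) { FONT_MAX } if smax == smin

  koeff = (FONT_MAX - FONT_MIN).to_f / (smax - smin)
  ->(size) { ((norm.call(size) - smin) * koeff + FONT_MIN).to_i }
end

linear = ->(x) { x }
sqrt   = ->(x) { Math.sqrt(x) }

weights = [1, 4, 9]
p weights.map(&build_scaler(weights, linear))    # [10, 43, 100]
p weights.map(&build_scaler(weights, sqrt))      # [10, 55, 100]
p [5, 5, 5].map(&build_scaler([5, 5, 5], linear)) # [100, 100, 100]
```

The smallest weight always lands on `FONT_MIN` and the largest on `FONT_MAX`; the normalization only changes how the weights in between are spread.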