magic_cloud 0.0.2

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: b7ed426c54152c66311804b851d214ff6973a37d
4
+ data.tar.gz: 45354e6f6d66aa759bb14ddf19c273b3f03a9687
5
+ SHA512:
6
+ metadata.gz: 85a631a786268da87479796fa4b61e40e0ed043da562662717f92c120484321a362d37baf231eece4790b629cf5a0c76ed1aebc7dd4825d7ecf675353dea0d21
7
+ data.tar.gz: 3e497cb23c8068ab2ac1761e6298c6e6ffa649b0776df1fe113224c682e841e7c1653800f49cf3a9dfaf130fe07e6921212d2fa1285acfe0111d88e73a50e786
data/LICENSE.txt ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2014-15 Victor 'Zverok' Shepelev
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,142 @@
1
+ MagicCloud - simple pretty word cloud for Ruby
2
+ ==============================================
3
+
4
+ **MagicCloud** is a simple, pure-Ruby library for making pretty
5
+ [Wordle](http://www.wordle.net/)-like clouds. It uses RMagick as its graphics
6
+ backend.
7
+
8
+ Usage
9
+ -----
10
+
11
+ ```ruby
12
+ words = [
13
+ ['test', 50],
14
+ ['me', 40],
15
+ ['tenderly', 30],
16
+ # ....
17
+ ]
18
+ cloud = MagicCloud::Cloud.new(words, rotate: :free, scale: :log)
19
+ ```
20
+
21
+ Or from command-line:
22
+
23
+ ```
24
+ ./bin/magic_cloud --textfile samples/cat-in-the-hat.txt -f test.png --rotate free --scale log
25
+ ```
26
+
27
+ Resulting in:
28
+
29
+ <img src="https://raw.github.com/zverok/magic_cloud/master/samples/cat.png" alt="Sample word cloud"/>
30
+
31
+ Installation
32
+ ------------
33
+
34
+ ```
35
+ gem install magic_cloud
36
+ ```
37
+
38
+ RMagick is a requirement, and it needs compilation, so you may expect
39
+ problems in non-compiler-friendly environments (such as Windows).
40
+
41
+ Origins
42
+ -------
43
+
44
+ At first, it was a straightforward port of [d3.layout.cloud.js](https://github.com/jasondavies/d3-cloud)
45
+ by Jason Davies, which, I assume, is an implementation of the Wordle algorithm.
46
+
47
+ Then there was a major refactoring, to make the code correspond to Ruby
48
+ standards (and be understandable to poor dumb me).
49
+
50
+ Then the collision algorithm was rewritten from scratch.
51
+
52
+ And now we are here.
53
+
54
+ References:
55
+ * https://github.com/jasondavies/d3-cloud
56
+ * http://stackoverflow.com/questions/342687/algorithm-to-implement-a-word-cloud-like-wordle
57
+ * http://static.mrfeinberg.com/bv_ch03.pdf
58
+
59
+ Performance
60
+ -----------
61
+
62
+ It's reasonable for me. On my small Thinkpad E330, a 50-word cloud
63
+ image of size 700×500 is typically generated in under 3 seconds. It's not that cool,
64
+ yet not long enough for you to fall asleep. The time of cloud making depends
65
+ on the word count, the size of the image (it's faster to find a place for all words
66
+ on a larger image) and the rotation algorithm used (vertical+horizontal words
67
+ only is significantly faster - and, in my opinion, better looking - than
68
+ a "cool" free-rotated-words cloud).
69
+
70
+ The major performance eater is perfect collision detection, which a Wordle-like
71
+ cloud needs. MagicCloud for now uses a really dumb algorithm with some
72
+ not-so-dumb optimizations. You can look into
73
+ `lib/magic_cloud/collision_board.rb` - everything that can be optimized is
74
+ there, especially in the `CollisionBoard#collides?` method.
75
+
76
+ I assume, for example, that a naive rewrite of the code in there as a C
77
+ extension could help significantly.
78
+
79
+ Another possible way is adding some smart tricks that eliminate as many
80
+ pixel-by-pixel comparisons as possible (some already implemented are the
81
+ criss-cross intersection check and memoizing of the last crossed sprite).
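The idea of skipping pixel-by-pixel work can be illustrated with a cheap bounding-box test that runs before any per-pixel comparison. This is only a sketch under stated assumptions: the `Sprite` struct and the `boxes_overlap?` and `collides?` helpers are hypothetical names made up for illustration, not the code in `collision_board.rb`.

```ruby
# Hypothetical sprite: a position, a size and a flat array of 0/1 pixels.
Sprite = Struct.new(:x, :y, :width, :height, :bits) do
  def right
    x + width
  end

  def bottom
    y + height
  end

  # Is the pixel at absolute board coordinates (px, py) set?
  def pixel?(px, py)
    bits[(py - y) * width + (px - x)] != 0
  end
end

# Cheap rejection test: do the two rectangles share any area at all?
def boxes_overlap?(a, b)
  a.x < b.right && b.x < a.right && a.y < b.bottom && b.y < a.bottom
end

# Pixel-perfect check, restricted to the overlapping region only.
def collides?(a, b)
  return false unless boxes_overlap?(a, b)

  xs = ([a.x, b.x].max...[a.right, b.right].min)
  ys = ([a.y, b.y].max...[a.bottom, b.bottom].min)
  ys.any? { |py| xs.any? { |px| a.pixel?(px, py) && b.pixel?(px, py) } }
end
```

Sprites whose rectangles do not touch are rejected after four comparisons, and the remaining pairs only scan the shared region instead of the whole sprite.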
82
+
83
+ Memory efficiency
84
+ -----------------
85
+
86
+ Basically: it's not.
87
+
88
+ Plain Ruby arrays are used to represent collision bitmasks (each array
89
+ member stands for 1 bit), so, for example, a 700×500 pixel cloud will require
90
+ a collision board of size `700*500` (i.e. 350k array items for the board alone, and
91
+ slightly less for all sprites).
92
+
93
+ It would be wise to use some packing (considering each Ruby Fixnum can
94
+ represent not one, but a whole 32 bits). Unfortunately, all bit array
95
+ libraries I've tried cause a major slowdown of cloud computation.
96
+ With, say, 50 words we'll have literally millions of
97
+ `bitmask#[]` and `bitmask#[]=` operations, so even methods
98
+ like `Fixnum#&` and `Fixnum#|` (typically used for bit array representation)
99
+ cause significant overhead.
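For illustration, the packing described here could look roughly like the sketch below. `PackedBitMatrix` and `BITS_PER_CELL` are hypothetical names, not part of the gem, and the sketch says nothing about speed: the extra `divmod`, shift and mask on every access are exactly the overhead the paragraph above complains about. (On Ruby 2.4 and later, `Fixnum` is simply `Integer`.)

```ruby
# Hypothetical packed bit matrix: each Integer cell holds BITS_PER_CELL pixels.
class PackedBitMatrix
  BITS_PER_CELL = 32

  attr_reader :width, :height

  def initialize(width, height)
    @width = width
    @height = height
    # Round up so that a partially filled last cell is still allocated.
    @cells = Array.new((width * height + BITS_PER_CELL - 1) / BITS_PER_CELL, 0)
  end

  # Set the bit for pixel (x, y).
  def put(x, y)
    cell, offset = (y * @width + x).divmod(BITS_PER_CELL)
    @cells[cell] |= (1 << offset)
  end

  # Is the bit for pixel (x, y) set? Integer#[] reads a single bit.
  def at(x, y)
    cell, offset = (y * @width + x).divmod(BITS_PER_CELL)
    @cells[cell][offset] == 1
  end

  def cell_count
    @cells.size
  end
end
```

With this layout a 700×500 board needs 10,938 array items instead of 350,000.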
100
+
101
+ Configuration
102
+ -------------
103
+
104
+ ```ruby
105
+ cloud = MagicCloud::Cloud.new(words, palette: palette, rotate: rotate)
106
+ ```
107
+
108
+ * `:palette` (default is `:category20`):
109
+ * `:category10`, `:category20`, ... - from [d3](https://github.com/mbostock/d3/wiki/Ordinal-Scales#categorical-colors)
110
+ * `[array, of, colors]` - each color should be a hex color, or any other RMagick color string (see "Color names" at http://www.imagemagick.org/RMagick/doc/imusage.html)
111
+ * any lambda, accepting `(word, index)` and returning color string
112
+ * any object, responding to `color(word, index)` - so, you can make color
113
+ depend on tag text, not only on its number in tags list
114
+ * `:rotate` - rotation algorithm:
115
+ * `:square` (only horizontal and vertical words) - this is the default
116
+ * `:none` - all words are horizontal (looks boooring)
117
+ * `:free` - any word rotation angle, looks cool, but not very readable
118
+ and slower to lay out
119
+ * `[array, of, angles]` - each angle should be a number in 0..360
120
+ * any lambda, accepting `(word, index)` and returning 0..360
121
+ * any object, responding to `rotate(word, index)` and returning 0..360
122
+ * `:scale` - how word sizes are scaled to fit into the (FONT_MIN..FONT_MAX) range:
123
+ * `:no` - no scaling, all word sizes are treated as is;
124
+ * `:linear` - linear scaling;
125
+ * `:log` - logarithmic scaling (default);
126
+ * `:sqrt` - square root scaling;
127
+ * `:font_family` (Impact is default).
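As a rough illustration of the callable options above, a palette lambda and a rotator object might look like this. The colors, the length threshold and the `AlternatingRotator` class are made up for the example; only the `(word, index)` signatures and the `rotate` method name come from the option list.

```ruby
# Hypothetical palette: the color depends on the word's text, not its index.
palette = ->(word, _index) { word.length > 5 ? '#1f77b4' : '#ff7f0e' }

# Hypothetical rotator object: any object responding to rotate(word, index).
class AlternatingRotator
  def rotate(_word, index)
    index.even? ? 0 : 90
  end
end

rotator = AlternatingRotator.new

# With the gem installed, these would presumably be passed as:
#   MagicCloud::Cloud.new(words, palette: palette, rotate: rotator)
```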
128
+
129
+ Current state
130
+ -------------
131
+
132
+ This library is extracted from a real-life project. It should be
133
+ pretty stable (apart from bugs introduced during extraction and gemification).
134
+
135
+ What it really lacks for now is thorough (or any) testing, and
136
+ some more configuration options.
137
+
138
+ Also, while the core algorithms (collision_board.rb, layouter.rb) are pretty
139
+ accurately written and documented, the "wrapping code" (options parsing and
140
+ so on) is a bit more chaotic - it's subject to refactoring and cleanup.
141
+
142
+ All feedback, usage examples, bug reports and feature requests are appreciated!
data/bin/magic_cloud ADDED
@@ -0,0 +1,86 @@
1
+ #!/usr/bin/env ruby
2
+ #encoding: utf-8
3
+ require 'rubygems'
4
+ require 'slop'
5
+
6
+ $:.unshift 'lib'
7
+ require 'magic_cloud'
8
+
9
+ opts = Slop.parse(help: true, strict: true) do
10
+ on :width=, "Cloud width, pixels (default 960)", as: Integer, default: 960
11
+ on :height=, "Cloud height, pixels (default 600)", as: Integer, default: 600
12
+
13
+ on :words=,
14
+ "List of words with weights, comma-separated, weight after colon, like 'cat:40,dog:30,eat:20'",
15
+ as: Array
16
+ on :textfile=, "Path to file with some text, magic_cloud will calculate frequencies of words in this text and draw the cloud of them"
17
+ on :maxwords=, "Max words to make a cloud", as: Integer, default: 100
18
+
19
+ on :palette=, "Palette name, see README for details"
20
+ on :rotate=, "Rotation algo, see README for details"
21
+ on :scale=, "Scaling algo, see README for details"
22
+ on :font_family=, "Font family (Impact by default)"
23
+
24
+ on :f, :file=, "Output file path", required: true
25
+
26
+ on :stats, "Output debug log and stats (different operations count while looking for places for words)"
27
+ on :profile, "Run profiler"
28
+ end
29
+
30
+ include MagicCloud
31
+
32
+ Debug.logger.level = Logger::INFO if opts.stats?
33
+
34
+ words = case
35
+ when opts[:words]
36
+ opts[:words].
37
+ map{|w| w.split(':')}.
38
+ map{|w,c| [w, c.to_i]}.
39
+ each{|w,c| c.zero? and fail(ArgumentError, "Count for word #{w} not defined")}
40
+ when opts[:textfile]
41
+ WORD_SEPARATORS = /[\s\u3031-\u3035\u309b\u309c\u30a0\u30fc\uff70]+/
42
+ STOPWORDS = /^(i|me|my|myself|we|us|our|ours|ourselves|you|your|yours|yourself|yourselves|he|him|his|himself|she|her|hers|herself|it|its|itself|they|them|their|theirs|themselves|what|which|who|whom|whose|this|that|these|those|am|is|are|was|were|be|been|being|have|has|had|having|do|does|did|doing|will|would|should|can|could|ought|i'm|you're|he's|she's|it's|we're|they're|i've|you've|we've|they've|i'd|you'd|he'd|she'd|we'd|they'd|i'll|you'll|he'll|she'll|we'll|they'll|isn't|aren't|wasn't|weren't|hasn't|haven't|hadn't|doesn't|don't|didn't|won't|wouldn't|shan't|shouldn't|can't|cannot|couldn't|mustn't|let's|that's|who's|what's|here's|there's|when's|where's|why's|how's|a|an|the|and|but|if|or|because|as|until|while|of|at|by|for|with|about|against|between|into|through|during|before|after|above|below|to|from|up|upon|down|in|out|on|off|over|under|again|further|then|once|here|there|when|where|why|how|all|any|both|each|few|more|most|other|some|such|no|nor|not|only|own|same|so|than|too|very|say|says|said|shall)$/
43
+
44
+ File.read(opts[:textfile]).
45
+ split(WORD_SEPARATORS).
46
+ map{|word| word.gsub(/[[:punct:]]/, '')}.
47
+ map(&:downcase).
48
+ reject{|word| word =~ STOPWORDS}.
49
+ group_by{|w| w}.
50
+ map{|word, group| [word, group.size]}
51
+ else
52
+ fail ArgumentError, "You should provide either --words or --textfile option, where can I take words for cloud?.."
53
+ end
54
+
55
+ words = words.sort_by(&:last).reverse.first(opts[:maxwords])
56
+
57
+ options = {
58
+ palette: opts[:palette] && opts[:palette].to_sym,
59
+ rotate: opts[:rotate] && opts[:rotate].to_sym,
60
+ scale: opts[:scale] && opts[:scale].to_sym,
61
+ font_family: opts[:font_family]
62
+ }.reject{|k, v| v.nil?}
63
+
64
+ if opts.profile?
65
+ require 'ruby-prof'
66
+ RubyProf.start
67
+ end
68
+
69
+ start = Time.now
70
+
71
+ cloud = Cloud.new(words, options)
72
+ img = cloud.draw(opts[:width], opts[:height])
73
+
74
+ if opts.profile?
75
+ result = RubyProf.stop
76
+ require 'fileutils'
77
+ FileUtils.mkdir_p 'profile'
78
+ RubyProf::GraphHtmlPrinter.new(result).
79
+ print(File.open('profile/result.html', 'w'))
80
+ end
81
+
82
+ img.write(opts[:file])
83
+
84
+ p Debug.stats if opts.stats?
85
+
86
+ puts "Ready in %.2f seconds" % (Time.now - start)
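The `--textfile` branch of the script above boils down to a small word-frequency pipeline. A condensed, standalone sketch of that idea follows; `word_frequencies` and the shortened `SKETCH_STOPWORDS` list are hypothetical, and the tie-breaking by word is an addition for deterministic output, not something the script does.

```ruby
# Shortened, hypothetical stopword list (the script uses a much longer regexp).
SKETCH_STOPWORDS = %w[a an the and of to in on].freeze

# Returns [[word, count], ...], most frequent first, at most max_words entries.
def word_frequencies(text, max_words = 100)
  counts = Hash.new(0)
  text.split(/\s+/).each do |raw|
    word = raw.gsub(/[[:punct:]]/, '').downcase
    next if word.empty? || SKETCH_STOPWORDS.include?(word)

    counts[word] += 1
  end
  # Sort by descending count, then alphabetically so ties are stable.
  counts.sort_by { |word, count| [-count, word] }.first(max_words)
end
```

The resulting array has the same `[word, weight]` shape that `Cloud.new` receives in the script.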
@@ -0,0 +1,7 @@
1
+ # encoding: utf-8
2
+
3
+ # Wordle-like word cloud main module
4
+ module MagicCloud
5
+ end
6
+
7
+ require_relative 'magic_cloud/cloud'
@@ -0,0 +1,33 @@
1
+ # encoding: utf-8
2
+ module MagicCloud
3
+ # Dead simple 2-dimensional "bit matrix", storing 1s and 0s.
4
+ # Not memory-efficient at all, but the fastest pure-Ruby solution
5
+ # I've tried.
6
+ class BitMatrix
7
+ def initialize(width, height)
8
+ @width, @height = width, height
9
+ @bits = [0] * height*width
10
+ end
11
+
12
+ attr_reader :bits, :width, :height
13
+
14
+ def put(x, y, px = 1)
15
+ x < width or fail("#{x} outside matrix: #{width}")
16
+ y < height or fail("#{y} outside matrix: #{height}")
17
+
18
+ bits[y*@width + x] = 1 unless px == 0 # It's faster with unless
19
+ end
20
+
21
+ # returns true/false
22
+ # FIXME: maybe #put should also accept true/false
23
+ def at(x, y)
24
+ bits[y*@width + x] != 0 # faster than .zero?
25
+ end
26
+
27
+ def dump
28
+ (0...height).map{|y|
29
+ (0...width).map{|x| at(x, y) ? ' ' : 'x'}.join
30
+ }.join("\n")
31
+ end
32
+ end
33
+ end
@@ -0,0 +1,93 @@
1
+ # encoding: utf-8
2
+ require 'RMagick'
3
+
4
+ module MagicCloud
5
+ # Thin wrapper around RMagick, encapsulating ALL the real drawing.
6
+ # As it's the only class that "knows" about the underlying graphics library,
7
+ # it should be possible to replace it with another canvas with same
8
+ # interface, not using RMagick.
9
+ class Canvas
10
+ def initialize(w, h, back = 'transparent')
11
+ @width, @height = w, h
12
+ @internal = Magick::Image.new(w, h){|i| i.background_color = back}
13
+ end
14
+
15
+ attr_reader :internal, :width, :height
16
+
17
+ RADIANS = Math::PI / 180
18
+
19
+ def draw_text(text, options = {})
20
+ draw = Magick::Draw.new # FIXME: is it necessary every time?
21
+
22
+ x = options.fetch(:x, 0)
23
+ y = options.fetch(:y, 0)
24
+ rotate = options.fetch(:rotate, 0)
25
+
26
+ set_text_options(draw, options)
27
+
28
+ rect = _measure_text(draw, text, rotate)
29
+
30
+ draw.
31
+ translate(x + rect.width/2, y + rect.height/2).
32
+ rotate(rotate).
33
+ translate(0, rect.height/8). # RMagick text_align seems really weird
34
+ text(0, 0, text).
35
+ draw(@internal)
36
+
37
+ rect
38
+ end
39
+
40
+ def measure_text(text, options)
41
+ draw = Magick::Draw.new
42
+ set_text_options(draw, options)
43
+ _measure_text(draw, text, options.fetch(:rotate, 0))
44
+ end
45
+
46
+ def pixels(x, y, w, h)
47
+ @internal.export_pixels(x, y, w, h, 'RGBA')
48
+ end
49
+
50
+ # rubocop:disable TrivialAccessors
51
+ def render
52
+ @internal
53
+ end
54
+ # rubocop:enable TrivialAccessors
55
+
56
+ private
57
+
58
+ def set_text_options(draw, options)
59
+ draw.font_family = options[:font_family]
60
+ draw.font_weight = Magick::NormalWeight
61
+ draw.font_style = Magick::NormalStyle
62
+
63
+ draw.pointsize = options[:font_size]
64
+ draw.fill_color(options[:color])
65
+ draw.gravity(Magick::CenterGravity)
66
+ draw.text_align(Magick::CenterAlign)
67
+ end
68
+
69
+ def _measure_text(draw, text, rotate)
70
+ metrics = draw.get_type_metrics('"' + text + 'm"')
71
+ w, h = rotated_metrics(metrics.width, metrics.height, rotate)
72
+
73
+ Rect.new(0, 0, w, h)
74
+ end
75
+
76
+ def rotated_metrics(w, h, degrees)
77
+ radians = degrees * Math::PI / 180
78
+
79
+ # FIXME: not too clear, just straightforward from d3.cloud
80
+ sr = Math.sin(radians)
81
+ cr = Math.cos(radians)
82
+ wcr = w * cr
83
+ wsr = w * sr
84
+ hcr = h * cr
85
+ hsr = h * sr
86
+
87
+ w = [(wcr + hsr).abs, (wcr - hsr).abs].max.to_i
88
+ h = [(wsr + hcr).abs, (wsr - hcr).abs].max.to_i
89
+
90
+ [w, h]
91
+ end
92
+ end
93
+ end
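The `rotated_metrics` arithmetic above computes the axis-aligned bounding box of a rotated `w`×`h` rectangle. A standalone sketch of the same formula, runnable without RMagick, makes this easier to check by hand; `rotated_box` is a hypothetical name used only here.

```ruby
# Bounding box [width, height] of a w x h rectangle rotated by the given degrees.
def rotated_box(w, h, degrees)
  radians = degrees * Math::PI / 180
  sr = Math.sin(radians)
  cr = Math.cos(radians)

  [
    [(w * cr + h * sr).abs, (w * cr - h * sr).abs].max.to_i,
    [(w * sr + h * cr).abs, (w * sr - h * cr).abs].max.to_i
  ]
end
```

For a 100×20 rectangle, a rotation of 0 degrees leaves the box at 100×20, while 90 degrees swaps it to 20×100.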
@@ -0,0 +1,137 @@
1
+ # encoding: utf-8
2
+ require_relative './rect'
3
+ require_relative './canvas'
4
+ require_relative './palettes'
5
+
6
+ require_relative './word'
7
+
8
+ require_relative './layouter'
9
+ require_relative './spriter'
10
+
11
+ require_relative './debug'
12
+
13
+ module MagicCloud
14
+ # Main word-cloud class. Takes words with sizes, returns image
15
+ class Cloud
16
+ def initialize(words, options = {})
17
+ @words = words.sort_by(&:last).reverse
18
+ @options = options
19
+ @scaler = make_scaler(words, options[:scale] || :log)
20
+ @rotator = make_rotator(options[:rotate] || :square)
21
+ @palette = make_palette(options[:palette] || :default)
22
+ end
23
+
24
+ DEFAULT_FAMILY = 'Impact'
25
+
26
+ def draw(width, height)
27
+ # FIXME: do it in init, for specs would be happy
28
+ shapes = @words.each_with_index.map{|(word, size), i|
29
+ Word.new(
30
+ word,
31
+ font_family: @options[:font_family] || DEFAULT_FAMILY,
32
+ font_size: scaler.call(word, size, i),
33
+ color: palette.call(word, i),
34
+ rotate: rotator.call(word, i)
35
+ )
36
+ }
37
+
38
+ Debug.reset!
39
+
40
+ spriter = Spriter.new
41
+ spriter.make_sprites!(shapes)
42
+
43
+ layouter = Layouter.new(width, height)
44
+ visible = layouter.layout!(shapes)
45
+
46
+ canvas = Canvas.new(width, height, 'white')
47
+ visible.each{|sh| sh.draw(canvas)}
48
+
49
+ canvas.render
50
+ end
51
+
52
+ private
53
+
54
+ attr_reader :palette, :rotator, :scaler
55
+
56
+ # rubocop:disable Metrics/MethodLength, Metrics/CyclomaticComplexity,Metrics/AbcSize
57
+ def make_palette(source)
58
+ case source
59
+ when :default
60
+ make_const_palette(:category20)
61
+ when Symbol
62
+ make_const_palette(source)
63
+ when Array
64
+ ->(_, index){source[index % source.size]}
65
+ when Proc
66
+ source
67
+ when ->(s){s.respond_to?(:color)}
68
+ ->(word, index){source.color(word, index)}
69
+ else
70
+ fail ArgumentError, "Unknown palette: #{source.inspect}"
71
+ end
72
+ end
73
+
74
+ def make_const_palette(sym)
75
+ palette = PALETTES[sym] or
76
+ fail(ArgumentError, "Unknown palette: #{sym.inspect}")
77
+
78
+ ->(_, index){palette[index % palette.size]}
79
+ end
80
+
81
+ def make_rotator(source)
82
+ case source
83
+ when :none
84
+ ->(*){0}
85
+ when :square
86
+ ->(*){
87
+ (rand * 2).to_i * 90
88
+ }
89
+ when :free
90
+ ->(*){
91
+ (((rand * 6) - 3) * 30).round
92
+ }
93
+ when Array
94
+ ->(*){
95
+ source.sample
96
+ }
97
+ when Proc
98
+ source
99
+ when ->(s){s.respond_to?(:rotate)}
100
+ ->(word, index){source.rotate(word, index)}
101
+ else
102
+ fail ArgumentError, "Unknown rotation algo: #{source.inspect}"
103
+ end
104
+ end
105
+
106
+ # FIXME: should be options too
107
+ FONT_MIN = 10
108
+ FONT_MAX = 100
109
+
110
+ def make_scaler(words, algo)
111
+ norm =
112
+ case algo
113
+ when :no
114
+ # no normalization, treat tag weights as font size
115
+ return ->(_word, size, _index){size}
116
+ when :linear
117
+ ->(x){x}
118
+ when :log
119
+ ->(x){Math.log(x) / Math.log(10)}
120
+ when :sqrt
121
+ ->(x){Math.sqrt(x)}
122
+ else
123
+ fail ArgumentError, "Unknown scaling algo: #{algo.inspect}"
124
+ end
125
+
126
+ smin = norm.call(words.map(&:last).min)
127
+ smax = norm.call(words.map(&:last).max)
128
+ koeff = (FONT_MAX - FONT_MIN).to_f / (smax - smin)
129
+
130
+ ->(_word, size, _index){
131
+ ssize = norm.call(size)
132
+ ((ssize - smin).to_f * koeff + FONT_MIN).to_i
133
+ }
134
+ end
135
+ # rubocop:enable Metrics/MethodLength, Metrics/CyclomaticComplexity,Metrics/AbcSize
136
+ end
137
+ end
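The scaling step in `make_scaler` maps normalized weights linearly onto the font range. A standalone sketch of that mapping follows, assuming the same 10..100 range as `FONT_MIN` and `FONT_MAX`. `make_scaler_sketch` is a hypothetical name, and the guard for equal weights is an addition made here: without it, the division by `smax - smin` would be a division by zero.

```ruby
# Builds a lambda mapping a raw weight to a font size in font_min..font_max.
# norm is the normalization function (identity, log, sqrt, ...).
def make_scaler_sketch(sizes, norm, font_min: 10, font_max: 100)
  smin, smax = sizes.map { |s| norm.call(s) }.minmax

  # Assumption made for this sketch: equal weights all get the largest font.
  return ->(_size) { font_max } if smax == smin

  koeff = (font_max - font_min).to_f / (smax - smin)
  ->(size) { ((norm.call(size) - smin) * koeff + font_min).to_i }
end

# Square-root scaling over weights 1, 4 and 9.
sqrt_scaler = make_scaler_sketch([1, 4, 9], ->(x) { Math.sqrt(x) })
```

With these weights, the smallest word gets size 10, the largest gets 100, and weight 4 lands exactly halfway at 55.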