text-format 1.0.0
- data/Changelog +89 -0
- data/Install +6 -0
- data/README +13 -0
- data/Rakefile +113 -0
- data/ToDo +8 -0
- data/lib/text/format.rb +1043 -0
- data/lib/text/format/alpha.rb +61 -0
- data/lib/text/format/number.rb +34 -0
- data/lib/text/format/roman.rb +105 -0
- data/metaconfig +0 -0
- data/pre-setup.rb +46 -0
- data/setup.rb +1366 -0
- data/tests/tc_text_format.rb +474 -0
- data/tests/testall.rb +20 -0
- metadata +76 -0
data/Changelog
ADDED
@@ -0,0 +1,89 @@
|
|
1
|
+
== Text::Format 1.0.0
|
2
|
+
* Changed installer: added a .gem package.
|
3
|
+
* Changed installer: moving to a variant of setup.rb by Minero Aoki.
|
4
|
+
* Fixed significant problems with #hard_margin wrapping and fallback issues,
|
5
|
+
eliminating all known possibilities for an infinite loop in wrapping. Some
|
6
|
+
of the formatting changes involved with this result in different and more
|
7
|
+
subtle wrapping and splitting of words; please read the full documentation
|
8
|
+
for details.
|
9
|
+
* Clarified the API for #hyphenate_to (delineated the return value required if
|
10
|
+
the hyphenator cannot hyphenate the word to the specified size).
|
11
|
+
* Changed a number of public and private API calls to work better. As long as
|
12
|
+
the constants provided by Text::Format have been used (and not direct access
|
13
|
+
to the constant values), there will be no issues presented by most of these
|
14
|
+
changes.
|
15
|
+
* Changed the initialization of the Text::Format object. The documentation has
|
16
|
+
also been updated to be correct. Note that this will mean that some uses of
|
17
|
+
Text::Format will not work, as Text::Format.new now yields +self+ if a block
|
18
|
+
is given instead of evaluating the block with Object#instance_eval.
|
19
|
+
* Added text numbering generators (Text::Format::Alpha, Text::Format::Number,
|
20
|
+
and Text::Format::Roman) to work with #tag_paragraphs and #tag_text to
|
21
|
+
generate numbered paragraphs.
|
22
|
+
* #nobreak_regex must be a hash of regular expressions, not strings that are
|
23
|
+
converted to regular expressions. This Perlism has finally been removed.
|
24
|
+
* The performance has been improved; the number of times that lines are joined
|
25
|
+
together and then split apart has been reduced.
|
26
|
+
* Changed the dependency to Text::Hyphen from TeX::Hyphen.
|
27
|
+
* Added auto-split capabilities to #paragraphs. See the updated documentation.
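The initialization change listed above (the block now receives the new object instead of being run through Object#instance_eval) can be pictured with a minimal sketch. `SketchFormatter`, `SKETCH_WIDTH`, and `SKETCH_FMT` are hypothetical stand-ins used only for illustration; this is not the library's own constructor.

```ruby
# Hypothetical stand-in class; it only imitates the "yield self" behaviour.
class SketchFormatter
  attr_accessor :columns

  def initialize(columns = 72)
    @columns = columns
    # 1.0.0 style: pass the new object to the block as a parameter.
    yield self if block_given?
  end
end

SKETCH_WIDTH = 40

# The block runs in the caller's scope, so the object is named explicitly
# and is configured through its public writers.
SKETCH_FMT = SketchFormatter.new { |f| f.columns = SKETCH_WIDTH }
```

A block written for the older instance_eval style, which relied on +self+ being the formatter, would need to be rewritten to take the block parameter.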
|
28
|
+
|
29
|
+
== Text::Format 0.64
|
30
|
+
* Fixed a bug where a NoMethod exception would be raised if #paragraphs was
|
31
|
+
called with either (" ") or ([" "]).
|
32
|
+
|
33
|
+
== Text::Format 0.63
|
34
|
+
* Fixed a bug where a crash would occur when a hyphenator returned nil instead
|
35
|
+
of "".
|
36
|
+
|
37
|
+
== Text::Format 0.62
|
38
|
+
* Modified the API for hyphenators. Previously, a hyphenator could only be
|
39
|
+
defined as an object containing a method #hyphenate_to with the signature:
|
40
|
+
#hyphenate_to(word, size)
|
41
|
+
Now, the #hyphenate_to method may be the above signature or:
|
42
|
+
#hyphenate_to(word, size, formatter)
|
43
|
+
So that the hyphenator may access information about the formatting object,
|
44
|
+
if necessary. Thanks to Tim Bates for suggesting a case where this would be
|
45
|
+
useful.
|
46
|
+
* Fixed a bug for strings matching /\A\s*\Z/ raising a NameError.
|
47
|
+
* Fixed a test case that failed under 1.6.8. The following no longer works:
|
48
|
+
l, m1, m2 = /((?:\S+\s+){11})(.+)/.match(line)
|
49
|
+
This has been replaced with an explicit use of l[1] and l[2]. Thanks to Tim
|
50
|
+
Bates for finding this problem.
|
51
|
+
* Changed installer to Phil Thomson's install-package wrapper.
|
52
|
+
|
53
|
+
== Text::Format 0.61
|
54
|
+
* Fixed a problem with the installer. Note that Text::Format is no longer case
|
55
|
+
sensitive for require purposes. It will be required as:
|
56
|
+
require 'text/format'
|
57
|
+
Versions earlier than 0.60 were case-sensitive. Please be aware of this if
|
58
|
+
you are installing Text::Format over an older version. It may not replace
|
59
|
+
the existing library in the way that you expect.
|
60
|
+
|
61
|
+
== Text::Format 0.60
|
62
|
+
* Added Symbol equivalents for the Hash initialization. Hash initialization
|
63
|
+
has been modified so that values are set as follows (Symbols are highest
|
64
|
+
priority; strings are middle; defaults are lowest):
|
65
|
+
@columns = arg[:columns] || arg['columns'] || @columns
|
66
|
+
* Fixed a problem with Text::Format::RIGHT_FILL handling where a single word
|
67
|
+
is larger than #columns.
|
68
|
+
* Removed Comparable mixin (<=> doesn't make sense; == does).
|
69
|
+
* Added #hard_margins, #split_rules, #hyphenator, and #split_words. Text
|
70
|
+
formatted with #hard_margins will have words larger than #columns split
|
71
|
+
forcibly. Words forcibly split will be placed into #split_words. See the
|
72
|
+
documentation for important information on how this feature works.
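The lookup order described for Hash initialization in this release (Symbol key, then String key, then the existing default) can be sketched in isolation. `sketch_pick` is a hypothetical helper, not part of Text::Format.

```ruby
# Hypothetical helper mirroring the quoted expression:
#   @columns = arg[:columns] || arg['columns'] || @columns
def sketch_pick(arg, name, default)
  # Symbol key first, then the String key, then the current default.
  arg[name.to_sym] || arg[name.to_s] || default
end
```

Because the chain uses `||`, a value of +nil+ or +false+ under either key falls through to the next candidate, so with this pattern an explicit +false+ cannot override a default of +true+.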
|
73
|
+
|
74
|
+
== Text::Format 0.52.2
|
75
|
+
* Fixed the ordering of #<=> in cases of Boolean values.
|
76
|
+
* Fixed #expand and #unexpand Array handling.
|
77
|
+
* Added a Changelog.
|
78
|
+
|
79
|
+
== Text::Format 0.52.1
|
80
|
+
* Fixed a problem when tabs aren't counted properly.
|
81
|
+
* Changed #abbreviations from Hash to Array to better suit Ruby's
|
82
|
+
capabilities.
|
83
|
+
* Fixed problems with the way that Array arguments are handled in calls to the
|
84
|
+
major object types.
|
85
|
+
|
86
|
+
== Text::Format 0.52
|
87
|
+
* Initial release
|
88
|
+
|
89
|
+
$Id: Changelog,v 1.6 2005/06/24 19:49:09 austin Exp $
|
data/Install
ADDED
data/README
ADDED
@@ -0,0 +1,13 @@
|
|
1
|
+
Text::Format 1.0.0
|
2
|
+
==================
|
3
|
+
Text::Format provides the ability to nicely format fixed-width text with
|
4
|
+
knowledge of the writeable space (number of columns), margins, and indentation
|
5
|
+
settings. Text::Format can work with either TeX::Hyphen or Text::Hyphen to
|
6
|
+
hyphenate words when formatting.
|
7
|
+
|
8
|
+
This is release 1.0, containing both feature enhancements and bug fixes over
|
9
|
+
the previous version, 0.64.
|
10
|
+
|
11
|
+
Text::Format is originally based on the Perl library of the same name by Gábor
|
12
|
+
Egressy. It is copyright 2002 - 2005 by Austin Ziegler and is licenced under
|
13
|
+
Ruby's licence. It is also available under the Perl Artistic licence.
|
data/Rakefile
ADDED
@@ -0,0 +1,113 @@
|
|
1
|
+
#! /usr/bin/env rake
|
2
|
+
$LOAD_PATH.unshift('lib')
|
3
|
+
|
4
|
+
require 'rubygems'
|
5
|
+
require 'rake/gempackagetask'
|
6
|
+
require 'text/format'
|
7
|
+
require 'archive/tar/minitar'
|
8
|
+
require 'zlib'
|
9
|
+
|
10
|
+
DISTDIR = "text-format-#{Text::Format::VERSION}"
|
11
|
+
TARDIST = "../#{DISTDIR}.tar.gz"
|
12
|
+
|
13
|
+
DATE_RE = %r<(\d{4})[./-]?(\d{2})[./-]?(\d{2})(?:[\sT]?(\d{2})[:.]?(\d{2})[:.]?(\d{2})?)?>
|
14
|
+
|
15
|
+
if ENV['RELEASE_DATE']
|
16
|
+
year, month, day, hour, minute, second = DATE_RE.match(ENV['RELEASE_DATE']).captures
|
17
|
+
year ||= 0
|
18
|
+
month ||= 0
|
19
|
+
day ||= 0
|
20
|
+
hour ||= 0
|
21
|
+
minute ||= 0
|
22
|
+
second ||= 0
|
23
|
+
ReleaseDate = Time.mktime(year, month, day, hour, minute, second)
|
24
|
+
else
|
25
|
+
ReleaseDate = nil
|
26
|
+
end
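As a rough illustration of how the RELEASE_DATE pattern above behaves, the sketch below copies the regular expression and wraps it in a hypothetical helper, `sketch_release_time`. The nil guard for input that does not match is an addition made for this sketch.

```ruby
# Same pattern as DATE_RE, copied under a different name for this sketch.
SKETCH_DATE_RE = %r<(\d{4})[./-]?(\d{2})[./-]?(\d{2})(?:[\sT]?(\d{2})[:.]?(\d{2})[:.]?(\d{2})?)?>

# Hypothetical helper: returns a Time, or nil when the string has no date.
def sketch_release_time(str)
  m = SKETCH_DATE_RE.match(str)
  return nil unless m
  # Missing time fields are captured as nil; nil.to_i is 0.
  parts = m.captures.map { |c| c.to_i }
  Time.mktime(*parts)
end
```

Separators are optional in the pattern, so "2005-06-24", "2005.06.24", and "20050624" all yield the same date, and the time portion may be omitted.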
|
27
|
+
|
28
|
+
task :test do |t|
|
29
|
+
require 'test/unit/testsuite'
|
30
|
+
require 'test/unit/ui/console/testrunner'
|
31
|
+
|
32
|
+
runner = Test::Unit::UI::Console::TestRunner
|
33
|
+
|
34
|
+
$LOAD_PATH.unshift('tests')
|
35
|
+
$stderr.puts "Checking for test cases:" if t.verbose
|
36
|
+
Dir['tests/tc_*.rb'].each do |testcase|
|
37
|
+
$stderr.puts "\t#{testcase}" if t.verbose
|
38
|
+
load testcase
|
39
|
+
end
|
40
|
+
|
41
|
+
suite = Test::Unit::TestSuite.new("Text::Format")
|
42
|
+
|
43
|
+
ObjectSpace.each_object(Class) do |testcase|
|
44
|
+
suite << testcase.suite if testcase < Test::Unit::TestCase
|
45
|
+
end
|
46
|
+
|
47
|
+
runner.run(suite)
|
48
|
+
end
|
49
|
+
|
50
|
+
spec = eval(File.read("text-format.gemspec"))
|
51
|
+
spec.version = Text::Format::VERSION
|
52
|
+
desc "Build the RubyGem for Text::Format"
|
53
|
+
task :gem => [ :test ]
|
54
|
+
Rake::GemPackageTask.new(spec) do |g|
|
55
|
+
g.need_tar = false
|
56
|
+
g.need_zip = false
|
57
|
+
g.package_dir = ".."
|
58
|
+
end
|
59
|
+
|
60
|
+
desc "Build a Text::Format .tar.gz distribution."
|
61
|
+
task :tar => [ TARDIST ]
|
62
|
+
file TARDIST => [ :test ] do |t|
|
63
|
+
current = File.basename(Dir.pwd)
|
64
|
+
Dir.chdir("..") do
|
65
|
+
begin
|
66
|
+
files = Dir["#{current}/**/*"].select { |dd| dd !~ %r{(?:/CVS/?|~$)} }
|
67
|
+
files.map! do |dd|
|
68
|
+
ddnew = dd.gsub(/^#{current}/, DISTDIR)
|
69
|
+
mtime = ReleaseDate || File.stat(dd).mtime
|
70
|
+
if File.directory?(dd)
|
71
|
+
{ :name => ddnew, :mode => 0755, :dir => true, :mtime => mtime }
|
72
|
+
else
|
73
|
+
if dd =~ %r{bin/}
|
74
|
+
mode = 0755
|
75
|
+
else
|
76
|
+
mode = 0644
|
77
|
+
end
|
78
|
+
data = File.read(dd)
|
79
|
+
{ :name => ddnew, :mode => mode, :data => data, :size => data.size,
|
80
|
+
:mtime => mtime }
|
81
|
+
end
|
82
|
+
end
|
83
|
+
|
84
|
+
ff = File.open(t.name.gsub(%r{^\.\./}o, ''), "wb")
|
85
|
+
gz = Zlib::GzipWriter.new(ff)
|
86
|
+
tw = Archive::Tar::Minitar::Writer.new(gz)
|
87
|
+
|
88
|
+
files.each do |entry|
|
89
|
+
if entry[:dir]
|
90
|
+
tw.mkdir(entry[:name], entry)
|
91
|
+
else
|
92
|
+
tw.add_file_simple(entry[:name], entry) { |os| os.write(entry[:data]) }
|
93
|
+
end
|
94
|
+
end
|
95
|
+
ensure
|
96
|
+
tw.close if tw
|
97
|
+
gz.close if gz
|
98
|
+
end
|
99
|
+
end
|
100
|
+
end
|
101
|
+
task TARDIST => [ :test ]
|
102
|
+
|
103
|
+
desc "Build the RDoc documentation for Text::Format"
|
104
|
+
task :docs do
|
105
|
+
require 'rdoc/rdoc'
|
106
|
+
rdoc_options = %w(--title Text::Format --main README --line-numbers)
|
107
|
+
files = FileList[*%w(README Changelog Install bin/**/*.rb lib/**/*.rb)]
|
108
|
+
rdoc_options += files.to_a
|
109
|
+
RDoc::RDoc.new.document(rdoc_options)
|
110
|
+
end
|
111
|
+
|
112
|
+
desc "Build everything."
|
113
|
+
task :default => [ :tar, :gem ]
|
data/ToDo
ADDED
@@ -0,0 +1,8 @@
|
|
1
|
+
Text::Format To Do
|
2
|
+
==================
|
3
|
+
* Margin markers: the ability to place markers in the margin when
|
4
|
+
formatting.
|
5
|
+
* Line numbering: the ability to number lines in the margin when
|
6
|
+
formatting, including numbering lines by step (every fifth line, etc.).
|
7
|
+
* Email attribution quoting reformatting.
|
8
|
+
* Proportional width support for GUI formatting.
|
data/lib/text/format.rb
ADDED
@@ -0,0 +1,1043 @@
|
|
1
|
+
# :title: Text::Format
|
2
|
+
# :main: Text::Format
|
3
|
+
#--
|
4
|
+
# Text::Format for Ruby
|
5
|
+
# Version 1.0.0
|
6
|
+
#
|
7
|
+
# Copyright (c) 2002 - 2005 Austin Ziegler
|
8
|
+
#
|
9
|
+
# $Id: format.rb,v 1.5 2005/04/20 01:43:55 austin Exp $
|
10
|
+
#++
|
11
|
+
unless defined?(Text)
|
12
|
+
module Text; end
|
13
|
+
end
|
14
|
+
|
15
|
+
# = Introduction
|
16
|
+
#
|
17
|
+
# Text::Format provides the ability to nicely format fixed-width text with
|
18
|
+
# knowledge of the writeable space (number of columns), margins, and
|
19
|
+
# indentation settings.
|
20
|
+
#
|
21
|
+
# Copyright:: Copyright (c) 2002 - 2005 by Austin Ziegler
|
22
|
+
# Version:: 1.0.0
|
23
|
+
# Based On:: Perl
|
24
|
+
# Text::Format[http://search.cpan.org/author/GABOR/Text-Format0.52/lib/Text/Format.pm],
|
25
|
+
# Copyright (c) 1998 Gábor Egressy
|
26
|
+
# Licence:: Ruby's, Perl Artistic, or GPL version 2 (or later)
|
27
|
+
#
|
28
|
+
class Text::Format
|
29
|
+
VERSION = '1.0.0'
|
30
|
+
|
31
|
+
SPACES_RE = %r{\s+}mo.freeze
|
32
|
+
NEWLINE_RE = %r{\n}o.freeze
|
33
|
+
TAB = "\t".freeze
|
34
|
+
NEWLINE = "\n".freeze
|
35
|
+
|
36
|
+
# Global common English abbreviations. More can be added with
|
37
|
+
# #abbreviations.
|
38
|
+
ABBREV = %w(Mr Mrs Ms Jr Sr Dr)
|
39
|
+
|
40
|
+
# Formats text flush to the left margin with a visual and physical
|
41
|
+
# ragged right margin.
|
42
|
+
#
|
43
|
+
# >A paragraph that is<
|
44
|
+
# >left aligned.<
|
45
|
+
LEFT_ALIGN = :left
|
46
|
+
# Formats text flush to the right margin with a visual ragged left
|
47
|
+
# margin. The actual left margin is padded with spaces from the
|
48
|
+
# beginning of the line to the start of the text such that the right
|
49
|
+
# margin will be flush.
|
50
|
+
#
|
51
|
+
# >A paragraph that is<
|
52
|
+
# > right aligned.<
|
53
|
+
RIGHT_ALIGN = :right
|
54
|
+
# Formats text flush to the left margin with a visual ragged right
|
55
|
+
# margin. The line is padded with spaces from the end of the text to the
|
56
|
+
# right margin.
|
57
|
+
#
|
58
|
+
# >A paragraph that is<
|
59
|
+
# >right filled. <
|
60
|
+
RIGHT_FILL = :fill
|
61
|
+
# Formats the text flush to both the left and right margins. The last
|
62
|
+
# line will not be justified if it consists of a single word (it will be
|
63
|
+
# treated as +RIGHT_FILL+ in this case). Spacing between words is
|
64
|
+
# increased to ensure that the text is flush with both margins.
|
65
|
+
#
|
66
|
+
# |A paragraph that|
|
67
|
+
# |is justified.|
|
68
|
+
#
|
69
|
+
# |A paragraph that is|
|
70
|
+
# |justified. |
|
71
|
+
JUSTIFY = :justify
|
72
|
+
|
73
|
+
# When #hard_margins is enabled, a word that extends over the right
|
74
|
+
# margin will be split at the number of characters needed. This is
|
75
|
+
# similar to how characters wrap on a terminal. This is the default
|
76
|
+
# split mechanism when #hard_margins is enabled.
|
77
|
+
#
|
78
|
+
# repre
|
79
|
+
# senta
|
80
|
+
# tion
|
81
|
+
SPLIT_FIXED = 1
|
82
|
+
# When #hard_margins is enabled, a word that extends over the right
|
83
|
+
# margin will be split at one less than the number of characters needed
|
84
|
+
# with a C-style continuation character (\). If the word cannot be split
|
85
|
+
# using the rules of SPLIT_CONTINUATION, and the word will not fit
|
86
|
+
# wholly into the next line, then SPLIT_FIXED will be used.
|
87
|
+
#
|
88
|
+
# repr\
|
89
|
+
# esen\
|
90
|
+
# tati\
|
91
|
+
# on
|
92
|
+
SPLIT_CONTINUATION = 2
|
93
|
+
# When #hard_margins is enabled, a word that extends over the right
|
94
|
+
# margin will be split according to the hyphenator specified by the
|
95
|
+
# #hyphenator object; if there is no hyphenation library supplied, then
|
96
|
+
# the hyphenator of Text::Format itself is used, which is the same as
|
97
|
+
# SPLIT_CONTINUATION. See #hyphenator for more information about
|
98
|
+
# hyphenation libraries. The example below is valid with either
|
99
|
+
# TeX::Hyphen or Text::Hyphen. If the word cannot be split using the
|
100
|
+
# hyphenator's rules, and the word will not fit wholly into the next
|
101
|
+
# line, then SPLIT_FIXED will be used.
|
102
|
+
#
|
103
|
+
# rep-
|
104
|
+
# re-
|
105
|
+
# sen-
|
106
|
+
# ta-
|
107
|
+
# tion
|
108
|
+
#
|
109
|
+
SPLIT_HYPHENATION = 4
|
110
|
+
# When #hard_margins is enabled, a word that extends over the right
|
111
|
+
# margin will be split at one less than the number of characters needed
|
112
|
+
# with a C-style continuation character (\). If the word cannot be split
|
113
|
+
# using the rules of SPLIT_CONTINUATION, then SPLIT_FIXED will be used.
|
114
|
+
SPLIT_CONTINUATION_FIXED = SPLIT_CONTINUATION | SPLIT_FIXED
|
115
|
+
# When #hard_margins is enabled, a word that extends over the right
|
116
|
+
# margin will be split according to the hyphenator specified by the
|
117
|
+
# #hyphenator object; if there is no hyphenation library supplied, then
|
118
|
+
# the hyphenator of Text::Format itself is used, which is the same as
|
119
|
+
# SPLIT_CONTINUATION. See #hyphenator for more information about
|
120
|
+
# hyphenation libraries. The example below is valid with either
|
121
|
+
# TeX::Hyphen or Text::Hyphen. If the word cannot be split using the
|
122
|
+
# hyphenator's rules, then SPLIT_FIXED will be used.
|
123
|
+
SPLIT_HYPHENATION_FIXED = SPLIT_HYPHENATION | SPLIT_FIXED
|
124
|
+
# Attempts to split words according to the rules of the supplied
|
125
|
+
# hyphenator (e.g., SPLIT_HYPHENATION); if the word cannot be split
|
126
|
+
# using these rules, then the rules of SPLIT_CONTINUATION will be
|
127
|
+
# followed. In all cases, if the word cannot be split using either
|
128
|
+
# SPLIT_HYPHENATION or SPLIT_CONTINUATION, and the word will not fit
|
129
|
+
# wholly into the next line, then SPLIT_FIXED will be used.
|
130
|
+
SPLIT_HYPHENATION_CONTINUATION = SPLIT_HYPHENATION | SPLIT_CONTINUATION
|
131
|
+
# Attempts to split words according to the rules of the supplied
|
132
|
+
# hyphenator (e.g., SPLIT_HYPHENATION); if the word cannot be split
|
133
|
+
# using these rules, then the rules of SPLIT_CONTINUATION will be
|
134
|
+
# followed. In all cases, if the word cannot be split using either
|
135
|
+
# SPLIT_HYPHENATION or SPLIT_CONTINUATION, then SPLIT_FIXED will be
|
136
|
+
# used.
|
137
|
+
SPLIT_ALL = SPLIT_HYPHENATION | SPLIT_CONTINUATION | SPLIT_FIXED
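The SPLIT_* values above are single-bit flags combined with bitwise OR. A minimal sketch of that composition follows; the `SketchSplit` module and its `rules_in` helper are hypothetical and only reuse the numeric values shown above.

```ruby
# Hypothetical module reusing the same bit values as the SPLIT_* constants.
module SketchSplit
  FIXED        = 1
  CONTINUATION = 2
  HYPHENATION  = 4
  ALL          = HYPHENATION | CONTINUATION | FIXED

  # Names of the single-bit rules contained in a combined value.
  def self.rules_in(value)
    table = { :hyphenation => HYPHENATION,
              :continuation => CONTINUATION,
              :fixed => FIXED }
    table.select { |_name, bit| (value & bit) != 0 }.keys
  end
end
```

With these values every combination falls between 1 and 7, which is consistent with a simple range check on a split-rules setting.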
|
138
|
+
|
139
|
+
# Words forcibly split by Text::Format will be stored as split words.
|
140
|
+
# This class represents a word forcibly split.
|
141
|
+
class SplitWord
|
142
|
+
# The word that was split.
|
143
|
+
attr_reader :word
|
144
|
+
# The first part of the word that was split.
|
145
|
+
attr_reader :first
|
146
|
+
# The remainder of the word that was split.
|
147
|
+
attr_reader :rest
|
148
|
+
|
149
|
+
def initialize(word, first, rest)
|
150
|
+
@word = word
|
151
|
+
@first = first
|
152
|
+
@rest = rest
|
153
|
+
end
|
154
|
+
end
|
155
|
+
|
156
|
+
# Indicates punctuation characters that terminate a sentence, as some
|
157
|
+
# English typesetting rules indicate that sentences should be followed
|
158
|
+
# by two spaces. This is an archaic rule, but is supported with
|
159
|
+
# #extra_space. This is the default set of terminal punctuation
|
160
|
+
# characters. Additional terminal punctuation may be added to the
|
161
|
+
# formatting object through #terminal_punctuation.
|
162
|
+
TERMINAL_PUNCTUATION = %q(.?!)
|
163
|
+
# Indicates quote characters that may follow terminal punctuation under
|
164
|
+
# the current formatting rules. This satisfies the English formatting
|
165
|
+
# rule that indicates that sentences terminated inside of quotes should
|
166
|
+
# have the punctuation inside of the quoted text, not outside of the
|
167
|
+
# terminal quote. Additional terminal quotes may be added to the
|
168
|
+
# formatting object through #terminal_quotes. See TERMINAL_PUNCTUATION
|
169
|
+
# for more information.
|
170
|
+
TERMINAL_QUOTES = %q('")
|
171
|
+
|
172
|
+
# This method returns the regular expression used to detect the end of a
|
173
|
+
# sentence under the current definition of TERMINAL_PUNCTUATION,
|
174
|
+
# #terminal_punctuation, TERMINAL_QUOTES, and #terminal_quotes.
|
175
|
+
def __sentence_end_re
|
176
|
+
%r{[#{TERMINAL_PUNCTUATION}#{self.terminal_punctuation}][#{TERMINAL_QUOTES}#{self.terminal_quotes}]?$}
|
177
|
+
end
|
178
|
+
private :__sentence_end_re
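The sentence-end pattern built above is a character class of terminal punctuation, optionally followed by one terminal quote, anchored at the end of the line. A standalone sketch, using hypothetical `SKETCH_*` names and passing the extra characters as arguments instead of reading them from a formatter:

```ruby
SKETCH_PUNCT  = %q(.?!)
SKETCH_QUOTES = %q('")

# Hypothetical helper: builds the same shape of pattern from the defaults
# plus any caller-supplied extra characters.
def sketch_sentence_end_re(extra_punct = '', extra_quotes = '')
  %r{[#{SKETCH_PUNCT}#{extra_punct}][#{SKETCH_QUOTES}#{extra_quotes}]?$}
end
```

In this sketch the extra characters are interpolated into the character classes unescaped, so characters that are special inside a class (such as a closing bracket, a caret, or a hyphen) would need care.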
|
179
|
+
|
180
|
+
# Returns a regular expression for a set of characters (at least one
|
181
|
+
# non-whitespace followed by at least one space) of the specified size
|
182
|
+
# followed by one or more of any character.
|
183
|
+
RE_BREAK_SIZE = lambda { |size| %r[((?:\S+\s+){#{size}})(.+)] }
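To show what the RE_BREAK_SIZE lambda above produces, the sketch below copies it under a hypothetical name and reads the two capture groups explicitly from the match. `sketch_break` is an illustrative helper, not a library method.

```ruby
# Same lambda as RE_BREAK_SIZE, copied under a different name.
SKETCH_BREAK = lambda { |size| %r[((?:\S+\s+){#{size}})(.+)] }

# Hypothetical helper: splits a line after `size` words.
# Returns [head, tail], or nil when the line has too few words.
def sketch_break(line, size)
  m = SKETCH_BREAK.call(size).match(line)
  m ? [m[1], m[2]] : nil
end
```

The first group keeps the whitespace that follows the last counted word, so the head of the split ends with that whitespace.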
|
184
|
+
|
185
|
+
# Compares the formatting rules, excepting #hyphenator, of two
|
186
|
+
# Text::Format objects. Generated results (e.g., #split_words) are not
|
187
|
+
# compared.
|
188
|
+
def ==(o)
|
189
|
+
(@text == o.text) and
|
190
|
+
(@columns == o.columns) and
|
191
|
+
(@left_margin == o.left_margin) and
|
192
|
+
(@right_margin == o.right_margin) and
|
193
|
+
(@hard_margins == o.hard_margins) and
|
194
|
+
(@split_rules == o.split_rules) and
|
195
|
+
(@first_indent == o.first_indent) and
|
196
|
+
(@body_indent == o.body_indent) and
|
197
|
+
(@tag_text == o.tag_text) and
|
198
|
+
(@tabstop == o.tabstop) and
|
199
|
+
(@format_style == o.format_style) and
|
200
|
+
(@extra_space == o.extra_space) and
|
201
|
+
(@tag_paragraph == o.tag_paragraph) and
|
202
|
+
(@nobreak == o.nobreak) and
|
203
|
+
(@terminal_punctuation == o.terminal_punctuation) and
|
204
|
+
(@terminal_quotes == o.terminal_quotes) and
|
205
|
+
(@abbreviations == o.abbreviations) and
|
206
|
+
(@nobreak_regex == o.nobreak_regex)
|
207
|
+
end
|
208
|
+
|
209
|
+
# The default text to be manipulated. Note that this value is optional, but
|
210
|
+
# if the formatting functions are called without values, this text is
|
211
|
+
# what will be formatted.
|
212
|
+
#
|
213
|
+
# *Default*:: <tt>[]</tt>
|
214
|
+
# <b>Used in</b>:: All methods
|
215
|
+
attr_accessor :text
|
216
|
+
|
217
|
+
# The total width of the format area. The margins, indentation, and text
|
218
|
+
# are formatted into this space. Any value provided is silently
|
219
|
+
# converted to a positive integer.
|
220
|
+
#
|
221
|
+
# COLUMNS
|
222
|
+
# <-------------------------------------------------------------->
|
223
|
+
# <-----------><------><---------------------------><------------>
|
224
|
+
# left margin indent text is formatted into here right margin
|
225
|
+
#
|
226
|
+
# *Default*:: <tt>72</tt>
|
227
|
+
# <b>Used in</b>:: #format, #paragraphs, #center
|
228
|
+
attr_accessor :columns
|
229
|
+
def columns=(col) #:nodoc:
|
230
|
+
@columns = col.to_i.abs
|
231
|
+
end
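The `to_i.abs` coercion used by this writer (and by the other numeric writers) can be tried on its own. `sketch_width` is a hypothetical helper that applies the same expression.

```ruby
# Hypothetical helper applying the same coercion as the writer above.
def sketch_width(value)
  # to_i truncates floats and returns 0 for nil or non-numeric strings;
  # abs then drops the sign.
  value.to_i.abs
end
```

Strictly speaking the result is non-negative, not positive: +nil+ and strings without a leading number both coerce to 0.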
|
232
|
+
|
233
|
+
# The number of spaces used for the left margin. The value provided is
|
234
|
+
# silently converted to a positive integer value.
|
235
|
+
#
|
236
|
+
# columns
|
237
|
+
# <-------------------------------------------------------------->
|
238
|
+
# <-----------><------><---------------------------><------------>
|
239
|
+
# LEFT MARGIN indent text is formatted into here right margin
|
240
|
+
#
|
241
|
+
# *Default*:: <tt>0</tt>
|
242
|
+
# <b>Used in</b>:: #format, #paragraphs, #center
|
243
|
+
attr_accessor :left_margin
|
244
|
+
def left_margin=(left) #:nodoc:
|
245
|
+
@left_margin = left.to_i.abs
|
246
|
+
end
|
247
|
+
|
248
|
+
# The number of spaces used for the right margin. The value provided is
|
249
|
+
# silently converted to a positive integer value.
|
250
|
+
#
|
251
|
+
# columns
|
252
|
+
# <-------------------------------------------------------------->
|
253
|
+
# <-----------><------><---------------------------><------------>
|
254
|
+
# left margin indent text is formatted into here RIGHT MARGIN
|
255
|
+
#
|
256
|
+
# *Default*:: <tt>0</tt>
|
257
|
+
# <b>Used in</b>:: #format, #paragraphs, #center
|
258
|
+
attr_accessor :right_margin
|
259
|
+
def right_margin=(right) #:nodoc:
|
260
|
+
@right_margin = right.to_i.abs
|
261
|
+
end
|
262
|
+
|
263
|
+
# The number of spaces to indent the first line of a paragraph. The
|
264
|
+
# value provided is silently converted to a positive integer value.
|
265
|
+
#
|
266
|
+
# columns
|
267
|
+
# <-------------------------------------------------------------->
|
268
|
+
# <-----------><------><---------------------------><------------>
|
269
|
+
# left margin INDENT text is formatted into here right margin
|
270
|
+
#
|
271
|
+
# *Default*:: <tt>4</tt>
|
272
|
+
# <b>Used in</b>:: #format, #paragraphs
|
273
|
+
attr_accessor :first_indent
|
274
|
+
def first_indent=(first) #:nodoc:
|
275
|
+
@first_indent = first.to_i.abs
|
276
|
+
end
|
277
|
+
|
278
|
+
# The number of spaces to indent all lines after the first line of a
|
279
|
+
# paragraph. The value provided is silently converted to a positive
|
280
|
+
# integer value.
|
281
|
+
#
|
282
|
+
# columns
|
283
|
+
# <-------------------------------------------------------------->
|
284
|
+
# <-----------><------><---------------------------><------------>
|
285
|
+
# left margin INDENT text is formatted into here right margin
|
286
|
+
#
|
287
|
+
# *Default*:: <tt>0</tt>
|
288
|
+
# <b>Used in</b>:: #format, #paragraphs
|
289
|
+
attr_accessor :body_indent
|
290
|
+
def body_indent=(body) #:nodoc:
|
291
|
+
@body_indent = body.to_i.abs
|
292
|
+
end
|
293
|
+
|
294
|
+
# Normally, words larger than the format area will be placed on a line
|
295
|
+
# by themselves. Setting this value to +true+ will force words larger
|
296
|
+
# than the format area to be split into one or more "words" each at most
|
297
|
+
# the size of the format area. The first line and the original word will
|
298
|
+
# be placed into #split_words. Note that this will cause the output to
|
299
|
+
# look *similar* to a #format_style of JUSTIFY. (Lines will be filled as
|
300
|
+
# much as possible.)
|
301
|
+
#
|
302
|
+
# *Default*:: +false+
|
303
|
+
# <b>Used in</b>:: #format, #paragraphs
|
304
|
+
attr_accessor :hard_margins
|
305
|
+
|
306
|
+
# An array of words split during formatting if #hard_margins is set to
|
307
|
+
# +true+.
|
308
|
+
# #split_words << Text::Format::SplitWord.new(word, first, rest)
|
309
|
+
attr_reader :split_words
|
310
|
+
|
311
|
+
# The object responsible for hyphenating. It must respond to
|
312
|
+
# #hyphenate_to(word, size) or #hyphenate_to(word, size, formatter) and
|
313
|
+
# return an array of the word split into two parts (e.g., <tt>[part1,
|
314
|
+
# part2]</tt>); if there is a hyphenation mark to be applied,
|
315
|
+
# responsibility belongs to the hyphenator object. The size is the
|
316
|
+
# MAXIMUM size permitted, including any hyphenation marks.
|
317
|
+
#
|
318
|
+
# If the #hyphenate_to method has an arity of 3, the current formatter
|
319
|
+
# (+self+) will be provided to the method. This allows the hyphenator to
|
320
|
+
# make decisions about the hyphenation based on the formatting rules.
|
321
|
+
#
|
322
|
+
# #hyphenate_to should return <tt>[nil, word]</tt> if the word cannot be
|
323
|
+
# hyphenated.
|
324
|
+
#
|
325
|
+
# *Default*:: +self+ (SPLIT_CONTINUATION)
|
326
|
+
# <b>Used in</b>:: #format, #paragraphs
|
327
|
+
attr_accessor :hyphenator
|
328
|
+
def hyphenator=(h) #:nodoc:
|
329
|
+
h ||= self
|
330
|
+
|
331
|
+
raise ArgumentError, "#{h.inspect} is not a valid hyphenator." unless h.respond_to?(:hyphenate_to)
|
332
|
+
arity = h.method(:hyphenate_to).arity
|
333
|
+
raise ArgumentError, "#{h.inspect} must have exactly two or three arguments." unless arity.between?(2, 3)
|
334
|
+
|
335
|
+
@hyphenator = h
|
336
|
+
@hyphenator_arity = arity
|
337
|
+
end
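A minimal sketch of an object that would satisfy the two-argument hyphenator contract described above, together with the same kind of check the writer performs. `SketchHyphenator` and `sketch_valid_hyphenator?` are hypothetical names; the splitting rule is deliberately naive and ignores syllables.

```ruby
# Hypothetical hyphenator: cuts the word so that the first part, including
# the trailing "-", is exactly `size` characters long.
class SketchHyphenator
  def hyphenate_to(word, size)
    # This sketch reports "cannot hyphenate" when there is no room for a
    # letter plus the hyphen, or when the word already fits.
    return [nil, word] if size < 2 || word.length <= size
    [word[0, size - 1] + '-', word[size - 1..-1]]
  end
end

# Same test as the writer above: the method must exist and must take
# exactly two or three required arguments.
def sketch_valid_hyphenator?(h)
  h.respond_to?(:hyphenate_to) && h.method(:hyphenate_to).arity.between?(2, 3)
end
```

Note that Ruby reports a negative arity for methods with optional arguments, so a `hyphenate_to(word, size, formatter = nil)` definition would not pass this kind of check.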
|
338
|
+
|
339
|
+
# Specifies the split mode; used only when #hard_margins is set to
|
340
|
+
# +true+. Allowable values are:
|
341
|
+
#
|
342
|
+
# * +SPLIT_FIXED+
|
343
|
+
# * +SPLIT_CONTINUATION+
|
344
|
+
# * +SPLIT_HYPHENATION+
|
345
|
+
# * +SPLIT_CONTINUATION_FIXED+
|
346
|
+
# * +SPLIT_HYPHENATION_FIXED+
|
347
|
+
# * +SPLIT_HYPHENATION_CONTINUATION+
|
348
|
+
# * +SPLIT_ALL+
|
349
|
+
#
|
350
|
+
# *Default*:: <tt>Text::Format::SPLIT_FIXED</tt>
|
351
|
+
# <b>Used in</b>:: #format, #paragraphs
|
352
|
+
attr_accessor :split_rules
|
353
|
+
def split_rules=(s) #:nodoc:
|
354
|
+
raise ArgumentError, "Invalid value provided for #split_rules." if ((s < SPLIT_FIXED) or (s > SPLIT_ALL))
|
355
|
+
@split_rules = s
|
356
|
+
end
|
357
|
+
|
358
|
+
# Indicates whether sentence terminators should be followed by a single
|
359
|
+
# space (+false+), or two spaces (+true+). See #abbreviations for more
|
360
|
+
# information.
|
361
|
+
#
|
362
|
+
# *Default*:: +false+
|
363
|
+
# <b>Used in</b>:: #format, #paragraphs
|
364
|
+
attr_accessor :extra_space
|
365
|
+
|
366
|
+
# Defines the current abbreviations as an array. This is only used if
|
367
|
+
# extra_space is turned on.
|
368
|
+
#
|
369
|
+
# If one is abbreviating "President" as "Pres." (abbreviations =
|
370
|
+
# ["Pres"]), then the results of formatting will be as illustrated in
|
371
|
+
# the table below:
|
372
|
+
#
|
373
|
+
# abbreviations
|
374
|
+
# extra_space | #include?("Pres") | not #include?("Pres")
|
375
|
+
# ------------+-------------------+----------------------
|
376
|
+
# true | Pres. Lincoln | Pres.  Lincoln
|
377
|
+
# false | Pres. Lincoln | Pres. Lincoln
|
378
|
+
# ------------+-------------------+----------------------
|
379
|
+
# extra_space | #include?("Mrs") | not #include?("Mrs")
|
380
|
+
# true | Mrs. Lincoln | Mrs. Lincoln
|
381
|
+
# false | Mrs. Lincoln | Mrs. Lincoln
|
382
|
+
#
|
383
|
+
# Note that abbreviations should not have the terminal period as part of
|
384
|
+
# their definitions.
|
385
|
+
#
|
386
|
+
# This automatic abbreviation handling *will* cause some issues with
|
387
|
+
# uncommon sentence structures. The two sentences below will not be
|
388
|
+
# formatted correctly:
|
389
|
+
#
|
390
|
+
# You're in trouble now, Mr.
|
391
|
+
# Just wait until your father gets home.
|
392
|
+
#
|
393
|
+
# Under no circumstances (because Mr is a predefined abbreviation) will
|
394
|
+
# this ever be separated by two spaces.
|
395
|
+
#
|
396
|
+
# *Default*:: <tt>[]</tt>
|
397
|
+
# <b>Used in</b>:: #format, #paragraphs
|
398
|
+
attr_accessor :abbreviations
|
399
|
+
|
400
|
+
# Specifies additional punctuation characters that terminate a sentence,
|
401
|
+
# as some English typesetting rules indicate that sentences should be
|
402
|
+
# followed by two spaces. This is an archaic rule, but is supported with
|
403
|
+
# #extra_space. This is added to the default set of terminal punctuation
|
404
|
+
# defined in TERMINAL_PUNCTUATION.
|
405
|
+
#
|
406
|
+
# *Default*:: <tt>""</tt>
|
407
|
+
# <b>Used in</b>:: #format, #paragraphs
|
408
|
+
attr_accessor :terminal_punctuation
|
409
|
+
# Specifies additional quote characters that may follow
|
410
|
+
# terminal punctuation under the current formatting rules. This
|
411
|
+
# satisfies the English formatting rule that indicates that sentences
|
412
|
+
# terminated inside of quotes should have the punctuation inside of the
|
413
|
+
# quoted text, not outside of the terminal quote. This is added to the
|
414
|
+
# default set of terminal quotes defined in TERMINAL_QUOTES.
|
415
|
+
#
|
416
|
+
# *Default*:: <tt>""</tt>
|
417
|
+
# <b>Used in</b>:: #format, #paragraphs
|
418
|
+
attr_accessor :terminal_quotes
|
419
|
+
|
420
|
+
# Indicates whether the formatting of paragraphs should be done with
|
421
|
+
# tagged paragraphs. Useful only with #tag_text.
|
422
|
+
#
|
423
|
+
# *Default*:: +false+
|
424
|
+
# <b>Used in</b>:: #format, #paragraphs
|
425
|
+
attr_accessor :tag_paragraph
|
426
|
+
|
427
|
+
# The text to be placed before each paragraph when #tag_paragraph is
|
428
|
+
# +true+. When #format is called, only the first element (#tag_text[0])
|
429
|
+
# is used. When #paragraphs is called, then each successive element
|
430
|
+
# (#tag_text[n]) will be used once, with corresponding paragraphs. If
|
431
|
+
# the tag elements are exhausted before the text is exhausted, then the
|
432
|
+
# remaining paragraphs will not be tagged. Regardless of indentation
|
433
|
+
# settings, a blank line will be inserted between all paragraphs when
|
434
|
+
# #tag_paragraph is +true+.
|
435
|
+
#
|
436
|
+
# The Text::Format package provides three number generators,
|
437
|
+
# Text::Format::Alpha, Text::Format::Number, and Text::Format::Roman to
|
438
|
+
# assist with the numbering of paragraphs.
|
439
|
+
#
|
440
|
+
# *Default*:: <tt>[]</tt>
|
441
|
+
# <b>Used in</b>:: #format, #paragraphs
|
442
|
+
attr_accessor :tag_text
|
443
|
+
|
444
|
+
# Indicates whether or not the non-breaking space feature should be
|
445
|
+
# used.
|
446
|
+
#
|
447
|
+
# *Default*:: +false+
|
448
|
+
# <b>Used in</b>:: #format, #paragraphs
|
449
|
+
attr_accessor :nobreak
|
450
|
+
|
451
|
+
# A hash which holds the regular expressions on which spaces should not
|
452
|
+
# be broken. The hash is set up such that the key is the first word and
|
453
|
+
# the value is the second word.
|
454
|
+
#
|
455
|
+
# For example, if +nobreak_regex+ contains the following hash:
|
456
|
+
#
|
457
|
+
# { %r{Mrs?\.?} => %r{\S+}, %r{\S+} => %r{(?:[SJ])r\.?} }
|
458
|
+
#
|
459
|
+
# Then "Mr. Jones", "Mrs Jones", and "Jones Jr." would not be broken. If
|
460
|
+
# this simple matching algorithm indicates that there should not be a
|
461
|
+
# break at the current end of line, then a backtrack is done until there
|
462
|
+
# are two words on which line breaking is permitted. If two such words
|
463
|
+
# are not found, then the end of the line will be broken *regardless*.
|
464
|
+
# If there is a single word on the current line, then no backtrack is
|
465
|
+
# done and the word is stuck on the end.
|
466
|
+
#
|
467
|
+
# *Default*:: <tt>{}</tt>
|
468
|
+
# <b>Used in</b>:: #format, #paragraphs
|
469
|
+
attr_accessor :nobreak_regex
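The pair test described in the comment above can be sketched in isolation. The helper name `no_break_between?` is hypothetical and not part of Text::Format; it only mirrors the "key matches the first word, value matches the second word" check, using the example hash from the comment.

```ruby
# Hypothetical helper (not part of Text::Format): true when some
# key/value pair of the hash matches the two adjacent words.
def no_break_between?(nobreak_regex, first_word, second_word)
  nobreak_regex.any? do |first, second|
    first_word =~ first && second_word =~ second
  end
end

rules = { %r{Mrs?\.?} => %r{\S+}, %r{\S+} => %r{(?:[SJ])r\.?} }

puts no_break_between?(rules, "Mr.", "Jones")    # true
puts no_break_between?(rules, "Jones", "Jr.")    # true
puts no_break_between?(rules, "hello", "world")  # false
```

Note that the example patterns are unanchored, so they match anywhere inside a word; anchoring with \A and \z would be a stricter variant.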
|
470
|
+
|
471
|
+
# Indicates the number of spaces that a single tab represents. Any value
|
472
|
+
# provided is silently converted to a non-negative integer (<tt>to_i.abs</tt>).
|
473
|
+
#
|
474
|
+
# *Default*:: <tt>8</tt>
|
475
|
+
# <b>Used in</b>:: #expand, #unexpand,
|
476
|
+
# #paragraphs
|
477
|
+
attr_accessor :tabstop
|
478
|
+
def tabstop=(tabs) #:nodoc:
|
479
|
+
@tabstop = tabs.to_i.abs
|
480
|
+
end
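A small sketch of what the coercion in tabstop= does to a few inputs. The helper name `coerce_tabstop` is hypothetical; it only repeats the `to_i.abs` expression from the setter. Non-numeric strings come out as zero.

```ruby
# Hypothetical helper mirroring the body of tabstop=: to_i, then abs.
def coerce_tabstop(value)
  value.to_i.abs
end

[8, -4, "3", 2.9, "abc"].each do |value|
  puts "#{value.inspect} -> #{coerce_tabstop(value)}"
end
```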
|
481
|
+
|
482
|
+
# Specifies the format style. Allowable values are:
|
483
|
+
# * +LEFT_ALIGN+
|
484
|
+
# * +RIGHT_ALIGN+
|
485
|
+
# * +RIGHT_FILL+
|
486
|
+
# * +JUSTIFY+
|
487
|
+
#
|
488
|
+
# *Default*:: <tt>Text::Format::LEFT_ALIGN</tt>
|
489
|
+
# <b>Used in</b>:: #format, #paragraphs
|
490
|
+
attr_accessor :format_style
|
491
|
+
def format_style=(fs) #:nodoc:
|
492
|
+
raise ArgumentError, "Invalid value provided for format_style." unless [LEFT_ALIGN, RIGHT_ALIGN, RIGHT_FILL, JUSTIFY].include?(fs)
|
493
|
+
@format_style = fs
|
494
|
+
end
|
495
|
+
|
496
|
+
# Indicates that the format style is left alignment.
|
497
|
+
#
|
498
|
+
# *Default*:: +true+
|
499
|
+
# <b>Used in</b>:: #format, #paragraphs
|
500
|
+
def left_align?
|
501
|
+
@format_style == LEFT_ALIGN
|
502
|
+
end
|
503
|
+
|
504
|
+
# Indicates that the format style is right alignment.
|
505
|
+
#
|
506
|
+
# *Default*:: +false+
|
507
|
+
# <b>Used in</b>:: #format, #paragraphs
|
508
|
+
def right_align?
|
509
|
+
@format_style == RIGHT_ALIGN
|
510
|
+
end
|
511
|
+
|
512
|
+
# Indicates that the format style is right fill.
|
513
|
+
#
|
514
|
+
# *Default*:: +false+
|
515
|
+
# <b>Used in</b>:: #format, #paragraphs
|
516
|
+
def right_fill?
|
517
|
+
@format_style == RIGHT_FILL
|
518
|
+
end
|
519
|
+
|
520
|
+
# Indicates that the format style is full justification.
|
521
|
+
#
|
522
|
+
# *Default*:: +false+
|
523
|
+
# <b>Used in</b>:: #format, #paragraphs
|
524
|
+
def justify?
|
525
|
+
@format_style == JUSTIFY
|
526
|
+
end
|
527
|
+
|
528
|
+
# The formatting object itself can be used as a #hyphenator, where the
|
529
|
+
# default implementation of #hyphenate_to implements the conditions
|
530
|
+
# necessary to properly produce SPLIT_CONTINUATION.
|
531
|
+
def hyphenate_to(word, size)
|
532
|
+
if (size - 2) < 0
|
533
|
+
[nil, word]
|
534
|
+
else
|
535
|
+
[word[0 .. (size - 2)] + "\\", word[(size - 1) .. -1]]
|
536
|
+
end
|
537
|
+
end
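To make the return values concrete, here is a standalone copy of the method body above, defined as a free function purely for illustration. The first element is the fragment that stays on the line (ending in a backslash), and it is nil when the requested size is too small to hold one character plus the backslash.

```ruby
# Standalone copy of the default continuation split shown above.
def hyphenate_to(word, size)
  if (size - 2) < 0
    [nil, word]
  else
    [word[0 .. (size - 2)] + "\\", word[(size - 1) .. -1]]
  end
end

p hyphenate_to("formatting", 5)  # first part is "form" plus a backslash
p hyphenate_to("formatting", 1)  # cannot split: [nil, "formatting"]
```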
|
538
|
+
|
539
|
+
# Splits the provided word so that it is in two parts, <tt>word[0 ..
|
540
|
+
# (size - 1)]</tt> and <tt>word[size .. -1]</tt>.
|
541
|
+
def split_word_to(word, size)
|
542
|
+
[word[0 .. (size - 1)], word[size .. -1]]
|
543
|
+
end
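The fixed split can be tried the same way. This is again a standalone copy of the method body, shown outside the class only as a sketch.

```ruby
# Standalone copy of the fixed-width split shown above.
def split_word_to(word, size)
  [word[0 .. (size - 1)], word[size .. -1]]
end

p split_word_to("formatting", 4)   # ["form", "atting"]
p split_word_to("formatting", 10)  # ["formatting", ""]
```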
|
544
|
+
|
545
|
+
# Formats text into a nice paragraph format. The text is separated into
|
546
|
+
# words and then reassembled a word at a time using the settings of this
|
547
|
+
# Format object.
|
548
|
+
#
|
549
|
+
# If +text+ is +nil+, then the value of #text will be worked on.
|
550
|
+
def format_one_paragraph(text = nil)
|
551
|
+
text ||= @text
|
552
|
+
text = text[0] if text.kind_of?(Array)
|
553
|
+
|
554
|
+
# Convert the provided paragraph to a list of words.
|
555
|
+
words = text.split(SPACES_RE).reverse.reject { |ww| ww.nil? or ww.empty? }
|
556
|
+
|
557
|
+
text = []
|
558
|
+
|
559
|
+
# Find the maximum line width and the initial indent string.
|
560
|
+
# TODO 20050114 - allow the left and right margins to be specified as
|
561
|
+
# strings. If they are strings, then we need to use the sizes of the
|
562
|
+
# strings. Also: allow the indent string to be set manually and
|
563
|
+
# indicate whether the indent string will have a following space.
|
564
|
+
max_line_width = @columns - @first_indent - @left_margin - @right_margin
|
565
|
+
indent_str = ' ' * @first_indent
|
566
|
+
|
567
|
+
first_line = true
|
568
|
+
|
569
|
+
if words.empty?
|
570
|
+
line = []
|
571
|
+
line_size = 0
|
572
|
+
extra_space = false
|
573
|
+
else
|
574
|
+
line = [ words.pop ]
|
575
|
+
line_size = line[-1].size
|
576
|
+
extra_space = __add_extra_space?(line[-1])
|
577
|
+
end
|
578
|
+
|
579
|
+
while next_word = words.pop
|
580
|
+
next_word.strip! unless next_word.nil?
|
581
|
+
new_line_size = (next_word.size + line_size) + 1
|
582
|
+
|
583
|
+
if extra_space
|
584
|
+
if (line[-1] !~ __sentence_end_re)
|
585
|
+
extra_space = false
|
586
|
+
end
|
587
|
+
end
|
588
|
+
|
589
|
+
# Increase the width of the new line if there's a sentence
|
590
|
+
# terminator and we are applying extra_space.
|
591
|
+
new_line_size += 1 if extra_space
|
592
|
+
|
593
|
+
# Will the word fit onto the current line? If so, simply append it
|
594
|
+
# to the end of the line.
|
595
|
+
|
596
|
+
if new_line_size <= max_line_width
|
597
|
+
if line.empty?
|
598
|
+
line << next_word
|
599
|
+
else
|
600
|
+
if extra_space
|
601
|
+
line << "  #{next_word}"
|
602
|
+
else
|
603
|
+
line << " #{next_word}"
|
604
|
+
end
|
605
|
+
end
|
606
|
+
else
|
607
|
+
# Forcibly wrap the line if nonbreaking spaces are turned on and
|
608
|
+
# there is a condition where words must be wrapped. If we have
|
609
|
+
# returned more than one word, readjust the word list.
|
610
|
+
line, next_word = __wrap_line(line, next_word) if @nobreak
|
611
|
+
if next_word.kind_of?(Array)
|
612
|
+
if next_word.size > 1
|
613
|
+
words.push(*(next_word.reverse))
|
614
|
+
next_word = words.pop
|
615
|
+
else
|
616
|
+
next_word = next_word[0]
|
617
|
+
end
|
618
|
+
next_word.strip! unless next_word.nil?
|
619
|
+
end
|
620
|
+
|
621
|
+
# Check to see if the line needs to be hyphenated. If a word has a
|
622
|
+
# hyphen in it (e.g., "fixed-width"), then we can ALWAYS wrap at
|
623
|
+
# that hyphenation, even if #hard_margins is not turned on. More
|
624
|
+
# elaborate forms of hyphenation will only be performed if
|
625
|
+
# #hard_margins is turned on. If we have returned more than one
|
626
|
+
# word, readjust the word list.
|
627
|
+
line, new_line_size, next_word = __hyphenate(line, line_size, next_word, max_line_width)
|
628
|
+
if next_word.kind_of?(Array)
|
629
|
+
if next_word.size > 1
|
630
|
+
words.push(*(next_word.reverse))
|
631
|
+
next_word = words.pop
|
632
|
+
else
|
633
|
+
next_word = next_word[0]
|
634
|
+
end
|
635
|
+
next_word.strip! unless next_word.nil?
|
636
|
+
end
|
637
|
+
|
638
|
+
text << __make_line(line, indent_str, max_line_width, next_word.nil?) unless line.nil?
|
639
|
+
|
640
|
+
if first_line
|
641
|
+
first_line = false
|
642
|
+
max_line_width = @columns - @body_indent - @left_margin - @right_margin
|
643
|
+
indent_str = ' ' * @body_indent
|
644
|
+
end
|
645
|
+
|
646
|
+
if next_word.nil?
|
647
|
+
line = []
|
648
|
+
new_line_size = 0
|
649
|
+
else
|
650
|
+
line = [ next_word ]
|
651
|
+
new_line_size = next_word.size
|
652
|
+
end
|
653
|
+
end
|
654
|
+
|
655
|
+
line_size = new_line_size
|
656
|
+
extra_space = __add_extra_space?(next_word) unless next_word.nil?
|
657
|
+
end
|
658
|
+
|
659
|
+
loop do
|
660
|
+
break if line.nil? or line.empty?
|
661
|
+
line, line_size, ww = __hyphenate(line, line_size, ww, max_line_width)#if @hard_margins
|
662
|
+
text << __make_line(line, indent_str, max_line_width, ww.nil?)
|
663
|
+
line = ww
|
664
|
+
ww = nil
|
665
|
+
end
|
666
|
+
|
667
|
+
if (@tag_paragraph and (not text.empty?))
|
668
|
+
if @tag_cur.nil? or @tag_cur.empty?
|
669
|
+
@tag_cur = @tag_text[0]
|
670
|
+
end
|
671
|
+
|
672
|
+
fchar = /(\S)/o.match(text[0])[1]
|
673
|
+
white = text[0].index(fchar)
|
674
|
+
|
675
|
+
unless @tag_cur.nil?
|
676
|
+
if ((white - @left_margin - 1) > @tag_cur.size) then
|
677
|
+
white = @tag_cur.size + @left_margin
|
678
|
+
text[0].gsub!(/^ {#{white}}/, "#{' ' * @left_margin}#{@tag_cur}")
|
679
|
+
else
|
680
|
+
text.unshift("#{' ' * @left_margin}#{@tag_cur}\n")
|
681
|
+
end
|
682
|
+
end
|
683
|
+
end
|
684
|
+
|
685
|
+
text.join('')
|
686
|
+
end
|
687
|
+
alias format format_one_paragraph
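The two width computations in the method above can be checked with the default values from the constructor (72 columns, first indent 4, body indent 0, both margins 0). The helper name `line_widths` is hypothetical and used only for this sketch.

```ruby
# Hypothetical helper: usable width of the first line and of body lines,
# using the same arithmetic as format_one_paragraph.
def line_widths(columns, first_indent, body_indent, left_margin, right_margin)
  first = columns - first_indent - left_margin - right_margin
  body  = columns - body_indent - left_margin - right_margin
  [first, body]
end

p line_widths(72, 4, 0, 0, 0)  # [68, 72]
p line_widths(40, 4, 2, 3, 3)  # [30, 32]
```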
|
688
|
+
|
689
|
+
# Considers each element of text (provided or internal) as a paragraph.
|
690
|
+
# If #first_indent is the same as #body_indent, then paragraphs will be
|
691
|
+
# separated by a single empty line in the result; otherwise, the
|
692
|
+
# paragraphs will follow immediately after each other. Uses #format to
|
693
|
+
# do the heavy lifting.
|
694
|
+
#
|
695
|
+
# If +to_wrap+ responds to #split, then it will be split into an array
|
696
|
+
# of elements by calling #split with the value of +split_on+. The
|
697
|
+
# default value of split_on is $/, or the default record separator,
|
698
|
+
# repeated twice (e.g., /\n\n/).
|
699
|
+
def paragraphs(to_wrap = nil, split_on = /(#{$/}){2}/o)
|
700
|
+
to_wrap = @text if to_wrap.nil?
|
701
|
+
if to_wrap.respond_to?(:split)
|
702
|
+
to_wrap = to_wrap.split(split_on)
|
703
|
+
else
|
704
|
+
to_wrap = [to_wrap].flatten
|
705
|
+
end
|
706
|
+
|
707
|
+
if ((@first_indent == @body_indent) or @tag_paragraph) then
|
708
|
+
p_end = NEWLINE
|
709
|
+
else
|
710
|
+
p_end = ''
|
711
|
+
end
|
712
|
+
|
713
|
+
cnt = 0
|
714
|
+
ret = []
|
715
|
+
to_wrap.each do |tw|
|
716
|
+
@tag_cur = @tag_text[cnt] if @tag_paragraph
|
717
|
+
@tag_cur = '' if @tag_cur.nil?
|
718
|
+
line = format(tw)
|
719
|
+
ret << "#{line}#{p_end}" if (not line.nil?) and (line.size > 0)
|
720
|
+
cnt += 1
|
721
|
+
end
|
722
|
+
|
723
|
+
ret[-1].chomp! unless ret.empty?
|
724
|
+
ret.join('')
|
725
|
+
end
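One detail of the default +split_on+ is worth seeing in isolation: because the pattern contains a capture group, String#split also returns the captured separator as its own element. The sketch below assumes $/ has its default value of "\n" and compares the grouped pattern with a non-capturing variant; the variable names are illustrative only.

```ruby
text = "First paragraph.\n\nSecond paragraph."
sep  = "\n"  # assumption: $/ is left at its default

with_group    = text.split(/(#{sep}){2}/)
without_group = text.split(/(?:#{sep}){2}/)

p with_group     # ["First paragraph.", "\n", "Second paragraph."]
p without_group  # ["First paragraph.", "Second paragraph."]
```

In #paragraphs the extra whitespace-only element produces no output line, but it is still visited by the loop, so +cnt+ advances for it as well.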
|
726
|
+
|
727
|
+
# Centers the text, preserving empty lines and tabs.
|
728
|
+
def center(to_center = nil)
|
729
|
+
to_center = @text if to_center.nil?
|
730
|
+
to_center = [to_center].flatten
|
731
|
+
|
732
|
+
tabs = 0
|
733
|
+
width = @columns - @left_margin - @right_margin
|
734
|
+
centered = []
|
735
|
+
to_center.each do |tc|
|
736
|
+
s = tc.strip
|
737
|
+
tabs = s.count(TAB)
|
738
|
+
tabs = 0 if tabs.nil?
|
739
|
+
ct = ((width - s.size - (tabs * @tabstop) + tabs) / 2)
|
740
|
+
ct = (width - @left_margin - @right_margin) - ct
|
741
|
+
centered << "#{s.rjust(ct)}\n"
|
742
|
+
end
|
743
|
+
centered.join('')
|
744
|
+
end
|
745
|
+
|
746
|
+
# Replaces all tab characters in the text with #tabstop spaces.
|
747
|
+
def expand(to_expand = nil)
|
748
|
+
to_expand = @text if to_expand.nil?
|
749
|
+
|
750
|
+
tmp = ' ' * @tabstop
|
751
|
+
changer = lambda do |text|
|
752
|
+
res = text.split(NEWLINE_RE)
|
753
|
+
res.collect! { |ln| ln.gsub(/\t/o, tmp) }
|
754
|
+
res.join(NEWLINE)
|
755
|
+
end
|
756
|
+
|
757
|
+
if to_expand.kind_of?(Array)
|
758
|
+
to_expand.collect { |te| changer[te] }
|
759
|
+
else
|
760
|
+
changer[to_expand]
|
761
|
+
end
|
762
|
+
end
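When mapping over lines for tab expansion, it matters that String#gsub! returns nil if it made no substitution, while String#gsub always returns a string. A minimal sketch of the difference, independent of Text::Format; the helper name `expand_tabs` and the tab width are assumptions for the example.

```ruby
# Hypothetical helper: replace each tab with tab_width spaces, line by line.
def expand_tabs(lines, tab_width)
  filler = ' ' * tab_width
  lines.map { |ln| ln.gsub(/\t/, filler) }
end

lines = ["\tindented", "plain"]

p expand_tabs(lines, 4)                         # ["    indented", "plain"]
p lines.map { |ln| ln.dup.gsub!(/\t/, "    ") } # ["    indented", nil]
```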
|
763
|
+
|
764
|
+
# Replaces all occurrences of #tabstop consecutive spaces with a tab
|
765
|
+
# character.
|
766
|
+
def unexpand(to_unexpand = nil)
|
767
|
+
to_unexpand = @text if to_unexpand.nil?
|
768
|
+
|
769
|
+
tmp = / {#{@tabstop}}/
|
770
|
+
changer = lambda do |text|
|
771
|
+
res = text.split(NEWLINE_RE)
|
772
|
+
res.collect! { |ln| ln.gsub(tmp, TAB) }
|
773
|
+
res.join(NEWLINE)
|
774
|
+
end
|
775
|
+
|
776
|
+
if to_unexpand.kind_of?(Array)
|
777
|
+
to_unexpand.collect { |tu| changer[tu] }
|
778
|
+
else
|
779
|
+
changer[to_unexpand]
|
780
|
+
end
|
781
|
+
end
|
782
|
+
|
783
|
+
# Return +true+ if the word may have an extra space added after it. This
|
784
|
+
# will only be the case if #extra_space is +true+ and the word is not an
|
785
|
+
# abbreviation.
|
786
|
+
def __add_extra_space?(word)
|
787
|
+
return false unless @extra_space
|
788
|
+
word = word.gsub(/\.$/o, '') unless word.nil?
|
789
|
+
return false if ABBREV.include?(word)
|
790
|
+
return false if @abbreviations.include?(word)
|
791
|
+
true
|
792
|
+
end
|
793
|
+
private :__add_extra_space?
|
794
|
+
|
795
|
+
def __make_line(line, indent, width, last = false) #:nodoc:
|
796
|
+
line_size = line.inject(0) { |ls, el| ls + el.size }
|
797
|
+
lmargin = " " * @left_margin
|
798
|
+
fill = " " * (width - line_size) if right_fill? and (line_size <= width)
|
799
|
+
|
800
|
+
unless last
|
801
|
+
if justify? and (line.size > 1)
|
802
|
+
spaces = width - line_size
|
803
|
+
word_spaces = spaces / (line.size / 2)
|
804
|
+
spaces = spaces % (line.size / 2) if word_spaces > 0
|
805
|
+
line.reverse.each do |word|
|
806
|
+
next if (word =~ /^\S/o)
|
807
|
+
|
808
|
+
word.sub!(/^/o, " " * word_spaces)
|
809
|
+
|
810
|
+
next unless (spaces > 0)
|
811
|
+
|
812
|
+
word.sub!(/^/o, " ")
|
813
|
+
spaces -= 1
|
814
|
+
end
|
815
|
+
end
|
816
|
+
end
|
817
|
+
|
818
|
+
line = "#{lmargin}#{indent}#{line.join('')}#{fill}\n" unless line.empty?
|
819
|
+
|
820
|
+
if right_align? and (not line.nil?)
|
821
|
+
line.sub(/^/o, " " * (@columns - @right_margin - (line.size - 1)))
|
822
|
+
else
|
823
|
+
line
|
824
|
+
end
|
825
|
+
end
|
826
|
+
# private :__make_line
|
827
|
+
|
828
|
+
def __hyphenate(line, line_size, next_word, width) #:nodoc:
|
829
|
+
return [ line, line_size, next_word ] if line.nil? or line.empty?
|
830
|
+
rline = line.dup
|
831
|
+
rsize = line_size
|
832
|
+
|
833
|
+
rnext = []
|
834
|
+
rnext << next_word.dup unless next_word.nil?
|
835
|
+
|
836
|
+
loop do
|
837
|
+
break if rnext.nil? or rline.nil?
|
838
|
+
|
839
|
+
if rsize == width
|
840
|
+
break
|
841
|
+
elsif rsize > width
|
842
|
+
word = rline.pop
|
843
|
+
size = width - rsize + word.size
|
844
|
+
|
845
|
+
if (size < 1)
|
846
|
+
rnext.unshift word
|
847
|
+
next
|
848
|
+
end
|
849
|
+
|
850
|
+
first = rest = nil
|
851
|
+
|
852
|
+
# TODO: Add the check to see if the word contains a hyphen to
|
853
|
+
# split on automatically.
|
854
|
+
# Does the word already have a hyphen in it? If so, try to use
|
855
|
+
# that to split the word.
|
856
|
+
# if word.index('-') < size
|
857
|
+
# first = word[0 ... word.index("-")]
|
858
|
+
# rest = word[word.index("-") .. -1]
|
859
|
+
# end
|
860
|
+
|
861
|
+
if @hard_margins
|
862
|
+
if first.nil? and (@split_rules & SPLIT_HYPHENATION) == SPLIT_HYPHENATION
|
863
|
+
if @hyphenator_arity == 2
|
864
|
+
first, rest = @hyphenator.hyphenate_to(word, size)
|
865
|
+
else
|
866
|
+
first, rest = @hyphenator.hyphenate_to(word, size, self)
|
867
|
+
end
|
868
|
+
end
|
869
|
+
|
870
|
+
if first.nil? and (@split_rules & SPLIT_CONTINUATION) == SPLIT_CONTINUATION
|
871
|
+
first, rest = self.hyphenate_to(word, size)
|
872
|
+
end
|
873
|
+
|
874
|
+
if first.nil?
|
875
|
+
if (@split_rules & SPLIT_FIXED) == SPLIT_FIXED
|
876
|
+
first, rest = split_word_to(word, size)
|
877
|
+
elsif (not rest.nil? and (rest.size > size))
|
878
|
+
first, rest = split_word_to(word, size)
|
879
|
+
end
|
880
|
+
end
|
881
|
+
else
|
882
|
+
first = word if first.nil?
|
883
|
+
end
|
884
|
+
|
885
|
+
if first.nil?
|
886
|
+
rest = word
|
887
|
+
else
|
888
|
+
rsize = rsize - word.size + first.size
|
889
|
+
if rline.empty?
|
890
|
+
rline << first
|
891
|
+
else
|
892
|
+
rsize += 1
|
893
|
+
rline << " #{first}"
|
894
|
+
end
|
895
|
+
@split_words << SplitWord.new(word, first, rest)
|
896
|
+
end
|
897
|
+
rnext.unshift rest unless rest.nil?
|
898
|
+
break
|
899
|
+
else
|
900
|
+
break if rnext.empty?
|
901
|
+
word = rnext.shift.dup
|
902
|
+
size = width - rsize - 1
|
903
|
+
|
904
|
+
if (size <= 0)
|
905
|
+
rnext.unshift word
|
906
|
+
break
|
907
|
+
end
|
908
|
+
|
909
|
+
first = rest = nil
|
910
|
+
|
911
|
+
# TODO: Add the check to see if the word contains a hyphen to
|
912
|
+
# split on automatically.
|
913
|
+
# Does the word already have a hyphen in it? If so, try to use
|
914
|
+
# that to split the word.
|
915
|
+
# if word.index('-') < size
|
916
|
+
# first = word[0 ... word.index("-")]
|
917
|
+
# rest = word[word.index("-") .. -1]
|
918
|
+
# end
|
919
|
+
|
920
|
+
if @hard_margins
|
921
|
+
if (@split_rules & SPLIT_HYPHENATION) == SPLIT_HYPHENATION
|
922
|
+
if @hyphenator_arity == 2
|
923
|
+
first, rest = @hyphenator.hyphenate_to(word, size)
|
924
|
+
else
|
925
|
+
first, rest = @hyphenator.hyphenate_to(word, size, self)
|
926
|
+
end
|
927
|
+
end
|
928
|
+
|
929
|
+
if first.nil? and (@split_rules & SPLIT_CONTINUATION) == SPLIT_CONTINUATION
|
930
|
+
first, rest = self.hyphenate_to(word, size)
|
931
|
+
end
|
932
|
+
|
933
|
+
if first.nil?
|
934
|
+
if (@split_rules & SPLIT_FIXED) == SPLIT_FIXED
|
935
|
+
first, rest = split_word_to(word, size)
|
936
|
+
elsif (not rest.nil? and (rest.size > width))
|
937
|
+
first, rest = split_word_to(word, size)
|
938
|
+
end
|
939
|
+
end
|
940
|
+
else
|
941
|
+
first = word if first.nil?
|
942
|
+
end
|
943
|
+
|
944
|
+
# The word was successfully split. Does it fit?
|
945
|
+
unless first.nil?
|
946
|
+
if (rsize + first.size) < width
|
947
|
+
@split_words << SplitWord.new(word, first, rest)
|
948
|
+
|
949
|
+
rsize += first.size + 1
|
950
|
+
rline << " #{first}"
|
951
|
+
else
|
952
|
+
rest = word
|
953
|
+
end
|
954
|
+
else
|
955
|
+
rest = word unless rest.nil?
|
956
|
+
end
|
957
|
+
|
958
|
+
rnext.unshift rest
|
959
|
+
break
|
960
|
+
end
|
961
|
+
end
|
962
|
+
[ rline, rsize, rnext ]
|
963
|
+
end
|
964
|
+
private :__hyphenate
|
965
|
+
|
966
|
+
# The line must be broken. Typically, this is done by moving the last
|
967
|
+
# word on the current line to the next line. However, it may be possible
|
968
|
+
# that certain combinations of words may not be broken (see
|
969
|
+
# #nobreak_regex for more information). Therefore, it may be necessary
|
970
|
+
# to move multiple words from the current line to the next line. This
|
971
|
+
# function does this.
|
972
|
+
def __wrap_line(line, next_word)
|
973
|
+
no_break = false
|
974
|
+
|
975
|
+
word_index = line.size - 1
|
976
|
+
|
977
|
+
@nobreak_regex.each_pair do |first, second|
|
978
|
+
if line[word_index] =~ first and next_word =~ second
|
979
|
+
no_break = true
|
980
|
+
end
|
981
|
+
end
|
982
|
+
|
983
|
+
# If the last word and the next word aren't to be broken, and the line
|
984
|
+
# has more than one word in it, then we need to go back by words to
|
985
|
+
# ensure that we break as allowed.
|
986
|
+
if no_break and word_index.nonzero?
|
987
|
+
word_index -= 1
|
988
|
+
|
989
|
+
while word_index.nonzero?
|
990
|
+
no_break = false
|
991
|
+
@nobreak_regex.each_pair { |first, second|
|
992
|
+
if line[word_index] =~ first and line[word_index + 1] =~ second
|
993
|
+
no_break = true
|
994
|
+
end
|
995
|
+
}
|
996
|
+
|
997
|
+
break unless no_break
|
998
|
+
word_index -= 1
|
999
|
+
end
|
1000
|
+
|
1001
|
+
if word_index.nonzero?
|
1002
|
+
words = line.slice!(word_index .. -1)
|
1003
|
+
words << next_word
|
1004
|
+
end
|
1005
|
+
end
|
1006
|
+
|
1007
|
+
[line, words || next_word]
|
1008
|
+
end
|
1009
|
+
private :__wrap_line
|
1010
|
+
|
1011
|
+
# Create a Text::Format object. Accepts an optional hash of construction
|
1012
|
+
# options (this will be changed to named parameters in Ruby 2.0). After
|
1013
|
+
# the initial object is constructed (with either the provided or default
|
1014
|
+
# values), the object will be yielded (as +self+) to an optional block
|
1015
|
+
# for further construction and operation.
|
1016
|
+
def initialize(options = {}) #:yields self:
|
1017
|
+
@text = options[:text] || []
|
1018
|
+
@columns = options[:columns] || 72
|
1019
|
+
@tabstop = options[:tabstop] || 8
|
1020
|
+
@first_indent = options[:first_indent] || 4
|
1021
|
+
@body_indent = options[:body_indent] || 0
|
1022
|
+
@format_style = options[:format_style] || LEFT_ALIGN
|
1023
|
+
@left_margin = options[:left_margin] || 0
|
1024
|
+
@right_margin = options[:right_margin] || 0
|
1025
|
+
@extra_space = options[:extra_space] || false
|
1026
|
+
@tag_paragraph = options[:tag_paragraph] || false
|
1027
|
+
@tag_text = options[:tag_text] || []
|
1028
|
+
@abbreviations = options[:abbreviations] || []
|
1029
|
+
@terminal_punctuation = options[:terminal_punctuation] || ""
|
1030
|
+
@terminal_quotes = options[:terminal_quotes] || ""
|
1031
|
+
@nobreak = options[:nobreak] || false
|
1032
|
+
@nobreak_regex = options[:nobreak_regex] || {}
|
1033
|
+
@hard_margins = options[:hard_margins] || false
|
1034
|
+
@split_rules = options[:split_rules] || SPLIT_FIXED
|
1035
|
+
@hyphenator = options[:hyphenator] || self
|
1036
|
+
|
1037
|
+
@hyphenator_arity = @hyphenator.method(:hyphenate_to).arity
|
1038
|
+
@tag_cur = ""
|
1039
|
+
@split_words = []
|
1040
|
+
|
1041
|
+
yield self if block_given?
|
1042
|
+
end
|
1043
|
+
end
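The construction pattern used by the initializer (an options hash with `||` defaults, followed by yielding +self+ to an optional block) can be sketched with a small stand-in class. `MiniFormat` is hypothetical and carries only two of the settings; it is not the Text::Format API.

```ruby
# Hypothetical stand-in class; only the construction pattern mirrors
# Text::Format#initialize.
class MiniFormat
  attr_accessor :columns, :first_indent

  def initialize(options = {})
    @columns      = options[:columns]      || 72
    @first_indent = options[:first_indent] || 4

    # Hand the new object to the caller's block for further setup.
    yield self if block_given?
  end
end

fmt = MiniFormat.new(:columns => 40) { |f| f.first_indent = 0 }
puts fmt.columns       # 40
puts fmt.first_indent  # 0
```

Because the defaults are applied with `||`, an explicit +nil+ or +false+ in the hash falls back to the default; that is harmless here because the boolean options default to +false+.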
|