
plain_text 0.6 → 0.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

checksums.yaml +4 -4
data/ChangeLog +13 -0
data/Makefile +1 -1
data/README.en.rdoc +19 -19
data/bin/yard2md_afterclean +14 -3
data/lib/plain_text/parse_rule.rb +3 -3
data/lib/plain_text/part.rb +7 -8
data/lib/plain_text/split.rb +9 -8
data/lib/plain_text/util.rb +6 -6
data/lib/plain_text.rb +34 -34
data/plain_text.gemspec +3 -3
data/test/testyard2md_afterclean.rb +38 -2
metadata +12 -12

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: daeda100de65b1faf0ed6c93ba5cb698761d9ac6bd0c9e7ed896832ce04598e7
-  data.tar.gz: 7d2edcae09978ed8d1630e9526d4ef2611a96da0ecd8c1d859936dd266f78619
+  metadata.gz: 21f798fe1e22424b48114466382f56a8c27a065ee63d6e9c68c98f5c7e505f14
+  data.tar.gz: 0dde006503a336e1e96960dedd7e04c09ea88723495df93ca11bfa177d0f390e
 SHA512:
-  metadata.gz: d2ffb6c2482e623dfa077c893417e64648be555c8eb8f4fc141f8a0b108839e6e041d15527d5564b2339087edc26d49eef9e40828c00233369b560f12c1c70db
-  data.tar.gz: 164757b808185ebff8ab4e24bbef211c1afb46634e6156cc27bd237bb301bfc3323f2b8ea988721720e0c6b3f83ec560112f1324ebf53ab8fdc314821e3f7d9a
+  metadata.gz: 30a1f8819371a6b2204df7e47b671c95b227fe14d8b134373c3d0768e35e89bd0c3386707cebbc499aab026631aa0b6fb3838112f198a46ae35328e19ab66eec
+  data.tar.gz: baca6464f9e66e01154fe72c7fee869ae8218f4a65f46679b2d9f934780eeef0f893f7d01c3a417a9beb007c390ce0b6c7f3aa40b9c49130ca4dcc5a40c2ba4e
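
The SHA256 entries above can be checked against a locally unpacked gem. A minimal sketch, assuming the `.gem` archive has already been extracted so that `data.tar.gz` is available on disk; the helper name `digest_matches?` is hypothetical and not part of RubyGems:

```ruby
require 'digest'

# Hypothetical helper: true if the SHA256 hex digest of the given bytes
# equals the expected value (e.g. one recorded in checksums.yaml).
def digest_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.strip.downcase
end

# Usage sketch (assumes data.tar.gz from the 0.7 gem is in the current directory):
#   expected = "0dde006503a336e1e96960dedd7e04c09ea88723495df93ca11bfa177d0f390e"
#   puts digest_matches?(File.binread("data.tar.gz"), expected)
```

Reading the file with `File.binread` avoids any newline or encoding translation before hashing.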

data/ChangeLog CHANGED

@@ -1,3 +1,16 @@
+-----
+(Version: 0.7)
+2022-08-25  Masa Sakano
+  * fixed many yard-doc warnings.
+-----
+2022-08-25  Masa Sakano
+  * Now auto-judges languages. Fixed a bug of chopping some tails.
+-----
+2019-11-07  Masa Sakano
+  * Modified .github/README.md
 -----
 (Version: 0.6)
 2019-11-07  Masa Sakano
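
The "chopping some tails" entry plausibly refers to the `chop` → `chomp` change in `fix_def_list` of `data/bin/yard2md_afterclean` in this same diff; that link is an inference, not something the ChangeLog states. The difference between the two `String` methods can be sketched as follows (the sample strings are made up):

```ruby
# String#chop always removes the last character.
# String#chomp removes only a trailing line terminator, if there is one.
with_newline    = "definition text\n"
without_newline = "definition text"

p with_newline.chop       # => "definition text"
p with_newline.chomp      # => "definition text"
p without_newline.chop    # => "definition tex"   (a real character is lost)
p without_newline.chomp   # => "definition text"  (unchanged)
```

Because the regular expression in `fix_def_list` lets the last line end at `\z` without a newline, `chop` could cut a real character there, whereas `chomp` leaves such a line intact.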

data/Makefile CHANGED

@@ -20,5 +20,5 @@ test:
 ## yard2md_afterclean in Gem plain_text https://rubygems.org/gems/plain_text
 doc:
-	yard doc; [[ -x ".github" && ( "README.en.rdoc" -nt ".github/README.md" ) ]] && ( ruby -r rdoc -e 'puts RDoc::Markup::ToMarkdown.new.convert ARGF.read' < README.en.rdoc | yard2md_afterclean > .github/.README.md && mv .github/.README.md .github/README.md && echo ".github/README.md is updated." ) || exit 0
+	yard doc; [[ -x ".github" && ( "README.en.rdoc" -nt ".github/README.md" ) ]] && ( ruby -r rdoc -e 'puts RDoc::Markup::ToMarkdown.new.convert ARGF.read' < README.en.rdoc | yard2md_afterclean | ruby -e 'puts ARGF.read.sub(/(```)ruby(\nPart )/){$$1+"text"+$$2}' > .github/.README.md && mv .github/.README.md .github/README.md && echo ".github/README.md is updated." ) || exit 0
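
In a Makefile recipe `$$` is an escaped `$`, so the Ruby one-liner added above receives `$1` and `$2`. A standalone sketch of the same substitution, using a made-up Markdown excerpt (the variable names and sample text are assumptions for illustration):

```ruby
# Hypothetical excerpt of converted Markdown: a fenced block tagged "ruby"
# whose first line starts with "Part ".
fence = "`" * 3
md    = "Intro\n\n#{fence}ruby\nPart (summary)\n#{fence}\n"

# Same pattern as in the Makefile: retag that block as "text".
fixed = md.sub(/(```)ruby(\nPart )/) { $1 + "text" + $2 }
puts fixed
```

Since `sub` (not `gsub`) is used, only the first matching block is retagged.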

data/README.en.rdoc CHANGED

@@ -9,7 +9,7 @@ which represent the logical structure of a document and another class
 ParseRule, which describes the rules to parse plain text to produce a Part-type Ruby instance.
 This package also provides a few command-line programs, such as counting the number
 of characters (especially useful for documents in Asian (CJK)
-chatacters) and advanced head/tail commands.
+characters) and advanced head/tail commands.
 The master of this README file, as well as the document for all the methods, is found in
 {RubyGems/plain_text}[https://rubygems.org/gems/plain_text]
@@ -119,7 +119,7 @@ Counts the number of characters in a file(s) or STDIN.
 The simplest example to run the command-line script is
-   countchar YourFile.txt
+   % countchar YourFile.txt
 === textclean
@@ -132,30 +132,30 @@ into 2.  See the reference of {PlainText.clean_text} for detail.
 This gives advanced functions, in addition to the standard +head+, including
 Regexp:: It can accept Ruby Regexp to determine the boundary (beginning to the first-matched line), including ignore-case, multi-line, extra *padding-line* etc.
-Character-based:: With +--char+ option, it handles the file in units of a chracter, which is especially handy to deal with multi-byte characters like UTF-8.
-Reverse:: It can *reverese* the behaviour - inverse the counting to ouput everything but initial NUM lines.
+Character-based:: With +--char+ option, it handles the file in units of a character, which is especially handy to deal with multi-byte characters like UTF-8.
+Reverse:: It can *reverse* the behaviour - inverse the counting to output everything but initial NUM lines.
 A few examples are
-  head.rb -n 5 < try.txt
+  % head.rb -n 5 < try.txt
     # the same as the UNIX head; printing the first 5 lines
-  head.rb -i -n 5 try.txt
+  % head.rb -i -n 5 try.txt
     # printing everything but the first 5 lines
     # The same as the UNIX command:  tail -n +5
-  head.rb -e '^===+' try.txt
+  % head.rb -e '^===+' try.txt
     # => from the top up to the line that begins with more than 3 "="
-  head.rb -x -e '^===+' try.txt
+  % head.rb -x -e '^===+' try.txt
     # => from the top up to the line before what begins with more than 3 "="
-  head.rb -e '^===+' -p 3 try.txt
+  % head.rb -e '^===+' -p 3 try.txt
     # => from the top up to 3 lines after what begins with more than 3 "="
-  head.rb -e '([a-z])\1$' --padding=-2 try.txt
+  % head.rb -e '([a-z])\1$' --padding=-2 try.txt
     # => from the top up to 2 lines before what ends with 2
-    #    consecutive same letters (case-insentive) like "AA" or "qQ"
+    #    consecutive same letters (case-insensitive) like "AA" or "qQ"
 The suffix +.rb+ is used to distinguish this command from the UNIX-shell standard command.
@@ -164,18 +164,18 @@ The suffix +.rb+ is used to distinguish this command from the UNIX-shell standar
 This gives advanced functions, in addition to the standard +tail+, including
 Regexp:: It can accept Ruby Regexp to determine the boundary (last-matched line to the end), including ignore-case, multi-line, extra *padding-line* etc.
-Character-based:: With +--char+ option, it handles the file in units of a chracter, which is especially handy to deal with multi-byte characters like UTF-8.
-Reverse:: It can *reverese* the behaviour - inverse the counting to ouput everything but the last NUM lines.
+Character-based:: With +--char+ option, it handles the file in units of a character, which is especially handy to deal with multi-byte characters like UTF-8.
+Reverse:: It can *reverse* the behaviour - inverse the counting to output everything but the last NUM lines.
 See +head.rb+ for practical examples.
 Note the UNIX form of
-  tail -n +5
+  % tail -n +5
-(which I think is a bit counter-intuieive format) is equivalent to
+(which I think is a bit counter-intuitive format) is equivalent to
-  head.rb -i -n 5
+  % head.rb -i -n 5
 The suffix +.rb+ is used to distinguish this command from the UNIX-shell standard command.
@@ -185,7 +185,7 @@ This stands for "yard to markdown - after-clean".
 The standard conversion way of RDoc (written for yard) with +rdoc+ library
-RDoc::Markup::ToMarkdown.new.convert
+  RDoc::Markup::ToMarkdown.new.convert
 is limited, with the produced markdown having a fair number of flaws.
 This command tries to botch-fix it.  The result is
@@ -222,7 +222,7 @@ Work in progress...
 == Install
 This script requires {Ruby}[http://www.ruby-lang.org] Version 2.0
-or above (possibley 2.2 or above?).
+or above (possibly 2.2 or above?).
 For use of the library, if your Ruby script declares
@@ -243,7 +243,7 @@ You may need to modify the first line (Shebang line) of the script to suit your
 environment (it should be unnecessary for Linux and MacOS), or run it
 explicitly with your Ruby command as
-   Prompt% /YOUR/ENV/ruby /YOUR/INSTALLED/countchar
+   % /YOUR/ENV/ruby /YOUR/INSTALLED/countchar
 == Developer's note

data/bin/yard2md_afterclean CHANGED

@@ -8,6 +8,7 @@ require 'plain_text'
 BANNER = <<"__EOF__"
 USAGE: #{File.basename($0)} [options] [INFILE.txt] < STDIN
   Clean the partially ill-formated (Github) Markdown converted from yard-Rdoc.
+  Create <dl>, fix "+", add code-block languages etc.
 __EOF__
 # Initialising the hash for the command-line options.
@@ -25,7 +26,7 @@ OPTS = {
 #
 def handle_argv
   opt = OptionParser.new(BANNER)
-  opt.on(  '--lang=LANGUAGE', sprintf("Programming Language like ruby (Def: %s).", OPTS[:lang])) { |v| OPTS[:lang]=v.strip }
+  opt.on(  '--lang=LANGUAGE', sprintf("Programming Language like ruby (Def: %s).", OPTS[:lang]), '  NOTE: blocks starting with "% " => sh, "<[a-z]" => HTML in default.') { |v| OPTS[:lang]=v.strip }
   # opt.on(  '--version', "Display the version and exits.", TrueClass) {|v| OPTS[:version] = v}  # Consider opts.on_tail
   opt.on(  '--[no-]debug', "Debug (Def: false)", TrueClass) {|v| OPTS[:debug] = v}
   # opt.separator ""        # Way to control a help message.
@@ -65,7 +66,7 @@ end
 def fix_def_list(str)
   str.gsub(/^(\S+[^\n]*)\n:((?:\s+[^\n]+(?:\n|\z))+)/m){
     sdt, sdd = $1, $2
-    "<dt>%s</dt>\n<dd>%s</dd>\n"%[remove_mdfmt_raw(sdt), remove_mdfmt(sdd.chop)]
+    "<dt>%s</dt>\n<dd>%s</dd>\n"%[remove_mdfmt_raw(sdt), remove_mdfmt(sdd.chomp)]
   }.gsub(/(\s+\n|\A)(<dt>)/m, '\1<dl>'+"\n"+'\2').gsub(%r@(</dd>[[:blank:]]*)(\n(?:\s+|\z))@, '\1'+"\n"+'</dl>\2')
 end
@@ -195,6 +196,7 @@ mdpara.merge_para_if{ |pbp, _, _|
   false
 }
+## Add a programming language to each code block.
 indent_next = 0
 mdpara = mdpara.map_para{|ec|
   indent_prev = indent_next
@@ -202,7 +204,16 @@ mdpara = mdpara.map_para{|ec|
   next fix_string_based(ec) if !md_code_block?(ec, indent_prev)
   inde = " "*indent_prev
   st = ec.gsub(/^    /, '')
-  "%s```%s\n%s\n%s```"%[inde, opts[:lang], st, inde, opts[:lang]]
+  lang =
+    if (/\A\s*<[a-z]/i =~ st) && /^(javascript|x?html|xml|rss|xsd|wsdl)$/ !~ opts[:lang].downcase.strip
+      'html'
+    elsif (/\A\s*[%\$] /i =~ st) && /^(bash|zsh|shell-script|tex|latex)$/ !~ opts[:lang].downcase.strip
+      # NOTE: "postscr" (PostScript) starts from "%!PS" with no spaces in between.
+      'sh'
+    else
+      opts[:lang]
+    end
+  "%s```%s\n%s\n%s```"%[inde, lang, st, inde]
 }
 puts mdpara.join('')
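
The language selection added in the last hunk can be tried in isolation. The following is a sketch only: `guess_fence_lang` is a hypothetical name, and the script itself computes the value inline inside `map_para`; the regular expressions are copied from the diff.

```ruby
# Standalone sketch of the code-block language auto-detection.
# code:         body of a code block with the 4-space indent already removed
# default_lang: value of the --lang option
def guess_fence_lang(code, default_lang)
  default = default_lang.downcase.strip
  if code =~ /\A\s*<[a-z]/i && default !~ /^(javascript|x?html|xml|rss|xsd|wsdl)$/
    'html'
  elsif code =~ /\A\s*[%\$] /i && default !~ /^(bash|zsh|shell-script|tex|latex)$/
    'sh'
  else
    default_lang
  end
end

p guess_fence_lang("% countchar YourFile.txt", "ruby")     # => "sh"
p guess_fence_lang("<dl>\n<dt>Regexp</dt>", "ruby")         # => "html"
p guess_fence_lang("puts 1", "ruby")                        # => "ruby"
p guess_fence_lang("% \\documentclass{article}", "latex")   # => "latex"
```

The last call shows why the default language is checked first: a block starting with `% ` is a comment in TeX, so it is not retagged as `sh` when `--lang` is already `tex` or `latex`.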

data/lib/plain_text/parse_rule.rb CHANGED

@@ -122,7 +122,7 @@ module PlainText
     # Optionally, when a non-Array argument or block is given, a name can be specified as the human-readable name for the rule.
     #
     # @option rule [ParseRule, Array, Regexp, Proc]
-    # @param name: [String, Symbol]
+    # @param name [String, Symbol]
     #
     # @yield [inprm] Block to register.
     # @yieldparam [String, Array<Part, Paragraph, Boundary>, Part] inprm Input String/Part/Array to apply the rule to.
@@ -221,7 +221,7 @@ module PlainText
     # Optionally, providing non-Array argument or block is given, a name can be specified as the human-readable name for the rule.
     #
     # @option *rule [Regexp, Proc]
-    # @param name: [String, Symbol, NilClass, Array<String, Symbol, NilClass>]  Array is not supported, yet.
+    # @param name [String, Symbol, NilClass, Array<String, Symbol, NilClass>]  Array is not supported, yet.
     # @return [self]
     #
     # @yield [inprm] Block to register.
@@ -398,7 +398,7 @@ module PlainText
     #     #=> ["abc", "==", "DEF", "==\n"])
     #
     # @param inprm [String, Array, PlainText::Part]
-    # @param index: [Array, Range, Integer, String, Symbol] If given, the rule(s) at the given index (indices) or key(s) only are applied in the given order.
+    # @param index [Array, Range, Integer, String, Symbol] If given, the rule(s) at the given index (indices) or key(s) only are applied in the given order.
     # @return [Array] array of String, Paragraph, Boundary, Array, Part, etc
     def apply(inprm, index: nil, from_string: true, from_array: true)
       allrules = (index ? rules_at(index) : @rules)

data/lib/plain_text/part.rb CHANGED

@@ -328,7 +328,7 @@ module PlainText
     # @overload set(range)
     #   With a range of the indices to merge. Unless use_para_index is true, this means the main Array index. See the first overload set about it.
     #   @param range [Range] describe value param
-    # @param use_para_index: [Boolean] If false (Default), the indices are for the main indices (alternative between Paras and Boundaries, starting from Para). If true, the indices are as obtained with {#paras}, namely the array containing only Paras.
+    # @param use_para_index [Boolean] If false (Default), the indices are for the main indices (alternative between Paras and Boundaries, starting from Para). If true, the indices are as obtained with {#paras}, namely the array containing only Paras.
     # @return [self, nil] nil if nothing is merged (because of wrong indices).
     def merge_para!(*rest, use_para_index: false)
 $myd = true
@@ -348,7 +348,7 @@ $myd = true
     # while Boundary(5) stays as it is.
     #
     # @param (see #merge_para!)
-    # @param use_para_index: [Boolean] false
+    # @param use_para_index [Boolean] false
     # @return [Range, nil] nil if no range is selected.
     def build_index_range_for_merge_para!(*rest, use_para_index: false)
 #warn "DEBUG:b0: #{rest.inspect} to_a=#{to_a}\n"
@@ -460,10 +460,9 @@ $myd = false
     # Reparses self or a part of it.
     #
-    # @param str [String]
-    # @option rule: [PlainText::ParseRule] (PlainText::ParseRule::RuleConsecutiveLbs)
-    # @option name: [String, Symbol, Integer, nil] Identifier of rule, if need to specify.
-    # @option range: [Range, nil] Range of indices of self to reparse. In Default, the entire self.
+    # @option rule [PlainText::ParseRule] (PlainText::ParseRule::RuleConsecutiveLbs)
+    # @option name [String, Symbol, Integer, nil] Identifier of rule, if need to specify.
+    # @option range [Range, nil] Range of indices of self to reparse. In Default, the entire self.
     # @return [self]
     def reparse!(rule: PlainText::ParseRule::RuleConsecutiveLbs, name: nil, range: (0..-1))
       insert range.begin, self.class.parse((range ? self[range] : self), rule: rule, name: name)
@@ -825,7 +824,7 @@ $myd = false
     #
     # @see #insert
     #
-    # @param *rest [Array<Array>]
+    # @param rest [Array<Array>]
     # @return [self]
     def concat(*rest)
       insert(size, *(rest.sum([])))
@@ -835,7 +834,7 @@ $myd = false
     #
     # @see #concat
     #
-    # @param ary [Array]
+    # @param rest [Array]
     # @return [self]
     def push(*rest)
       concat(rest)

data/lib/plain_text/split.rb CHANGED

@@ -52,8 +52,8 @@ module PlainText
     #
     # @param instr [String] String that is examined.
     # @param re_in [Regexp, String] If String, it is interpreted literally as in String#split.
-    # @param like_linenum: [Boolean] if true (Def: false), it counts like the line number.
-    # @param with_if_end: [Boolean] a special case (see the description).
+    # @param like_linenum [Boolean] if true (Def: false), it counts like the line number.
+    # @param with_if_end [Boolean] a special case (see the description).
     # @return [Integer] always positive
     # @see PlainText::Split#count_regexp
     def self.count_regexp(instr, re_in, like_linenum: false, with_if_end: false)
@@ -72,7 +72,7 @@ module PlainText
     # One more parameter (input String) is required to specify.
     #
     # @param instr [String] String that is examined.
-    # @param linebreak: [String] +\n+ etc (Default: $/).
+    # @param linebreak [String] +\n+ etc (Default: $/).
     # @return [Integer] always positive
     # @see #count_lines
     def self.count_lines(instr, linebreak: $/)
@@ -124,7 +124,7 @@ module PlainText
     #   s.split_with_delimiter(/X+(Q?)/)
     #                           #=> ["", "XQ", "ab", "XX", "c", "XQ"]
     #
-    # @param re_in [Regexp, String] If String, it is interpreted literally as in String#split.
+    # @param rest [Regexp, String] If String, it is interpreted literally as in String#split.
     # @return [Array]
     def split_with_delimiter(*rest)
       PlainText::Split.public_send(__method__, self, *rest)
@@ -150,9 +150,10 @@ module PlainText
     # (This parameter is introduced just to reduce the overhead of
     # potentially calling this routine twice or user's making their own check.)
     #
-    # @param re_in [Regexp, String] If String, it is interpreted literally as in String#split.
-    # @param like_linenum: [Boolean] if true (Def: false), it counts like the line number.
-    # @param with_if_end: [Boolean] a special case (see the description).
+    # @param rest [Regexp, String] re_in: If String, it is interpreted literally as in String#split.
+    # @param kwd [Hash<like_linenum: Boolean, with_if_end: Boolean>]
+    #    if like_linenum: true (Def: false), it counts like the line number.
+    #    with_if_end: a special case (see the description).
     # @return [Integer, Array<Integer, Boolean>] always positive
     # @see PlainText::Split#count_regexp
     def count_regexp(*rest, **kwd)
@@ -161,7 +162,7 @@ module PlainText
     # Returns the number of lines.
     #
-    # @param linebreak: [String] +\n+ etc (Default: $/).
+    # @param kwd [Hash<linebreak: String>] +\n+ etc (Default: $/).
     # @return [Integer] always positive
     # @see PlainText::Split#count_regexp
     def count_lines(**kwd)
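
The instance methods documented above take `*rest` and `**kwd` because they only forward to the module-level method of the same name, as `split_with_delimiter` does with `public_send(__method__, self, *rest)`. A self-contained sketch of that delegation pattern; `DemoSplit` and `DemoString` are hypothetical names, and this demo simply counts separators, which is not necessarily how the gem counts lines:

```ruby
module DemoSplit
  # Module-level form: the string is passed explicitly.
  def self.count_lines(instr, linebreak: "\n")
    instr.scan(linebreak).size
  end

  # Instance form: forwards self and the keywords to the
  # module-level method with the same name.
  def count_lines(**kwd)
    DemoSplit.public_send(__method__, self, **kwd)
  end
end

# A String subclass is used here to avoid modifying String itself.
class DemoString < String
  include DemoSplit
end

p DemoString.new("a\nb\n").count_lines                  # => 2
p DemoString.new("a|b|c").count_lines(linebreak: "|")   # => 2
```

With this design the keyword names live only in the module-level signature, which is why the instance-method docs describe them through a single `kwd` parameter.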

data/lib/plain_text/util.rb CHANGED

@@ -44,8 +44,8 @@ module PlainText
     #    # => [[33, 55], [44, ""]]
     #
     # @param ary [Array]
-    # @param size_even: [Boolean] if true (Def: false), the sizes of the returned arrays are guaranteed to be identical.
-    # @param filler: [Object] if size_even: is true and if matching is performed, this filler is added at the end of the last element.
+    # @param size_even [Boolean] if true (Def: false), the sizes of the returned arrays are guaranteed to be identical.
+    # @param filler [Object] if size_even: is true and if matching is performed, this filler is added at the end of the last element.
     def even_odd_arrays(ary, size_even: false, filler: "")
       ar_even = select.with_index { |_, i| i.even? } rescue select.each_with_index { |_, i| i.even? } # Rescue for Ruby 2.1 or earlier
       ar_odd  = select.with_index { |_, i| i.odd? }  rescue select.each_with_index { |_, i| i.odd? }  # Rescue for Ruby 2.1 or earlier
@@ -83,8 +83,8 @@ module PlainText
     #
     # @param index_in [Integer] Index to check and convert from. Potentially negative integer.
     # @param ary [Array, Integer, nil] Reference Array or its size (Array#size) or nil (interpreted as self#size (untested)).
-    # @param accept_too_big: [Boolean, NilClass] if true (Default), a positive index larger than the last array index is returned as it is. If nil, the last index + 1 is accepted but raises an Exception for anything larger.  If false, any index larger than the last index raises an Exception.
-    # @param varname: [NilClass, String] Name of the variable (or nil) to be used for error messages.
+    # @param accept_too_big [Boolean, NilClass] if true (Default), a positive index larger than the last array index is returned as it is. If nil, the last index + 1 is accepted but raises an Exception for anything larger.  If false, any index larger than the last index raises an Exception.
+    # @param varname [NilClass, String] Name of the variable (or nil) to be used for error messages.
     # @return [Integer] Non-negative index; i.e., if index=-1 is specified for an Array with a size of 3, the returned value is 2 (the last index of it).
     # @raise [IndexError] if the index is out of the range to negative.
     # @raise [ArgumentError] if ary is neither an Array nor Integer, or more specifically, it does not have size method or ary.size does not return Integer or similar.
@@ -115,8 +115,8 @@ module PlainText
     #
     # @param from [Array, Range]
     # @param arref [Array, Integer] Reference Array or its size (Array#size) or nil (interpreted as self#size).
-    # @param flatten: [Boolean] If true (Default), if elements are Range, they are unfolded.  If false and if an Array containing a Range, Exception is raised.
-    # @param sortuniq: [Boolean] If true (Default), the return is sorted and uniq-ed.
+    # @param flatten [Boolean] If true (Default), if elements are Range, they are unfolded.  If false and if an Array containing a Range, Exception is raised.
+    # @param sortuniq [Boolean] If true (Default), the return is sorted and uniq-ed.
     # @return [Array, nil] nil if arref is empty or if out of range to the negative.  Note in most cases in Ruby default, it raises IndexError.  See the code of {#positive_array_index_checked}
     # @raise [TypeError] if non-integer is specified.
     # @raise [ArgumentError] if arref is neither an Array nor Integer, or more specifically, it does not have size method or arref.size does not return Integer or similar.

data/lib/plain_text.rb CHANGED

@@ -131,18 +131,18 @@ module PlainText
   #   /(\A[[:blank:]]+|\n[[:space:]]+)/
   #
   # @param prt [PlainText:Part, String] {Part} or String to examine.
-  # @param preserve_paragraph: [Boolean] Paragraphs are taken into account if true (Def: False). In the input, paragraphs are defined to be separated with more than one +lb+ with potentially some space characters in between. Their output style is specified with +boundary_style+.
-  # @param boundary_style: [String, Symbol] One of +(:truncate|:truncate2|:delete|:none)+ or String. If String, the boundaries between paragraphs are replaced with this String (Def: +lb_out*2+).  If +:truncate+, consecutive linebreaks and spaces are truncated into 2 linebreaks.   +:truncate2+ are similar, but they are not truncated beyond 3 linebreaks (ie., up to 2 blank lines between Paragraphs). If +:none+, nothing is done about them. Unless :none, all the white spaces between linebreaks are deleted.
-  # @param lbs_style: [Symbol] One of +(:truncate|:delete|:none)+ (Def: +:truncate+).  If :delete, all the linebreaks within paragraphs are deleted.  +:truncate+ is meaningful only when +preserve_paragraph=false+ and consecutive linebreaks are truncated into 1 linebreak.
-  # @param sps_style: [Symbol] One of +(:truncate|:delete|:none)+ (Def: +:truncate+).  If +:truncate+, the consecutive white spaces within paragraphs, *except* for those at the line-head or line-tail (which are controlled by +linehead_style+ and +linehead_style+, respectively), are truncated into a single white space. If :delete, they are deleted.
-  # @param lb_is_space: [Boolean] If true, a line-break, except those for the boundaries (unless +preserve_paragraph+ is false), is equivalent to a space (Def: False).
-  # @param delete_asian_space: [Boolean] Any spaces between, before, after Asian characters (but punctuation) are deleted, if true (Default).
-  # @param linehead_style: [Symbol] One of +(:truncate|:delete|:none)+ (Def: :none). Determine how to handle consecutive white spaces at the beggining of each line.
-  # @param linetail_style: [Symbol] One of +(:truncate|:delete|:markdown|:none)+ (Def: :delete). Determine how to handle consecutive white spaces at the end of each line.  If +:markdown, 1 space is always deleted, and two or more spaces are truncated into two ASCII whitespaces *if* the last two spaces are ASCII whitespaces, or else untouched.
-  # @param firstlbs_style: [Symbol, String] One of +(:truncate|:delete|:none)+ or String (Def: :default). If +:truncate+, any linebreaks at the very beginning of self (and whitespaces in between), if exist, are truncated to a single linebreak.  If String, they are, even if not exists, replaced with the specified String (such as a linebreak).  If +:delete+, they are deleted.  Note This option has nothing to do with the whitespaces at the beginning of the first significant line (hence the name of the option).  Note if a (random) Part is given, this option only considers the first significant element of it.
-  # @param lastsps_style: [Symbol, String] One of +(:truncate|:delete|:none|:linebreak)+ or String (Def: :truncate). If +:truncate+, any of linebreaks *AND* white spaces at the tail of self, if exist, are truncated to a single linebreak.  If +:delete+, they are deleted.  If String, they are, even if not exists, replaced with the specified String (such as a linebreak, in which case +lb_out+ is used as String, i.e., it guarantees only 1 linebreak to exist at the end of the String).  Note if a (random) Part is given, this option only considers the last significant element of it.
-  # @param lb: [String] Linebreak character like +\n+ etc (Default: $/). If this is one of the standard line-breaks, irregular line-breaks (for example, existence of CR when only LF should be there) are corrected.
-  # @param lb_out: [String] Linebreak used for output (Default: +lb+)
+  # @param preserve_paragraph [Boolean] Paragraphs are taken into account if true (Def: False). In the input, paragraphs are defined to be separated with more than one +lb+ with potentially some space characters in between. Their output style is specified with +boundary_style+.
+  # @param boundary_style [String, Symbol] One of +(:truncate|:truncate2|:delete|:none)+ or String. If String, the boundaries between paragraphs are replaced with this String (Def: +lb_out*2+).  If +:truncate+, consecutive linebreaks and spaces are truncated into 2 linebreaks.   +:truncate2+ are similar, but they are not truncated beyond 3 linebreaks (ie., up to 2 blank lines between Paragraphs). If +:none+, nothing is done about them. Unless :none, all the white spaces between linebreaks are deleted.
+  # @param lbs_style [Symbol] One of +(:truncate|:delete|:none)+ (Def: +:truncate+).  If :delete, all the linebreaks within paragraphs are deleted.  +:truncate+ is meaningful only when +preserve_paragraph=false+ and consecutive linebreaks are truncated into 1 linebreak.
+  # @param sps_style [Symbol] One of +(:truncate|:delete|:none)+ (Def: +:truncate+).  If +:truncate+, the consecutive white spaces within paragraphs, *except* for those at the line-head or line-tail (which are controlled by +linehead_style+ and +linehead_style+, respectively), are truncated into a single white space. If :delete, they are deleted.
+  # @param lb_is_space [Boolean] If true, a line-break, except those for the boundaries (unless +preserve_paragraph+ is false), is equivalent to a space (Def: False).
+  # @param delete_asian_space [Boolean] Any spaces between, before, after Asian characters (but punctuation) are deleted, if true (Default).
+  # @param linehead_style [Symbol] One of +(:truncate|:delete|:none)+ (Def: :none). Determine how to handle consecutive white spaces at the beggining of each line.
+  # @param linetail_style [Symbol] One of +(:truncate|:delete|:markdown|:none)+ (Def: :delete). Determine how to handle consecutive white spaces at the end of each line.  If +:markdown, 1 space is always deleted, and two or more spaces are truncated into two ASCII whitespaces *if* the last two spaces are ASCII whitespaces, or else untouched.
+  # @param firstlbs_style [Symbol, String] One of +(:truncate|:delete|:none)+ or String (Def: :default). If +:truncate+, any linebreaks at the very beginning of self (and whitespaces in between), if exist, are truncated to a single linebreak.  If String, they are, even if not exists, replaced with the specified String (such as a linebreak).  If +:delete+, they are deleted.  Note This option has nothing to do with the whitespaces at the beginning of the first significant line (hence the name of the option).  Note if a (random) Part is given, this option only considers the first significant element of it.
+  # @param lastsps_style [Symbol, String] One of +(:truncate|:delete|:none|:linebreak)+ or String (Def: :truncate). If +:truncate+, any of linebreaks *AND* white spaces at the tail of self, if exist, are truncated to a single linebreak.  If +:delete+, they are deleted.  If String, they are, even if not exists, replaced with the specified String (such as a linebreak, in which case +lb_out+ is used as String, i.e., it guarantees only 1 linebreak to exist at the end of the String).  Note if a (random) Part is given, this option only considers the last significant element of it.
+  # @param lb [String] Linebreak character like +\n+ etc (Default: $/). If this is one of the standard line-breaks, irregular line-breaks (for example, existence of CR when only LF should be there) are corrected.
+  # @param lb_out [String] Linebreak used for output (Default: +lb+)
   # @return same as prt
   #
   def self.clean_text(
@@ -587,9 +587,9 @@ module PlainText
   # if num is +/ABC/+ (Regexp), String of the lines from the beginning up to the line that contains the character +"ABC"+ is returned.
   #
   # @param num_in [Integer, Regexp] Number (positive or negative, but not 0) of :unit to extract (Def: 10), or Regexp, which is valid only if unit is :line.
-  # @param unit: [Symbol, String] One of +:line+ (or +"-n"+), :+char+, +:byte+ (or +"-c"+)
-  # @param inclusive: [Boolean] read only when unit is :line. If inclusive (Default), the (entire) line that matches is included in the result.
-  # @param linebreak: [String] +\n+ etc (Default: +$/+), used when +unit==:line+ (Default)
+  # @param unit [Symbol, String] One of +:line+ (or +"-n"+), :+char+, +:byte+ (or +"-c"+)
+  # @param inclusive [Boolean] read only when unit is :line. If inclusive (Default), the (entire) line that matches is included in the result.
+  # @param linebreak [String] +\n+ etc (Default: +$/+), used when +unit==:line+ (Default)
   # @return [String] as self
   def head(num_in=DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/)
     if num_in.class.method_defined? :to_int
@@ -670,10 +670,10 @@ module PlainText
   # String#strip! for each line
   #
-  # @param strip_head: [Boolean] if true (Default), spaces at each line head are removed.
-  # @param strip_tail: [Boolean] if true (Default), spaces at each line tail are removed (see +markdown+ option).
-  # @param markdown: [Boolean] if true (Def: false), a double space at each tail remains and +strip_head+ is forcibly false.
-  # @param linebreak: [String] +\n+ etc (Default: +$/+)
+  # @param strip_head [Boolean] if true (Default), spaces at each line head are removed.
+  # @param strip_tail [Boolean] if true (Default), spaces at each line tail are removed (see +markdown+ option).
+  # @param markdown [Boolean] if true (Def: false), a double space at each tail remains and +strip_head+ is forcibly false.
+  # @param linebreak [String] +\n+ etc (Default: +$/+)
   # @return [self, NilClass] nil if gsub! does not match at all, i.e., there are no spaces to remove.
   def strip_at_lines!(strip_head: true, strip_tail: true, markdown: false, linebreak: $/)
     strip_head = false if markdown
@@ -695,7 +695,7 @@ module PlainText
   # String#strip! for each line but only for the head part (NOT tail part)
   #
-  # @param linebreak: [String] "\n" etc (Default: $/)
+  # @param linebreak [String] "\n" etc (Default: $/)
   # @return [self, NilClass] nil if gsub! does not match at all, i.e., there are no spaces to remove.
   def strip_at_lines_head!(linebreak: $/)
     lb_quo = Regexp.quote linebreak
@@ -714,8 +714,8 @@ module PlainText
   # String#strip! for each line but only for the tail part (NOT head part)
   #
-  # @param markdown: [Boolean] if true (Def: false), a double space at each tail remains.
-  # @param linebreak: [String] "\n" etc (Default: $/)
+  # @param markdown [Boolean] if true (Def: false), a double space at each tail remains.
+  # @param linebreak [String] "\n" etc (Default: $/)
   # @return [self, NilClass] nil if gsub! does not match at all, i.e., there are no spaces to remove.
   def strip_at_lines_tail!(markdown: false, linebreak: $/)
     lb_quo = Regexp.quote linebreak
@@ -775,9 +775,9 @@ module PlainText
   # *all the lines from Line 1* would be included, which is most likely not what the caller wants.
   #
   # @param num_in [Integer, Regexp] Number (positive or negative, but not 0) of :unit to extract (Def: 10), or Regexp, which is valid only if unit is :line.  If positive, the last num_in lines are returned.  If negative, the lines from the num-in-th line from the head are returned. In short, calling this method as +tail(3)+ and +tail(-3)+ is similar to the UNIX commands "tail -n 3" and "tail -n +3", respectively.
-  # @param unit: [Symbol] One of :line (as in -n option), :char, :byte (-c option)
-  # @param inclusive: [Boolean] read only when unit is :line. If inclusive (Default), the (entire) line that matches is included in the result.
-  # @param linebreak: [String] +\n+ etc (Default: +$/+), used when unit==:line (Default)
+  # @param unit [Symbol] One of :line (as in -n option), :char, :byte (-c option)
+  # @param inclusive [Boolean] read only when unit is :line. If inclusive (Default), the (entire) line that matches is included in the result.
+  # @param linebreak [String] +\n+ etc (Default: +$/+), used when unit==:line (Default)
   # @return [String] as self
   def tail(num_in=DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/)
@@ -836,9 +836,9 @@ module PlainText
   # @todo Improve the algorithm like {#tail_regexp}
   #
   # @param re_in [Regexp] Regexp to determine the boundary.
-  # @param inclusive: [Boolean] If true (Default), the (entire) line that matches re_in is included in the result. Else the entire line is excluded.
-  # @param padding: [Integer] Add (postive/negative) the number of lines returned.
-  # @param linebreak: [String] +\n+ etc (Default: $/).
+  # @param inclusive [Boolean] If true (Default), the (entire) line that matches re_in is included in the result. Else the entire line is excluded.
+  # @param padding [Integer] Add (postive/negative) the number of lines returned.
+  # @param linebreak [String] +\n+ etc (Default: $/).
   # @return [String] as self
   # @see #head
   def head_regexp(re_in, inclusive: true, padding: 0, linebreak: $/)
@@ -899,7 +899,7 @@ module PlainText
   #   pre_match_in_line(      "__abc")  # => #<MatchData "__abc"> pre_match=="     "
   #
   # @param strpre [String] String of prematch of the last MatchData
-  # @param linebreak: [String] +\n+ etc (Default: $/)
+  # @param linebreak [String] +\n+ etc (Default: $/)
   # @return [MatchData] m[0] is the string after the last linebreak before the matched data (exclusive) and m.pre_match is all the lines before that.
   def pre_match_in_line(strpre, linebreak: $/)
     lb_quo = Regexp.quote linebreak
@@ -918,7 +918,7 @@ module PlainText
   #
   # @param mat [MatchData, String] If String, it is User's (last) matched String.
   # @param strpre [String, nil] Pre-match from the beginning of self to the mathced string, if mat is String.
-  # @param linebreak: [String] +\n+ etc (Default: $/)
+  # @param linebreak [String] +\n+ etc (Default: $/)
   # @return [Hash<Integer, nil>] 4 keys: :last_prematch, :first_matched, :last_matched, :first_post_match
   def _matched_line_indices(mat, strpre=nil, linebreak: $/)
     if mat.class.method_defined? :post_match
@@ -966,7 +966,7 @@ module PlainText
   #
   # @param mat [MatchData, String] If String, it is User's (last) matched String.
   # @param strpost [String, nil] Post-match, if mat is String.  After User's last match.
-  # @param linebreak: [String] +\n+ etc (Default: $/)
+  # @param linebreak [String] +\n+ etc (Default: $/)
   # @return [MatchData] m[0] is the string after matched data and up to the next first linebreak (inclusive) (or empty string if the last character(s) of matched data is the linebreak) and m.post_match is all the lines after that.  (maybe nil?? not sure...)
   def post_match_in_line(mat, strpost=nil, linebreak: $/)
     lb_quo = Regexp.quote linebreak
@@ -994,8 +994,8 @@ module PlainText
   # 6. pass it to {#head_inverse} (after Line-1).
   #
   # @param re_in [Regexp] Regexp to determine the boundary.
-  # @param inclusive: [Boolean] If true (Default), the (entire) line that matches re_in is included in the result. Else the entire line is excluded.
-  # @param linebreak: [String] +\n+ etc (Default: $/).
+  # @param inclusive [Boolean] If true (Default), the (entire) line that matches re_in is included in the result. Else the entire line is excluded.
+  # @param linebreak [String] +\n+ etc (Default: $/).
   # @return [String] as self
   # @see #tail
   def tail_regexp(re_in, inclusive: true, padding: 0, linebreak: $/)
@@ -1030,7 +1030,7 @@ module PlainText
   #
   # @param num_in [Integer] Original argument of the specified number of lines
   # @param num [Integer] Converted integer for num_in
-  # @param linebreak: [String] +\n+ etc (Default: $/).
+  # @param linebreak [String] +\n+ etc (Default: $/).
   # @return [String] as self
   # @see #tail
   def tail_linenum(num_in, num, linebreak: $/)
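
Note on the +tail+ documentation above: the sign convention of +num_in+ (positive means the last N lines, negative means from the N-th line onward, as in "tail -n 3" versus "tail -n +3") can be illustrated with a minimal sketch. The helper name +tail_lines_sketch+ is hypothetical and this is not the gem's implementation; it covers only the integer, line-unit case and ignores +inclusive+ and +padding+.

```ruby
# Hypothetical illustration only; NOT the implementation in lib/plain_text.rb.
# Mimics the documented sign convention of tail for unit == :line.
def tail_lines_sketch(str, num, linebreak: "\n")
  raise ArgumentError, "num must not be 0" if num == 0

  # String#lines keeps the separator at the end of each element.
  lines = str.lines(linebreak)

  selected =
    if num > 0
      lines.last(num)        # like "tail -n 3": the last num lines
    else
      lines.drop(-num - 1)   # like "tail -n +3": from the (-num)-th line onward
    end
  selected.join
end

text = "a\nb\nc\nd\n"
tail_lines_sketch(text, 3)    # last three lines
tail_lines_sketch(text, -3)   # from the third line to the end
```

Under these assumptions, a negative count never returns fewer lines than remain after the requested starting line, which matches the "tail -n +N" analogy in the docstring.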

data/plain_text.gemspec CHANGED

@@ -5,7 +5,7 @@ require 'date'
 Gem::Specification.new do |s|
   s.name = %q{plain_text}.sub(/.*/){|c| (c == File.basename(Dir.pwd)) ? c : raise("ERROR: s.name=(#{c}) in gemspec seems wrong!")}
-  s.version = "0.6".sub(/.*/){|c| fs = Dir.glob('changelog{,.*}', File::FNM_CASEFOLD); raise('More than one ChangeLog exist!') if fs.size > 1; warn("WARNING: Version(s.version=#{c}) already exists in #{fs[0]} - ok?") if fs.size == 1 && !IO.readlines(fs[0]).grep(/^\(Version: #{Regexp.quote c}\)$/).empty? ; c }  # n.b., In macOS, changelog and ChangeLog are identical in default.
+  s.version = "0.7".sub(/.*/){|c| fs = Dir.glob('changelog{,.*}', File::FNM_CASEFOLD); raise('More than one ChangeLog exist!') if fs.size > 1; warn("WARNING: Version(s.version=#{c}) already exists in #{fs[0]} - ok?") if fs.size == 1 && !IO.readlines(fs[0]).grep(/^\(Version: #{Regexp.quote c}\)$/).empty? ; c }  # n.b., In macOS, changelog and ChangeLog are identical in default.
   # s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
   s.bindir = 'bin'
   %w(countchar textclean head.rb tail.rb yard2md_afterclean).each do |f|
@@ -13,12 +13,12 @@ Gem::Specification.new do |s|
     File.executable?(path) ? s.executables << f : raise("ERROR: Executable (#{path}) is not executable!")
   end
   s.authors = ["Masa Sakano"]
-  s.date = %q{2019-11-07}.sub(/.*/){|c| (Date.parse(c) == Date.today) ? c : raise("ERROR: s.date=(#{c}) is not today!")}
+  s.date = %q{2022-08-25}.sub(/.*/){|c| (Date.parse(c) == Date.today) ? c : raise("ERROR: s.date=(#{c}) is not today!")}
   s.summary = %q{Module to handle Plain-Text}
   s.description = %q{This module provides utility functions and methods to handle plain text, classes Part/Paragraph/Boundary to represent the logical structure of a document and ParseRule to describe the rules to parse plain text to produce a Part-type Ruby instance. A few handy Ruby executable scripts to make use of them are included.}
   # s.email = %q{abc@example.com}
   s.extra_rdoc_files = [
-    # "LICENSE",
+    # "LICENSE.txt",
      "README.en.rdoc",
   ]
   s.license = 'MIT'
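
Note on the gemspec above: +s.date+ and +s.version+ use a +String#sub(/.*/){ ... }+ block as an inline guard, returning the literal unchanged when a check passes and raising otherwise. A minimal sketch of the date guard in isolation follows; the method name +checked_release_date+ and its +today:+ parameter are hypothetical, added only so the check can be exercised on any day.

```ruby
require 'date'

# Hypothetical helper, not part of plain_text.gemspec.
# Returns the date string unchanged if it equals `today`; raises otherwise.
def checked_release_date(str, today: Date.today)
  # /.*/ matches the whole single-line string, so the block's return value
  # replaces it with itself when the check passes.
  str.sub(/.*/) do |c|
    (Date.parse(c) == today) ? c : raise("ERROR: s.date=(#{c}) is not today!")
  end
end

checked_release_date("2022-08-25", today: Date.new(2022, 8, 25))
```

The effect is that building the gem on a day other than the declared release date fails early, which is presumably why the date changes together with the version in this diff.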

data/test/testyard2md_afterclean.rb CHANGED

@@ -58,14 +58,50 @@ class TestUnitYard2mdRb < MiniTest::Test
     assert_equal exp, o, "期待:#{exp.inspect} ⇔ \n実際:#{o.inspect}"
     assert_empty e
-    stin = "    +abc def+ " + "\n\n\n efg\n"
+    stin = "    abc def " + "\n\n\n efg\n"
     srub = "```ruby\n"
-    exp = srub+"+abc def+ \n```\n\n\n efg\n"
+    exp = srub+"abc def \n```\n\n\n efg\n"
     o, e, s = Open3.capture3 EXE, stdin_data: stin
     assert_equal 0, s.exitstatus
     assert_equal exp, o, "期待:#{exp.inspect} ⇔ \n実際:#{o.inspect}"
     assert_empty e
+    # automated judge: sh
+    stin = "    % abc def " + "\n\n\n efg\n"
+    srub = "```sh\n"
+    exp = srub+"% abc def \n```\n\n\n efg\n"
+    o, e, s = Open3.capture3 EXE, stdin_data: stin
+    assert_equal 0, s.exitstatus
+    assert_equal exp, o, "期待:#{exp.inspect} ⇔ \n実際:#{o.inspect}"
+    assert_empty e
+    # automated judge unchanged: tex
+    stin = "    % abc def " + "\n\n\n efg\n"
+    srub = "```tex\n"
+    exp = srub+"% abc def \n```\n\n\n efg\n"
+    o, e, s = Open3.capture3 EXE+" --lang=tex", stdin_data: stin
+    assert_equal 0, s.exitstatus
+    assert_equal exp, o, "期待:#{exp.inspect} ⇔ \n実際:#{o.inspect}"
+    assert_empty e
+    # automated judge: html
+    stin = "    <abc>def " + "\n\n\n efg\n"
+    srub = "```html\n"
+    exp = srub+"<abc>def \n```\n\n\n efg\n"
+    o, e, s = Open3.capture3 EXE, stdin_data: stin
+    assert_equal 0, s.exitstatus
+    assert_equal exp, o, "期待:#{exp.inspect} ⇔ \n実際:#{o.inspect}"
+    assert_empty e
+    # automated judge unchanged: javascript
+    stin = "    <abc>def " + "\n\n\n efg\n"
+    srub = "```javascript\n"
+    exp = srub+"<abc>def \n```\n\n\n efg\n"
+    o, e, s = Open3.capture3 EXE+" --lang=javascript", stdin_data: stin
+    assert_equal 0, s.exitstatus
+    assert_equal exp, o, "期待:#{exp.inspect} ⇔ \n実際:#{o.inspect}"
+    assert_empty e
   end
 end
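
Note on the new tests above: they pin down the expected fence language for three inputs (a leading "% " yields +sh+, a leading "<" yields +html+, anything else stays +ruby+) and show that an explicit +--lang+ option always wins. The sketch below encodes only those expectations. The method name +guess_fence_lang_sketch+ and its rules are inferred from the tests and are an assumption, not the actual heuristics in bin/yard2md_afterclean, which are not shown in this part of the diff.

```ruby
# Hypothetical sketch inferred from the test expectations only;
# the real detection logic in bin/yard2md_afterclean may differ.
def guess_fence_lang_sketch(code, lang: nil)
  return lang if lang            # an explicit --lang value overrides detection

  case code.lstrip               # ignore the indentation of the code block
  when /\A%\s/ then "sh"         # shell-prompt style line, e.g. "% abc def"
  when /\A</   then "html"       # tag-like start, e.g. "<abc>def"
  else              "ruby"       # default used by the pre-existing test
  end
end

guess_fence_lang_sketch("    % abc def ")               # auto-judged
guess_fence_lang_sketch("    % abc def ", lang: "tex")  # explicit option kept
```

Keeping the explicit option as the first check mirrors the two "automated judge unchanged" tests, where the same input produces +tex+ or +javascript+ fences once +--lang+ is given.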

metadata CHANGED

@@ -1,20 +1,20 @@
 --- !ruby/object:Gem::Specification
 name: plain_text
 version: !ruby/object:Gem::Version
-  version: '0.6'
+  version: '0.7'
 platform: ruby
 authors:
 - Masa Sakano
-autorequire:
+autorequire:
 bindir: bin
 cert_chain: []
-date: 2019-11-07 00:00:00.000000000 Z
+date: 2022-08-25 00:00:00.000000000 Z
 dependencies: []
 description: This module provides utility functions and methods to handle plain text,
   classes Part/Paragraph/Boundary to represent the logical structure of a document
   and ParseRule to describe the rules to parse plain text to produce a Part-type Ruby
   instance. A few handy Ruby executable scripts to make use of them are included.
-email:
+email:
 executables:
 - countchar
 - textclean
@@ -59,7 +59,7 @@ licenses:
 - MIT
 metadata:
   yard.run: yri
-post_install_message:
+post_install_message:
 rdoc_options:
 - "--charset=UTF-8"
 require_paths:
@@ -75,18 +75,18 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.0.3
-signing_key:
+rubygems_version: 3.3.7
+signing_key:
 specification_version: 4
 summary: Module to handle Plain-Text
 test_files:
+- test/test_plain_text.rb
 - test/test_plain_text_parse_rule.rb
-- test/testtail_rb.rb
 - test/test_plain_text_part.rb
-- test/test_plain_text.rb
-- test/testyard2md_afterclean.rb
-- test/testcountchar.rb
-- test/testtextclean.rb
 - test/test_plain_text_split.rb
 - test/test_plain_text_util.rb
+- test/testcountchar.rb
 - test/testhead_rb.rb
+- test/testtail_rb.rb
+- test/testtextclean.rb
+- test/testyard2md_afterclean.rb