charset_move 0.2.1

data/COPYING ADDED
@@ -0,0 +1,24 @@
1
+ # Copyright and Copying Information
2
+
3
+ CharsetMove is Copyright 2012 Chad Perrin. It may be distributed under the
4
+ terms of the Nietzsche Public License. See LICENSE file for details.
5
+
6
+ ## Dependencies
7
+
8
+ CharsetMove depends only on a standard Ruby runtime and its standard library.
9
+ The reference implementation of Ruby is MRI/YARV, disjunctively dual-licensed
10
+ under the terms of the Ruby License and the Simplified BSD License. The same
11
+ licensing terms apply to the standard library.
12
+
13
+ ## Licensing Philosophy
14
+
15
+ The intent in selecting the Nietzsche Public License (or NPL) is to keep
16
+ licensing simple, and to inject a modicum of philosophical humor into the
17
+ otherwise dry and serious matter of copyright licensing. The NPL is a
18
+ [copyfree](http://copyfree.org) license which, like many copyfree licenses, is
19
+ [usage optimized](http://univacc.net/?page=license_simplicity) -- that is,
20
+ designed to minimize friction imposed on the development and distribution
21
+ process by legal issues. The Simplified BSD License offered as one of two
22
+ licensing options for MRI/YARV is another simple copyfree license. The goal is
23
+ to ensure that you can use the software however you like with a minimum of
24
+ fuss, risk, and bureaucratic overhead.
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Nietzsche Public License v0.5
2
+
3
+ Copyright 2012 Chad Perrin
4
+
5
+ Copyright, like God, is dead. Let its corpse serve only to guard against its
6
+ resurrection. You may do anything with this work that copyright law would
7
+ normally restrict, so long as you retain the above notice(s), this license, and
8
+ the following misquote and disclaimer of warranty with all redistributed copies
9
+ and derived works. You may also replace this license with the Open Works
10
+ License, available at the http://owl.apotheon.org website.
11
+
12
+ Copyright is dead. Copyright remains dead, and we have killed it. How
13
+ shall we comfort ourselves, the murderers of all murderers? What was
14
+ holiest and mightiest of all that the world of censorship has yet owned has
15
+ bled to death under our knives: who will wipe this blood off us? What
16
+ water is there for us to clean ourselves? What festivals of atonement,
17
+ what sacred games shall we have to invent? Is not the greatness of this
18
+ deed too great for us? Must we ourselves not become authors simply to
19
+ appear worthy of it?
20
+ - apologies to Friedrich Wilhelm Nietzsche
21
+
22
+ This license implies no warranty.
data/README.md ADDED
@@ -0,0 +1,42 @@
1
+ # CharsetMove
2
+
3
+ A trivial series of coincidences led me to want a simple, Ruby-based,
4
+ copyfree-licensed tool for changing the encodings used for filenames.
5
+
6
+ ## cmv
7
+
8
+ The executable command line utility `cmv` is simple enough to use. To change
9
+ the names of all files in the current directory from UTF-8 to 8-bit ASCII,
10
+ without any chance of a filename overwriting an existing file (including
11
+ possibly multiple files being written to the same filename after transcoding
12
+ filenames):
13
+
14
+ cmv -nf UTF-8 -t ASCII-8BIT *
15
+
16
+ The `-n` option is "noclobber" (thus using numbers at the ends of filenames to
17
+ avoid files overwriting each other; default behavior simply overwrites existing
18
+ files, just like `mv` would on standard Unix systems), the `-f` is "from" (the
19
+ source encoding), and `-t` is "to" (the target encoding). The program tries to
20
+ set sensible defaults for from and to, but they will both probably end up with
21
+ the same encoding if you do not specify an encoding for at least one of the
22
+ two.
23
+
24
+ If you are not sure how to specify an encoding, a list of supported encodings
25
+ can be displayed with the `-l` option. It is likely to be long; you may want
26
+ to pipe the output to a pager such as `less`, or narrow down the displayed
27
+ output by piping it to a `grep` command.
28
+
29
+ Use the `--help` or `-h` option to learn more about how to use `cmv`.
30
+
31
+ ## installing
32
+
33
+ For now, just copying the `cmv` file into your execution path somewhere should
34
+ suffice for "installation". At some point, it will probably be turned into a
35
+ really minimal gem so you can use a `gem install` command.
36
+
37
+ ## credits
38
+
39
+ This program was written by Chad Perrin, Copyright 2012. It can be
40
+ redistributed under the terms of the Nietzsche Public License for now (see
41
+ LICENSE file). This may change at some point -- probably to the Open Works
42
+ License, which the NPL allows you to use in place of the NPL itself anyway.
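As a rough illustration of what the README's `-f` and `-t` options amount to, here is a minimal sketch using Ruby's built-in `String#encode` rather than the gem's own `Encoding::Converter` wrapper. The helper name `ascii_safe_name` is hypothetical, and `US-ASCII` is an assumed stand-in for the target encoding; characters with no equivalent are replaced with `?`.

```ruby
# Hypothetical helper, not part of CharsetMove: transcode a UTF-8 name to
# US-ASCII, substituting "?" for anything that cannot be represented.
def ascii_safe_name(name)
  name.encode('US-ASCII', 'UTF-8', undef: :replace, invalid: :replace, replace: '?')
end

puts ascii_safe_name("file_\u00F6ne")   # prints file_?ne
```

This only computes the new name; it does not rename anything on disk.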
data/bin/cmv ADDED
@@ -0,0 +1,104 @@
1
+ #!/usr/bin/env ruby
2
+ require 'optparse'
3
+ require 'charset_move'
4
+
5
+ =begin
6
+
7
+ Nietzsche Public License v0.5
8
+
9
+ Copyright 2012 Chad Perrin
10
+
11
+ Copyright, like God, is dead. Let its corpse serve only to guard against its
12
+ resurrection. You may do anything with this work that copyright law would
13
+ normally restrict, so long as you retain the above notice(s), this license, and
14
+ the following misquote and disclaimer of warranty with all redistributed copies
15
+ and derived works. You may also replace this license with the Open Works
16
+ License, available at the http://owl.apotheon.org website.
17
+
18
+ Copyright is dead. Copyright remains dead, and we have killed it. How
19
+ shall we comfort ourselves, the murderers of all murderers? What was
20
+ holiest and mightiest of all that the world of censorship has yet owned has
21
+ bled to death under our knives: who will wipe this blood off us? What
22
+ water is there for us to clean ourselves? What festivals of atonement,
23
+ what sacred games shall we have to invent? Is not the greatness of this
24
+ deed too great for us? Must we ourselves not become authors simply to
25
+ appear worthy of it?
26
+ - apologies to Friedrich Wilhelm Nietzsche
27
+
28
+ This license implies no warranty.
29
+
30
+ =end
31
+
32
+ help_text = {
33
+ :from => 'Specify a source encoding.',
34
+ :to => 'Specify a destination encoding.',
35
+ :noclobber => 'Do not clobber existing files.',
36
+ :list => 'Display supported encodings.',
37
+ :help => 'Display this help text.',
38
+ :version => 'Display version and license information.'
39
+ }
40
+
41
+ @usage = <<EOF
42
+
43
+ USAGE: #{File.basename $0} [options] FILE [FILE [. . .]]
44
+
45
+ EOF
46
+
47
+ version_help = <<EOF
48
+
49
+ CharsetMove #{CharsetMove.version}, Copyright 2012 Chad Perrin
50
+ May be distributed under the terms of the Nietzsche Public License.
51
+
52
+ EOF
53
+
54
+
55
+ @options = {
56
+ :from => Encoding.default_external,
57
+ :to => Encoding.default_external,
58
+ :noclobber => false
59
+ }
60
+
61
+ OptionParser.new do |opts|
62
+ opts.banner = @usage
63
+
64
+ opts.on('--from=ENCODING', '-f', help_text[:from]) do |arg|
65
+ @options[:from] = arg
66
+ end
67
+
68
+ opts.on('--to=ENCODING', '-t', help_text[:to]) do |arg|
69
+ @options[:to] = arg
70
+ end
71
+
72
+ opts.on('--noclobber', '-n', help_text[:noclobber]) do
73
+ @options[:noclobber] = true
74
+ end
75
+
76
+ opts.on('--list', '-l', help_text[:list]) do
77
+ puts CharsetMove.sorted_encodings
78
+ exit(0)
79
+ end
80
+
81
+ opts.on('--help', '-h', help_text[:help]) do
82
+ puts opts
83
+ puts
84
+ exit(0)
85
+ end
86
+
87
+ opts.on_tail('--version', help_text[:version]) do
88
+ puts version_help
89
+ exit(0)
90
+ end
91
+ end.parse!
92
+
93
+
94
+
95
+ if 0 < ARGV.size
96
+ cmv = CharsetMove.new(@options)
97
+ ARGV.each do |old_filename|
98
+ puts cmv.transcode_filename(old_filename)
99
+ end
100
+ else
101
+ puts @usage
102
+ puts %Q{ Try "#{File.basename $0} --help" for usage information.}
103
+ puts
104
+ end
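The `--list` option above prints whatever `CharsetMove.sorted_encodings` returns. A standalone sketch of an equivalent listing, assuming only Ruby's core `Encoding.list`, might look like this; the method name `encoding_names` is hypothetical.

```ruby
# Hypothetical stand-in for the --list output: every encoding name known to
# this Ruby runtime, ordered case-insensitively.
def encoding_names
  Encoding.list.map(&:to_s).sort_by(&:upcase)
end

puts encoding_names
```

The exact set of names depends on the Ruby build, so the output is not reproduced here.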
data/lib/charset_move.rb ADDED
@@ -0,0 +1,162 @@
1
+ require 'fileutils'
2
+
3
+ =begin rdoc
4
+
5
+ CharsetMove provides a Ruby class as a simple wrapper around Ruby's
6
+ Encoding::Converter functionality, designed for easy use by a command line
7
+ utility.
8
+
9
+ === API
10
+
11
+ require 'charset_move'
12
+
13
+ config = {
14
+ :from => 'UTF-8',
15
+ :to => 'ASCII-8BIT',
16
+ :noclobber => true
17
+ }
18
+
19
+ cmv = CharsetMove.new(config)
20
+
21
+ # filenames listed in ASCIIbetical order
22
+ ['file_öne', 'file_threë', 'file_twó', 'file_twö'].each do |fname|
23
+ puts cmv.transcode_filename(fname)
24
+ end
25
+
26
+ The above code produces this command line output:
27
+
28
+ file_öne -> file_?ne
29
+ file_threë -> file_thre?
30
+ file_twó -> file_tw?
31
+ file_twö -> file_tw?1
32
+
33
+ === Use and Sharing
34
+
35
+ Nietzsche Public License v0.5
36
+
37
+ Copyright 2012 Chad Perrin
38
+
39
+ Copyright, like God, is dead. Let its corpse serve only to guard against its
40
+ resurrection. You may do anything with this work that copyright law would
41
+ normally restrict, so long as you retain the above notice(s), this license, and
42
+ the following misquote and disclaimer of warranty with all redistributed copies
43
+ and derived works. You may also replace this license with the Open Works
44
+ License, available at the http://owl.apotheon.org website.
45
+
46
+ Copyright is dead. Copyright remains dead, and we have killed it. How
47
+ shall we comfort ourselves, the murderers of all murderers? What was
48
+ holiest and mightiest of all that the world of censorship has yet owned has
49
+ bled to death under our knives: who will wipe this blood off us? What
50
+ water is there for us to clean ourselves? What festivals of atonement,
51
+ what sacred games shall we have to invent? Is not the greatness of this
52
+ deed too great for us? Must we ourselves not become authors simply to
53
+ appear worthy of it?
54
+ - apologies to Friedrich Wilhelm Nietzsche
55
+
56
+ This license implies no warranty.
57
+
58
+ =end
59
+
60
+ class CharsetMove
61
+
62
+ =begin rdoc
63
+
64
+ This method returns the version number for the CharsetMove gem.
65
+
66
+ =end
67
+
68
+ def self.version; '0.2.1'; end
69
+
70
+ =begin rdoc
71
+
72
+ The +config+ argument is a hash containing a source encoding string, a
73
+ destination encoding string, and an optional +:noclobber+ boolean value. By
74
+ default, +:noclobber+ is treated as false.
75
+
76
+ =end
77
+
78
+ def initialize(config)
79
+ @config = config
80
+ @conv = Encoding::Converter.new(@config[:from], @config[:to])
81
+ end
82
+
83
+ @encodings = Encoding.list.collect {|e| e.to_s }
84
+
85
+ =begin rdoc
86
+
87
+ This method returns an array containing supported encoding strings.
88
+
89
+ =end
90
+
91
+ def self.encodings
92
+ @encodings
93
+ end
94
+
95
+ =begin rdoc
96
+
97
+ This method returns an alphabetically sorted array containing supported
98
+ encoding strings.
99
+
100
+ =end
101
+
102
+ def self.sorted_encodings
103
+ @encodings.sort {|a,b| a.upcase <=> b.upcase }
104
+ end
105
+
106
+ =begin rdoc
107
+
108
+ This method checks the current directory for filenames matching the +name+
109
+ argument string and adds a number to the end to differentiate the string from
110
+ the names of existing files.
111
+
112
+ =end
113
+
114
+ def unclobber_name(name)
115
+ n = 0
116
+
117
+ if File.exist? name
118
+ while File.exist? name
119
+ n += 1
120
+
121
+ num = n
122
+ name.chop! if num > 1
123
+ while (num / 10) > 0
124
+ name.chop! if num > 10
125
+ num = num / 10
126
+ end
127
+
128
+ name += n.to_s
129
+ end
130
+ end
131
+
132
+ name
133
+ end
134
+
135
+ =begin rdoc
136
+
137
+ This method transcodes a string representing a filename from the source
138
+ encoding specified by +@config[:from]+ to the destination encoding specified
139
+ by +@config[:to]+, then renames the file accordingly. It optionally
140
+ differentiates destination filenames from existing filenames by sending the
141
+ +unclobber_name+ message.
142
+
143
+ =end
144
+
145
+ def transcode_filename(old_filename)
146
+ begin
147
+ new_filename = @conv.convert old_filename
148
+ rescue StandardError => e
149
+ if old_filename.match(/[^[:ascii:]]/)
150
+ new_filename = @conv.convert old_filename.gsub(/[^[:ascii:]]/, '?')
151
+ else
152
+ return e
153
+ end
154
+ end
155
+
156
+ new_filename = unclobber_name(new_filename) if @config[:noclobber]
157
+
158
+ FileUtils.mv old_filename, new_filename
159
+
160
+ "#{old_filename} -> #{new_filename}"
161
+ end
162
+ end
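The `unclobber_name` method above rebuilds the numeric suffix by chopping digits off the previous candidate. By my reading, that arithmetic may chop one digit too few for counters 101 through 109. A simpler sketch, assuming the same goal of appending the first unused number, keeps the base name untouched and builds each candidate afresh. The helper name `free_name` is hypothetical and not part of the gem.

```ruby
# Hypothetical helper, not part of CharsetMove: returns +name+ unchanged if
# nothing exists at that path, otherwise the first "name<N>" that is unused.
def free_name(name)
  return name unless File.exist?(name)

  n = 1
  n += 1 while File.exist?("#{name}#{n}")
  "#{name}#{n}"
end
```

Because the base name is never modified, there is no digit bookkeeping to get wrong. Like the original, this check is not atomic: another process could still create the file between the check and the rename.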
data/test/charset_move_tests.rb ADDED
@@ -0,0 +1,34 @@
1
+ # coding: utf-8
2
+ require 'test/unit'
3
+ require '../lib/charset_move'
4
+
5
+ class CharsetMoveTests < Test::Unit::TestCase
6
+ def setup
7
+ @name = 'tfile_bëcëdäföñü'
8
+ @cmv1 = CharsetMove.new( {
9
+ :from => 'UTF-8',
10
+ :to => 'ASCII-8BIT',
11
+ :noclobber => true
12
+ } )
13
+
14
+ @cmv2 = CharsetMove.new( {
15
+ :from => 'ASCII-8BIT',
16
+ :to => 'UTF-8',
17
+ :noclobber => true
18
+ } )
19
+
20
+ `./manage clean`
21
+ `./manage reset`
22
+ end
23
+
24
+ def teardown
25
+ `./manage clean`
26
+ end
27
+
28
+ def test_unclobber_name
29
+ assert_equal(
30
+ @cmv2.unclobber_name(@cmv1.unclobber_name(@name)),
31
+ 'tfile_bëcëdäföñü1'
32
+ )
33
+ end
34
+ end
data/test/manage ADDED
@@ -0,0 +1,26 @@
1
+ #!/usr/bin/env ruby
2
+ # coding: utf-8
3
+
4
+ case ARGV[0]
5
+ when 'clean'
6
+ system "ls; rm tfile_*; ls"
7
+ when 'reset'
8
+ system "ls; rm tfile_*; ls"
9
+
10
+ vowels = %w{ä ë ï ö ü}
11
+ vowels.each do |a|
12
+ vowels.each do |e|
13
+ vowels.each do |i|
14
+ vowels.each do |o|
15
+ vowels.each do |u|
16
+ vowels.each do |y|
17
+ system("echo 'foo' > tfile_b#{a}c#{e}d#{i}f#{o}ñ#{u}")
18
+ end
19
+ end
20
+ end
21
+ end
22
+ end
23
+ end
24
+
25
+ system "ls; ls|wc"
26
+ end
metadata ADDED
@@ -0,0 +1,57 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: charset_move
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.2.1
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Chad Perrin
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2012-07-12 00:00:00.000000000 Z
13
+ dependencies: []
14
+ description: ! " CharsetMove is a simple, Ruby based, copyfree licensed command
15
+ line utility\n for changing the encodings used for filenames.\n"
16
+ email: code@apotheon.net
17
+ executables:
18
+ - cmv
19
+ extensions: []
20
+ extra_rdoc_files: []
21
+ files:
22
+ - COPYING
23
+ - LICENSE
24
+ - README.md
25
+ - lib/charset_move.rb
26
+ - bin/cmv
27
+ - test/charset_move_tests.rb
28
+ - test/manage
29
+ homepage: http://cmv.fossrec.com
30
+ licenses:
31
+ - NPL
32
+ post_install_message: ! " Thank you for using CharsetMove. The \"cmv\" command
33
+ line utility can be\n used to change the encodings of filenames from one encoding
34
+ to another.\n Use the \"-h\" command line option to get more information about
35
+ how to use\n the command.\n"
36
+ rdoc_options: []
37
+ require_paths:
38
+ - lib
39
+ required_ruby_version: !ruby/object:Gem::Requirement
40
+ none: false
41
+ requirements:
42
+ - - ! '>='
43
+ - !ruby/object:Gem::Version
44
+ version: '0'
45
+ required_rubygems_version: !ruby/object:Gem::Requirement
46
+ none: false
47
+ requirements:
48
+ - - ! '>='
49
+ - !ruby/object:Gem::Version
50
+ version: '0'
51
+ requirements: []
52
+ rubyforge_project:
53
+ rubygems_version: 1.8.15
54
+ signing_key:
55
+ specification_version: 3
56
+ summary: CharsetMove - filename transcoder
57
+ test_files: []