charset_move 0.2.1

data/COPYING ADDED
@@ -0,0 +1,24 @@
1
+ # Copyright and Copying Information
2
+
3
+ CharsetMove is Copyright 2012 Chad Perrin. It may be distributed under the
4
+ terms of the Nietzsche Public License. See LICENSE file for details.
5
+
6
+ ## Dependencies
7
+
8
+ CharsetMove depends only on a standard Ruby runtime and its standard library.
9
+ The reference implementation of Ruby is MRI/YARV, disjunctively dual-licensed
10
+ under the terms of the Ruby License and the Simplified BSD License. The same
11
+ licensing terms apply to the standard library.
12
+
13
+ ## Licensing Philosophy
14
+
15
+ The intent in selecting the Nietzsche Public License (or NPL) is to keep
16
+ licensing simple, and to inject a modicum of philosophical humor into the
17
+ otherwise dry and serious matter of copyright licensing. The NPL is a
18
+ [copyfree](http://copyfree.org) license which, like many copyfree licenses, is
19
+ [usage optimized](http://univacc.net/?page=license_simplicity) -- that is,
20
+ designed to minimize friction imposed on the development and distribution
21
+ process by legal issues. The Simplified BSD License offered as one of two
22
+ licensing options for MRI/YARV is another simple copyfree license. The goal is
23
+ to ensure that you can use the software however you like with a minimum of
24
+ fuss, risk, and bureaucratic overhead.
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Nietzsche Public License v0.5
2
+
3
+ Copyright 2012 Chad Perrin
4
+
5
+ Copyright, like God, is dead. Let its corpse serve only to guard against its
6
+ resurrection. You may do anything with this work that copyright law would
7
+ normally restrict, so long as you retain the above notice(s), this license, and
8
+ the following misquote and disclaimer of warranty with all redistributed copies
9
+ and derived works. You may also replace this license with the Open Works
10
+ License, available at the http://owl.apotheon.org website.
11
+
12
+ Copyright is dead. Copyright remains dead, and we have killed it. How
13
+ shall we comfort ourselves, the murderers of all murderers? What was
14
+ holiest and mightiest of all that the world of censorship has yet owned has
15
+ bled to death under our knives: who will wipe this blood off us? What
16
+ water is there for us to clean ourselves? What festivals of atonement,
17
+ what sacred games shall we have to invent? Is not the greatness of this
18
+ deed too great for us? Must we ourselves not become authors simply to
19
+ appear worthy of it?
20
+ - apologies to Friedrich Wilhelm Nietzsche
21
+
22
+ This license implies no warranty.
data/README.md ADDED
@@ -0,0 +1,42 @@
1
+ # CharsetMove
2
+
3
+ A trivial series of coincidences led to me wanting to create a simple, Ruby
4
+ based, copyfree licensed tool for changing the encodings used for filenames.
5
+
6
+ ## cmv
7
+
8
+ The executable command line utility `cmv` is simple enough to use. To change
9
+ the names of all files in the current directory from UTF-8 to 8-bit ASCII,
10
+ without any chance of a filename overwriting an existing file (including
11
+ possibly multiple files being written to the same filename after transcoding
12
+ filenames):
13
+
14
+ cmv -nf UTF-8 -t ASCII-8BIT *
15
+
16
+ The `-n` option is "noclobber" (thus using numbers at the ends of filenames to
17
+ avoid files overwriting each other; default behavior simply overwrites existing
18
+ files, just like `mv` would on standard Unix systems), the `-f` is "from" (the
19
+ source encoding), and `-t` is "to" (the target encoding). The program tries to
20
+ set sensible defaults for from and to, but they will both probably end up with
21
+ the same encoding if you do not specify an encoding for at least one of the
22
+ two.
23
+
24
+ If you are not sure how to specify an encoding, a list of supported encodings
25
+ can be displayed with the `-l` option. It is likely to be long; you may want
26
+ to pipe the output to a pager such as `less`, or narrow down the displayed
27
+ output by piping it to a `grep` command.
28
+
29
+ Use the `--help` or `-h` option to learn more about how to use `cmv`.
30
+
31
+ ## installing
32
+
33
+ For now, just copying the `cmv` file into your execution path somewhere should
34
+ suffice for "installation". At some point, it will probably be turned into a
35
+ really minimal gem so you can use a `gem install` command.
36
+
37
+ ## credits
38
+
39
+ This program was written by Chad Perrin, Copyright 2012. It can be
40
+ redistributed under the terms of the Nietzsche Public License for now (see
41
+ LICENSE file). This may change at some point -- probably to the Open Works
42
+ License, which the NPL allows you to use in place of the NPL itself anyway.
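The README's `-l` option prints the encodings supported by the running Ruby and suggests filtering the output with `grep`. A minimal sketch of the same lookup from Ruby itself, assuming only the standard `Encoding` class; the `names` and `ascii_like` variables are illustrative and not part of CharsetMove:

```ruby
# Collect the encoding names known to this Ruby runtime, sorted
# case-insensitively (similar in spirit to CharsetMove.sorted_encodings).
names = Encoding.list.map(&:to_s).sort_by(&:upcase)

# Narrow the list the way `cmv -l | grep -i ascii` would.
ascii_like = names.grep(/ascii/i)

puts ascii_like
```

Any name printed this way should be usable as an argument to `-f` or `-t`, provided Ruby has a converter between the two encodings chosen.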
data/bin/cmv ADDED
@@ -0,0 +1,104 @@
1
+ #!/usr/bin/env ruby
2
+ require 'optparse'
3
+ require 'charset_move'
4
+
5
+ =begin
6
+
7
+ Nietzsche Public License v0.5
8
+
9
+ Copyright 2012 Chad Perrin
10
+
11
+ Copyright, like God, is dead. Let its corpse serve only to guard against its
12
+ resurrection. You may do anything with this work that copyright law would
13
+ normally restrict, so long as you retain the above notice(s), this license, and
14
+ the following misquote and disclaimer of warranty with all redistributed copies
15
+ and derived works. You may also replace this license with the Open Works
16
+ License, available at the http://owl.apotheon.org website.
17
+
18
+ Copyright is dead. Copyright remains dead, and we have killed it. How
19
+ shall we comfort ourselves, the murderers of all murderers? What was
20
+ holiest and mightiest of all that the world of censorship has yet owned has
21
+ bled to death under our knives: who will wipe this blood off us? What
22
+ water is there for us to clean ourselves? What festivals of atonement,
23
+ what sacred games shall we have to invent? Is not the greatness of this
24
+ deed too great for us? Must we ourselves not become authors simply to
25
+ appear worthy of it?
26
+ - apologies to Friedrich Wilhelm Nietzsche
27
+
28
+ This license implies no warranty.
29
+
30
+ =end
31
+
32
+ help_text = {
33
+ :from => 'Specify a source encoding.',
34
+ :to => 'Specify a destination encoding.',
35
+ :noclobber => 'Do not clobber existing files.',
36
+ :list => 'Display supported encodings.',
37
+ :help => 'Display this help text.',
38
+ :version => 'Display version and license information.'
39
+ }
40
+
41
+ @usage = <<EOF
42
+
43
+ USAGE: #{File.basename $0} [options] FILE [FILE [. . .]]
44
+
45
+ EOF
46
+
47
+ version_help = <<EOF
48
+
49
+ CharsetMove #{CharsetMove.version}, Copyright 2012 Chad Perrin
50
+ May be distributed under the terms of the Nietzsche Public License.
51
+
52
+ EOF
53
+
54
+
55
+ @options = {
56
+ :from => Encoding.default_external,
57
+ :to => Encoding.default_external,
58
+ :noclobber => false
59
+ }
60
+
61
+ OptionParser.new do |opts|
62
+ opts.banner = @usage
63
+
64
+ opts.on('--from=ENCODING', '-f', help_text[:from]) do |arg|
65
+ @options[:from] = arg
66
+ end
67
+
68
+ opts.on('--to=ENCODING', '-t', help_text[:to]) do |arg|
69
+ @options[:to] = arg
70
+ end
71
+
72
+ opts.on('--noclobber', '-n', help_text[:noclobber]) do
73
+ @options[:noclobber] = true
74
+ end
75
+
76
+ opts.on('--list', '-l', help_text[:list]) do
77
+ puts CharsetMove.sorted_encodings
78
+ exit(0)
79
+ end
80
+
81
+ opts.on('--help', '-h', help_text[:help]) do
82
+ puts opts
83
+ puts
84
+ exit(0)
85
+ end
86
+
87
+ opts.on_tail('--version', help_text[:version]) do
88
+ puts version_help
89
+ exit(0)
90
+ end
91
+ end.parse!
92
+
93
+
94
+
95
+ if 0 < ARGV.size
96
+ cmv = CharsetMove.new(@options)
97
+ ARGV.each do |old_filename|
98
+ puts cmv.transcode_filename(old_filename)
99
+ end
100
+ else
101
+ puts @usage
102
+ puts %Q{ Try "#{File.basename $0} --help" for usage information.}
103
+ puts
104
+ end
data/lib/charset_move.rb ADDED
@@ -0,0 +1,162 @@
1
+ require 'fileutils'
2
+
3
+ =begin rdoc
4
+
5
+ CharsetMove provides a Ruby class as a simple wrapper around Ruby's
6
+ Encoding::Converter functionality, designed for easy use by a command line
7
+ utility.
8
+
9
+ === API
10
+
11
+ require 'charset_move'
12
+
13
+ config = {
14
+ :from => 'UTF-8',
15
+ :to => 'ASCII-8BIT',
16
+ :noclobber => true
17
+ }
18
+
19
+ cmv = CharsetMove.new(config)
20
+
21
+ # filenames listed in ASCIIbetical order
22
+ ['file_öne', 'file_threë', 'file_twó', 'file_twö'].each do |fname|
23
+ puts cmv.transcode_filename(fname)
24
+ end
25
+
26
+ The above code produces this command line output:
27
+
28
+ file_öne -> file_?ne
29
+ file_threë -> file_thre?
30
+ file_twó -> file_tw?
31
+ file_twö -> file_tw?1
32
+
33
+ === Use and Sharing
34
+
35
+ Nietzsche Public License v0.5
36
+
37
+ Copyright 2012 Chad Perrin
38
+
39
+ Copyright, like God, is dead. Let its corpse serve only to guard against its
40
+ resurrection. You may do anything with this work that copyright law would
41
+ normally restrict, so long as you retain the above notice(s), this license, and
42
+ the following misquote and disclaimer of warranty with all redistributed copies
43
+ and derived works. You may also replace this license with the Open Works
44
+ License, available at the http://owl.apotheon.org website.
45
+
46
+ Copyright is dead. Copyright remains dead, and we have killed it. How
47
+ shall we comfort ourselves, the murderers of all murderers? What was
48
+ holiest and mightiest of all that the world of censorship has yet owned has
49
+ bled to death under our knives: who will wipe this blood off us? What
50
+ water is there for us to clean ourselves? What festivals of atonement,
51
+ what sacred games shall we have to invent? Is not the greatness of this
52
+ deed too great for us? Must we ourselves not become authors simply to
53
+ appear worthy of it?
54
+ - apologies to Friedrich Wilhelm Nietzsche
55
+
56
+ This license implies no warranty.
57
+
58
+ =end
59
+
60
+ class CharsetMove
61
+
62
+ =begin rdoc
63
+
64
+ This method returns the version number for the CharsetMove gem.
65
+
66
+ =end
67
+
68
+ def self.version; '0.2.1'; end
69
+
70
+ =begin rdoc
71
+
72
+ The +config+ argument is a hash containing a source encoding string, a
73
+ destination encoding string, and an optional +:noclobber+ boolean value. By
74
+ default, +:noclobber+ is treated as false.
75
+
76
+ =end
77
+
78
+ def initialize(config)
79
+ @config = config
80
+ @conv = Encoding::Converter.new(@config[:from], @config[:to])
81
+ end
82
+
83
+ @encodings = Encoding.list.collect {|e| e.to_s }
84
+
85
+ =begin rdoc
86
+
87
+ This method returns an array containing supported encoding strings.
88
+
89
+ =end
90
+
91
+ def self.encodings
92
+ @encodings
93
+ end
94
+
95
+ =begin rdoc
96
+
97
+ This method returns an alphabetically sorted array containing supported
98
+ encoding strings.
99
+
100
+ =end
101
+
102
+ def self.sorted_encodings
103
+ @encodings.sort {|a,b| a.upcase <=> b.upcase }
104
+ end
105
+
106
+ =begin rdoc
107
+
108
+ This method checks the current directory for filenames matching the +name+
109
+ argument string and adds a number to the end to differentiate the string from
110
+ the names of existing files.
111
+
112
+ =end
113
+
114
+ def unclobber_name(name)
115
+ n = 0
116
+
117
+ if File.exist? name
118
+ while File.exist? name
119
+ n += 1
120
+
121
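+ # Strip the previous numeric suffix (n - 1) before appending n.
122
+ # Counting the digits of n - 1 directly removes exactly that
123
+ # suffix at every magnitude, including the step from 100 to
124
+ # 101.
125
+ suffix_length = (n - 1).to_s.length
126
+ name = name[0...-suffix_length] if n > 1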
127
+
128
+ name += n.to_s
129
+ end
130
+ end
131
+
132
+ name
133
+ end
134
+
135
+ =begin rdoc
136
+
137
+ This method transcodes a string representing a filename from the source
138
+ encoding specified by +@config[:from]+ to the destination encoding specified
139
+ by +@config[:to]+, then renames the file accordingly. It optionally
140
+ differentiates destination filenames from existing filenames by sending the
141
+ +unclobber_name+ message.
142
+
143
+ =end
144
+
145
+ def transcode_filename(old_filename)
146
+ begin
147
+ new_filename = @conv.convert old_filename
148
+ rescue StandardError => e
149
+ if old_filename.match(/[^[:ascii:]]/)
150
+ new_filename = @conv.convert old_filename.gsub(/[^[:ascii:]]/, '?')
151
+ else
152
+ return e
153
+ end
154
+ end
155
+
156
+ new_filename = unclobber_name(new_filename) if @config[:noclobber]
157
+
158
+ FileUtils.mv old_filename, new_filename
159
+
160
+ "#{old_filename} -> #{new_filename}"
161
+ end
162
+ end
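`transcode_filename` falls back to replacing non-ASCII characters with `?` when conversion fails. A comparable effect can be sketched with the replacement options of Ruby's built-in `String#encode`. This is an alternative technique, not the gem's implementation, and `transcode_name` is a hypothetical helper name:

```ruby
# Hypothetical helper (not part of CharsetMove): transcode a name and
# substitute '?' for anything the target encoding cannot represent.
def transcode_name(name, from, to)
  name.encode(to, from, invalid: :replace, undef: :replace, replace: '?')
end

# "\u00F6" is the letter o with a diaeresis, which US-ASCII cannot represent.
converted = transcode_name("file_\u00F6ne", 'UTF-8', 'US-ASCII')
puts converted
```

Unlike the gem's fallback, this replaces only the characters that fail to convert, so the difference matters mainly for targets that can represent some non-ASCII text.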
data/test/charset_move_tests.rb ADDED
@@ -0,0 +1,34 @@
1
+ # coding: utf-8
2
+ require 'test/unit'
3
+ require '../lib/charset_move'
4
+
5
+ class CharsetMoveTests < Test::Unit::TestCase
6
+ def setup
7
+ @name = 'tfile_bëcëdäföñü'
8
+ @cmv1 = CharsetMove.new( {
9
+ :from => 'UTF-8',
10
+ :to => 'ASCII-8BIT',
11
+ :noclobber => true
12
+ } )
13
+
14
+ @cmv2 = CharsetMove.new( {
15
+ :from => 'ASCII-8BIT',
16
+ :to => 'UTF-8',
17
+ :noclobber => true
18
+ } )
19
+
20
+ `./manage clean`
21
+ `./manage reset`
22
+ end
23
+
24
+ def teardown
25
+ `./manage clean`
26
+ end
27
+
28
+ def test_unclobber_name
29
+ assert_equal(
30
+ 'tfile_bëcëdäföñü1',
31
+ @cmv2.unclobber_name(@cmv1.unclobber_name(@name))
32
+ )
33
+ end
34
+ end
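The test above depends on the `./manage` script to create and remove fixture files in the working directory. A self-contained sketch of the same kind of check, assuming a temporary directory from the standard `tmpdir` library; `numbered_name` is a hypothetical stand-in for the documented noclobber behavior, not the gem's method:

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of CharsetMove): append the lowest
# positive integer that yields a name not already present on disk.
def numbered_name(name)
  candidate = name
  n = 0
  while File.exist?(candidate)
    n += 1
    candidate = "#{name}#{n}"
  end
  candidate
end

# Work inside a throwaway directory so no cleanup script is needed.
results = Dir.mktmpdir do |dir|
  base = File.join(dir, 'tfile_a')

  first = numbered_name(base)   # nothing exists yet
  FileUtils.touch(base)
  second = numbered_name(base)  # base is now taken
  FileUtils.touch(second)
  third = numbered_name(base)   # base and base + "1" are taken

  [first, second, third].map { |path| File.basename(path) }
end

puts results
```

Because the directory is removed when the block exits, the check leaves no `tfile_*` files behind.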
data/test/manage ADDED
@@ -0,0 +1,26 @@
1
+ #!/usr/bin/env ruby
2
+ # coding: utf-8
3
+
4
+ case ARGV[0]
5
+ when 'clean'
6
+ system "ls; rm tfile_*; ls"
7
+ when 'reset'
8
+ system "ls; rm tfile_*; ls"
9
+
10
+ vowels = %w{ä ë ï ö ü}
11
+ vowels.each do |a|
12
+ vowels.each do |e|
13
+ vowels.each do |i|
14
+ vowels.each do |o|
15
+ vowels.each do |u|
16
+ vowels.each do |y|
17
+ system("echo 'foo' > tfile_b#{a}c#{e}d#{i}f#{o}ñ#{u}")
18
+ end
19
+ end
20
+ end
21
+ end
22
+ end
23
+ end
24
+
25
+ system "ls; ls|wc"
26
+ end
metadata ADDED
@@ -0,0 +1,57 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: charset_move
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.2.1
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Chad Perrin
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2012-07-12 00:00:00.000000000 Z
13
+ dependencies: []
14
+ description: ! " CharsetMove is a simple, Ruby based, copyfree licensed command
15
+ line utility\n for changing the encodings used for filenames.\n"
16
+ email: code@apotheon.net
17
+ executables:
18
+ - cmv
19
+ extensions: []
20
+ extra_rdoc_files: []
21
+ files:
22
+ - COPYING
23
+ - LICENSE
24
+ - README.md
25
+ - lib/charset_move.rb
26
+ - bin/cmv
27
+ - test/charset_move_tests.rb
28
+ - test/manage
29
+ homepage: http://cmv.fossrec.com
30
+ licenses:
31
+ - NPL
32
+ post_install_message: ! " Thank you for using CharsetMove. The \"cmv\" command
33
+ line utility can be\n used to change the encodings of filenames from one encoding
34
+ to another.\n Use the \"-h\" command line option to get more information about
35
+ how to use\n the command.\n"
36
+ rdoc_options: []
37
+ require_paths:
38
+ - lib
39
+ required_ruby_version: !ruby/object:Gem::Requirement
40
+ none: false
41
+ requirements:
42
+ - - ! '>='
43
+ - !ruby/object:Gem::Version
44
+ version: '0'
45
+ required_rubygems_version: !ruby/object:Gem::Requirement
46
+ none: false
47
+ requirements:
48
+ - - ! '>='
49
+ - !ruby/object:Gem::Version
50
+ version: '0'
51
+ requirements: []
52
+ rubyforge_project:
53
+ rubygems_version: 1.8.15
54
+ signing_key:
55
+ specification_version: 3
56
+ summary: CharsetMove - filename transcoder
57
+ test_files: []