charset_move 0.2.1
- data/COPYING +24 -0
- data/LICENSE +22 -0
- data/README.md +42 -0
- data/bin/cmv +104 -0
- data/lib/charset_move.rb +162 -0
- data/test/charset_move_tests.rb +34 -0
- data/test/manage +26 -0
- metadata +57 -0
data/COPYING
ADDED
@@ -0,0 +1,24 @@
|
|
1
|
+
# Copyright and Copying Information
|
2
|
+
|
3
|
+
CharsetMove is Copyright 2012 Chad Perrin. It may be distributed under the
|
4
|
+
terms of the Nietzsche Public License. See LICENSE file for details.
|
5
|
+
|
6
|
+
## Dependencies
|
7
|
+
|
8
|
+
CharsetMove depends only on a standard Ruby runtime and its standard library.
|
9
|
+
The reference implementation of Ruby is MRI/YARV, disjunctively dual-licensed
|
10
|
+
under the terms of the Ruby License and the Simplified BSD License. The same
|
11
|
+
licensing terms apply to the standard library.
|
12
|
+
|
13
|
+
## Licensing Philosophy
|
14
|
+
|
15
|
+
The intent in selecting the Nietzsche Public License (or NPL) is to keep
|
16
|
+
licensing simple, and to inject a modicum of philosophical humor into the
|
17
|
+
otherwise dry and serious matter of copyright licensing. The NPL is a
|
18
|
+
[copyfree](http://copyfree.org) license which, like many copyfree licenses, is
|
19
|
+
[usage optimized](http://univacc.net/?page=license_simplicity) -- that is,
|
20
|
+
designed to minimize friction imposed on the development and distribution
|
21
|
+
process by legal issues. The Simplified BSD License offered as one of two
|
22
|
+
licensing options for MRI/YARV is another simple copyfree license. The goal is
|
23
|
+
to ensure that you can use the software however you like with a minimum of
|
24
|
+
fuss, risk, and bureaucratic overhead.
|
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
|
|
1
|
+
Nietzsche Public License v0.5
|
2
|
+
|
3
|
+
Copyright 2012 Chad Perrin
|
4
|
+
|
5
|
+
Copyright, like God, is dead. Let its corpse serve only to guard against its
|
6
|
+
resurrection. You may do anything with this work that copyright law would
|
7
|
+
normally restrict, so long as you retain the above notice(s), this license, and
|
8
|
+
the following misquote and disclaimer of warranty with all redistributed copies
|
9
|
+
and derived works. You may also replace this license with the Open Works
|
10
|
+
License, available at the http://owl.apotheon.org website.
|
11
|
+
|
12
|
+
Copyright is dead. Copyright remains dead, and we have killed it. How
|
13
|
+
shall we comfort ourselves, the murderers of all murderers? What was
|
14
|
+
holiest and mightiest of all that the world of censorship has yet owned has
|
15
|
+
bled to death under our knives: who will wipe this blood off us? What
|
16
|
+
water is there for us to clean ourselves? What festivals of atonement,
|
17
|
+
what sacred games shall we have to invent? Is not the greatness of this
|
18
|
+
deed too great for us? Must we ourselves not become authors simply to
|
19
|
+
appear worthy of it?
|
20
|
+
- apologies to Friedrich Wilhelm Nietzsche
|
21
|
+
|
22
|
+
This license implies no warranty.
|
data/README.md
ADDED
@@ -0,0 +1,42 @@
|
|
1
|
+
# CharsetMove
|
2
|
+
|
3
|
+
A trivial series of coincidences led me to want a simple, Ruby-based,
|
4
|
+
copyfree-licensed tool for changing the encodings used for filenames.
|
5
|
+
|
6
|
+
## cmv
|
7
|
+
|
8
|
+
The executable command line utility `cmv` is simple enough to use. To change
|
9
|
+
the names of all files in the current directory from UTF-8 to 8-bit ASCII,
|
10
|
+
without any chance of a filename overwriting an existing file (including
|
11
|
+
possibly multiple files being written to the same filename after transcoding
|
12
|
+
filenames):
|
13
|
+
|
14
|
+
    cmv -nf UTF-8 -t ASCII-8BIT *
|
15
|
+
|
16
|
+
The `-n` option is "noclobber" (thus using numbers at the ends of filenames to
|
17
|
+
avoid files overwriting each other; default behavior simply overwrites existing
|
18
|
+
files, just like `mv` would on standard Unix systems), the `-f` is "from" (the
|
19
|
+
source encoding), and `-t` is "to" (the target encoding). The program tries to
|
20
|
+
set sensible defaults for from and to, but they will both probably end up with
|
21
|
+
the same encoding if you do not specify an encoding for at least one of the
|
22
|
+
two.
|
23
|
+
|
24
|
+
If you are not sure how to specify an encoding, a list of supported encodings
|
25
|
+
can be displayed with the `-l` option. It is likely to be long; you may want
|
26
|
+
to pipe the output to a pager such as `less`, or narrow down the displayed
|
27
|
+
output by piping it to a `grep` command.
|
28
|
+
|
29
|
+
Use the `--help` or `-h` option to learn more about how to use `cmv`.
|
30
|
+
|
31
|
+
## installing
|
32
|
+
|
33
|
+
For now, just copying the `cmv` file into your execution path somewhere should
|
34
|
+
suffice for "installation". At some point, it will probably be turned into a
|
35
|
+
really minimal gem so you can use a `gem install` command.
|
36
|
+
|
37
|
+
## credits
|
38
|
+
|
39
|
+
This program was written by Chad Perrin, Copyright 2012. It can be
|
40
|
+
redistributed under the terms of the Nietzsche Public License for now (see
|
41
|
+
LICENSE file). This may change at some point -- probably to the Open Works
|
42
|
+
License, which the NPL allows you to use in place of the NPL itself anyway.
|
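The effect of the `-f` and `-t` options described in the README can be approximated in a few lines of Ruby. This is only a sketch under stated assumptions, not the package's implementation (the library uses `Encoding::Converter`): the helper name `transcode_name` is hypothetical, and US-ASCII and ISO-8859-1 are chosen as targets purely for illustration.

```ruby
# Hypothetical helper, not part of charset_move: transcode a filename
# string, substituting '?' for anything the target encoding cannot hold.
def transcode_name(name, from, to)
  name.dup.force_encoding(from)
      .encode(to, invalid: :replace, undef: :replace, replace: '?')
end

ascii  = transcode_name("file_\u00F6ne", 'UTF-8', 'US-ASCII')
latin1 = transcode_name("file_\u00F6ne", 'UTF-8', 'ISO-8859-1')

puts ascii                 # the o-umlaut has no US-ASCII form, so it becomes '?'
puts latin1.bytes.length   # ISO-8859-1 stores the o-umlaut in a single byte
```

Using `String#encode` with replacement options avoids a rescue-and-retry step, at the cost of silently replacing characters; whether that trade-off suits a given rename job is a judgment call.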
data/bin/cmv
ADDED
@@ -0,0 +1,104 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
require 'optparse'
|
3
|
+
require 'charset_move'
|
4
|
+
|
5
|
+
=begin
|
6
|
+
|
7
|
+
Nietzsche Public License v0.5
|
8
|
+
|
9
|
+
Copyright 2012 Chad Perrin
|
10
|
+
|
11
|
+
Copyright, like God, is dead. Let its corpse serve only to guard against its
|
12
|
+
resurrection. You may do anything with this work that copyright law would
|
13
|
+
normally restrict, so long as you retain the above notice(s), this license, and
|
14
|
+
the following misquote and disclaimer of warranty with all redistributed copies
|
15
|
+
and derived works. You may also replace this license with the Open Works
|
16
|
+
License, available at the http://owl.apotheon.org website.
|
17
|
+
|
18
|
+
Copyright is dead. Copyright remains dead, and we have killed it. How
|
19
|
+
shall we comfort ourselves, the murderers of all murderers? What was
|
20
|
+
holiest and mightiest of all that the world of censorship has yet owned has
|
21
|
+
bled to death under our knives: who will wipe this blood off us? What
|
22
|
+
water is there for us to clean ourselves? What festivals of atonement,
|
23
|
+
what sacred games shall we have to invent? Is not the greatness of this
|
24
|
+
deed too great for us? Must we ourselves not become authors simply to
|
25
|
+
appear worthy of it?
|
26
|
+
- apologies to Friedrich Wilhelm Nietzsche
|
27
|
+
|
28
|
+
This license implies no warranty.
|
29
|
+
|
30
|
+
=end
|
31
|
+
|
32
|
+
help_text = {
|
33
|
+
:from => 'Specify a source encoding.',
|
34
|
+
:to => 'Specify a destination encoding.',
|
35
|
+
:noclobber => 'Do not clobber existing files.',
|
36
|
+
:list => 'Display supported encodings.',
|
37
|
+
:help => 'Display this help text.',
|
38
|
+
:version => 'Display version and license information.'
|
39
|
+
}
|
40
|
+
|
41
|
+
@usage = <<EOF
|
42
|
+
|
43
|
+
USAGE: #{File.basename $0} [options] FILE [FILE [. . .]]
|
44
|
+
|
45
|
+
EOF
|
46
|
+
|
47
|
+
version_help = <<EOF
|
48
|
+
|
49
|
+
CharsetMove #{CharsetMove.version}, Copyright 2012 Chad Perrin
|
50
|
+
May be distributed under the terms of the Nietzsche Public License.
|
51
|
+
|
52
|
+
EOF
|
53
|
+
|
54
|
+
|
55
|
+
@options = {
|
56
|
+
:from => Encoding.default_external,
|
57
|
+
:to => Encoding.default_external,
|
58
|
+
:noclobber => false
|
59
|
+
}
|
60
|
+
|
61
|
+
OptionParser.new do |opts|
|
62
|
+
opts.banner = @usage
|
63
|
+
|
64
|
+
opts.on('--from=ENCODING', '-f', help_text[:from]) do |arg|
|
65
|
+
@options[:from] = arg
|
66
|
+
end
|
67
|
+
|
68
|
+
opts.on('--to=ENCODING', '-t', help_text[:to]) do |arg|
|
69
|
+
@options[:to] = arg
|
70
|
+
end
|
71
|
+
|
72
|
+
opts.on('--noclobber', '-n', help_text[:noclobber]) do
|
73
|
+
@options[:noclobber] = true
|
74
|
+
end
|
75
|
+
|
76
|
+
opts.on('--list', '-l', help_text[:list]) do
|
77
|
+
puts CharsetMove.sorted_encodings
|
78
|
+
exit(0)
|
79
|
+
end
|
80
|
+
|
81
|
+
opts.on('--help', '-h', help_text[:help]) do
|
82
|
+
puts opts
|
83
|
+
puts
|
84
|
+
exit(0)
|
85
|
+
end
|
86
|
+
|
87
|
+
opts.on_tail('--version', help_text[:version]) do
|
88
|
+
puts version_help
|
89
|
+
exit(0)
|
90
|
+
end
|
91
|
+
end.parse!
|
92
|
+
|
93
|
+
|
94
|
+
|
95
|
+
if 0 < ARGV.size
|
96
|
+
cmv = CharsetMove.new(@options)
|
97
|
+
ARGV.each do |old_filename|
|
98
|
+
puts cmv.transcode_filename(old_filename)
|
99
|
+
end
|
100
|
+
else
|
101
|
+
puts @usage
|
102
|
+
puts %Q{ Try "#{File.basename $0} --help" for usage information.}
|
103
|
+
puts
|
104
|
+
end
|
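The option handling in `bin/cmv` can be exercised without touching any files by parsing an explicit argument array instead of `ARGV`. The sketch below is an illustration only: `parse_cmv_args` is a hypothetical wrapper that does not exist in the package, and it mirrors just the `-f`, `-t`, and `-n` switches.

```ruby
require 'optparse'

# Hypothetical wrapper (not in the package): parse an argument list the way
# bin/cmv does, but return the results instead of renaming anything.
def parse_cmv_args(argv)
  options = {
    from: Encoding.default_external.to_s,
    to: Encoding.default_external.to_s,
    noclobber: false
  }
  files = OptionParser.new do |opts|
    opts.on('-f', '--from=ENCODING') { |arg| options[:from] = arg }
    opts.on('-t', '--to=ENCODING')   { |arg| options[:to] = arg }
    opts.on('-n', '--noclobber')     { options[:noclobber] = true }
  end.parse(argv) # returns the non-option arguments, leaving argv untouched
  [options, files]
end

options, files = parse_cmv_args(%w[-n -f UTF-8 -t US-ASCII a.txt b.txt])
p options
p files
```

With no `-f` or `-t` given, both entries fall back to `Encoding.default_external`, which matches the README's warning that the source and target encodings end up identical unless at least one is specified.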
data/lib/charset_move.rb
ADDED
@@ -0,0 +1,162 @@
|
|
1
|
+
require 'fileutils'
|
2
|
+
|
3
|
+
=begin rdoc
|
4
|
+
|
5
|
+
CharsetMove provides a Ruby class as a simple wrapper around Ruby's
|
6
|
+
Encoding::Converter functionality, designed for easy use by a command line
|
7
|
+
utility.
|
8
|
+
|
9
|
+
=== API
|
10
|
+
|
11
|
+
require 'charset_move'
|
12
|
+
|
13
|
+
config = {
|
14
|
+
:from => 'UTF-8',
|
15
|
+
:to => 'ASCII-8BIT',
|
16
|
+
:noclobber => true
|
17
|
+
}
|
18
|
+
|
19
|
+
cmv = CharsetMove.new(config)
|
20
|
+
|
21
|
+
# filenames listed in ASCIIbetical order
|
22
|
+
['file_öne', 'file_threë', 'file_twó', 'file_twö'].each do |fname|
|
23
|
+
puts cmv.transcode_filename(fname)
|
24
|
+
end
|
25
|
+
|
26
|
+
The above code produces this command line output:
|
27
|
+
|
28
|
+
file_öne -> file_?ne
|
29
|
+
file_threë -> file_thre?
|
30
|
+
file_twó -> file_tw?
|
31
|
+
file_twö -> file_tw?1
|
32
|
+
|
33
|
+
=== Use and Sharing
|
34
|
+
|
35
|
+
Nietzsche Public License v0.5
|
36
|
+
|
37
|
+
Copyright 2012 Chad Perrin
|
38
|
+
|
39
|
+
Copyright, like God, is dead. Let its corpse serve only to guard against its
|
40
|
+
resurrection. You may do anything with this work that copyright law would
|
41
|
+
normally restrict, so long as you retain the above notice(s), this license, and
|
42
|
+
the following misquote and disclaimer of warranty with all redistributed copies
|
43
|
+
and derived works. You may also replace this license with the Open Works
|
44
|
+
License, available at the http://owl.apotheon.org website.
|
45
|
+
|
46
|
+
Copyright is dead. Copyright remains dead, and we have killed it. How
|
47
|
+
shall we comfort ourselves, the murderers of all murderers? What was
|
48
|
+
holiest and mightiest of all that the world of censorship has yet owned has
|
49
|
+
bled to death under our knives: who will wipe this blood off us? What
|
50
|
+
water is there for us to clean ourselves? What festivals of atonement,
|
51
|
+
what sacred games shall we have to invent? Is not the greatness of this
|
52
|
+
deed too great for us? Must we ourselves not become authors simply to
|
53
|
+
appear worthy of it?
|
54
|
+
- apologies to Friedrich Wilhelm Nietzsche
|
55
|
+
|
56
|
+
This license implies no warranty.
|
57
|
+
|
58
|
+
=end
|
59
|
+
|
60
|
+
class CharsetMove
|
61
|
+
|
62
|
+
=begin rdoc
|
63
|
+
|
64
|
+
This method returns the version number for the CharsetMove gem.
|
65
|
+
|
66
|
+
=end
|
67
|
+
|
68
|
+
def self.version; '0.2.1'; end
|
69
|
+
|
70
|
+
=begin rdoc
|
71
|
+
|
72
|
+
The +config+ argument is a hash containing a source encoding string, a
|
73
|
+
destination encoding string, and an optional +:noclobber+ boolean value. By
|
74
|
+
default, +:noclobber+ is treated as false.
|
75
|
+
|
76
|
+
=end
|
77
|
+
|
78
|
+
def initialize(config)
|
79
|
+
@config = config
|
80
|
+
@conv = Encoding::Converter.new(@config[:from], @config[:to])
|
81
|
+
end
|
82
|
+
|
83
|
+
@encodings = Encoding.list.collect {|e| e.to_s }
|
84
|
+
|
85
|
+
=begin rdoc
|
86
|
+
|
87
|
+
This method returns an array containing supported encoding strings.
|
88
|
+
|
89
|
+
=end
|
90
|
+
|
91
|
+
def self.encodings
|
92
|
+
@encodings
|
93
|
+
end
|
94
|
+
|
95
|
+
=begin rdoc
|
96
|
+
|
97
|
+
This method returns an alphabetically sorted array containing supported
|
98
|
+
encoding strings.
|
99
|
+
|
100
|
+
=end
|
101
|
+
|
102
|
+
def self.sorted_encodings
|
103
|
+
@encodings.sort {|a,b| a.upcase <=> b.upcase }
|
104
|
+
end
|
105
|
+
|
106
|
+
=begin rdoc
|
107
|
+
|
108
|
+
This method checks the current directory for filenames matching the +name+
|
109
|
+
argument string and adds a number to the end to differentiate the string from
|
110
|
+
the names of existing files.
|
111
|
+
|
112
|
+
=end
|
113
|
+
|
114
|
+
def unclobber_name(name)
|
115
|
+
n = 0
|
116
|
+
|
117
|
+
if File.exist? name
|
118
|
+
while File.exist? name
|
119
|
+
n += 1
|
120
|
+
|
121
|
+
num = n
|
122
|
+
name.chop! if num > 1
|
123
|
+
while (num / 10) > 0
|
124
|
+
name.chop! if num > 10
|
125
|
+
num = num / 10
|
126
|
+
end
|
127
|
+
|
128
|
+
name += n.to_s
|
129
|
+
end
|
130
|
+
end
|
131
|
+
|
132
|
+
name
|
133
|
+
end
|
134
|
+
|
135
|
+
=begin rdoc
|
136
|
+
|
137
|
+
This method transcodes a string representing a filename and renames a file from
|
138
|
+
the source encoding specified by +@config[:from]+ to the destination encoding
|
139
|
+
specified by +@config[:to]+. It optionally differentiates
|
140
|
+
destination filenames from existing filenames by sending the +unclobber_name+
|
141
|
+
message.
|
142
|
+
|
143
|
+
=end
|
144
|
+
|
145
|
+
def transcode_filename(old_filename)
|
146
|
+
begin
|
147
|
+
new_filename = @conv.convert old_filename
|
148
|
+
rescue StandardError => e
|
149
|
+
if old_filename.match(/[^[:ascii:]]/)
|
150
|
+
new_filename = @conv.convert old_filename.gsub(/[^[:ascii:]]/, '?')
|
151
|
+
else
|
152
|
+
return e
|
153
|
+
end
|
154
|
+
end
|
155
|
+
|
156
|
+
new_filename = unclobber_name(new_filename) if @config[:noclobber]
|
157
|
+
|
158
|
+
FileUtils.mv old_filename, new_filename
|
159
|
+
|
160
|
+
"#{old_filename} -> #{new_filename}"
|
161
|
+
end
|
162
|
+
end
|
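The "noclobber" naming rule documented for `unclobber_name` (append a number until the name no longer collides with an existing file) can also be written without mutating the candidate string. The following is a minimal sketch of that idea, not the method shipped in the gem; `free_name` is a hypothetical name, and the temporary directory exists only to make the example self-contained.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical alternative to unclobber_name: leaves +name+ untouched and
# appends the first counter (1, 2, 3, ...) that yields an unused path.
def free_name(name)
  return name unless File.exist?(name)

  n = 1
  n += 1 while File.exist?("#{name}#{n}")
  "#{name}#{n}"
end

Dir.mktmpdir do |dir|
  base = File.join(dir, 'file_two')
  puts File.basename(free_name(base))  # nothing exists yet, name is unchanged
  FileUtils.touch(base)
  puts File.basename(free_name(base))  # base is taken, so a counter is added
  FileUtils.touch("#{base}1")
  puts File.basename(free_name(base))  # counter advances past the taken name
end
```

Building each candidate from the original name sidesteps the bookkeeping needed to strip a previously appended counter before adding the next one.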
data/test/charset_move_tests.rb
ADDED
@@ -0,0 +1,34 @@
|
|
1
|
+
# coding: utf-8
|
2
|
+
require 'test/unit'
|
3
|
+
require_relative '../lib/charset_move'
|
4
|
+
|
5
|
+
class CharsetMoveTests < Test::Unit::TestCase
|
6
|
+
def setup
|
7
|
+
@name = 'tfile_bëcëdäföñü'
|
8
|
+
@cmv1 = CharsetMove.new( {
|
9
|
+
:from => 'UTF-8',
|
10
|
+
:to => 'ASCII-8BIT',
|
11
|
+
:noclobber => true
|
12
|
+
} )
|
13
|
+
|
14
|
+
@cmv2 = CharsetMove.new( {
|
15
|
+
:from => 'ASCII-8BIT',
|
16
|
+
:to => 'UTF-8',
|
17
|
+
:noclobber => true
|
18
|
+
} )
|
19
|
+
|
20
|
+
`./manage clean`
|
21
|
+
`./manage reset`
|
22
|
+
end
|
23
|
+
|
24
|
+
def teardown
|
25
|
+
`./manage clean`
|
26
|
+
end
|
27
|
+
|
28
|
+
def test_unclobber_name
|
29
|
+
assert_equal(
|
30
|
+
@cmv2.unclobber_name(@cmv1.unclobber_name(@name)),
|
31
|
+
'tfile_bëcëdäföñü1'
|
32
|
+
)
|
33
|
+
end
|
34
|
+
end
|
data/test/manage
ADDED
@@ -0,0 +1,26 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
# coding: utf-8
|
3
|
+
|
4
|
+
case ARGV[0]
|
5
|
+
when 'clean'
|
6
|
+
system "ls; rm tfile_*; ls"
|
7
|
+
when 'reset'
|
8
|
+
system "ls; rm tfile_*; ls"
|
9
|
+
|
10
|
+
vowels = %w{ä ë ï ö ü}
|
11
|
+
vowels.each do |a|
|
12
|
+
vowels.each do |e|
|
13
|
+
vowels.each do |i|
|
14
|
+
vowels.each do |o|
|
15
|
+
vowels.each do |u|
|
16
|
+
vowels.each do |y|
|
17
|
+
system("echo 'foo' > tfile_b#{a}c#{e}d#{i}f#{o}ñ#{u}")
|
18
|
+
end
|
19
|
+
end
|
20
|
+
end
|
21
|
+
end
|
22
|
+
end
|
23
|
+
end
|
24
|
+
|
25
|
+
system "ls; ls|wc"
|
26
|
+
end
|
metadata
ADDED
@@ -0,0 +1,57 @@
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
2
|
+
name: charset_move
|
3
|
+
version: !ruby/object:Gem::Version
|
4
|
+
version: 0.2.1
|
5
|
+
prerelease:
|
6
|
+
platform: ruby
|
7
|
+
authors:
|
8
|
+
- Chad Perrin
|
9
|
+
autorequire:
|
10
|
+
bindir: bin
|
11
|
+
cert_chain: []
|
12
|
+
date: 2012-07-12 00:00:00.000000000 Z
|
13
|
+
dependencies: []
|
14
|
+
description: ! " CharsetMove is a simple, Ruby based, copyfree licensed command
|
15
|
+
line utility\n for changing the encodings used for filenames.\n"
|
16
|
+
email: code@apotheon.net
|
17
|
+
executables:
|
18
|
+
- cmv
|
19
|
+
extensions: []
|
20
|
+
extra_rdoc_files: []
|
21
|
+
files:
|
22
|
+
- COPYING
|
23
|
+
- LICENSE
|
24
|
+
- README.md
|
25
|
+
- lib/charset_move.rb
|
26
|
+
- bin/cmv
|
27
|
+
- test/charset_move_tests.rb
|
28
|
+
- test/manage
|
29
|
+
homepage: http://cmv.fossrec.com
|
30
|
+
licenses:
|
31
|
+
- NPL
|
32
|
+
post_install_message: ! " Thank you for using CharsetMove. The \"cmv\" command
|
33
|
+
line utility can be\n used to change the encodings of filenames from one encoding
|
34
|
+
to another.\n Use the \"-h\" command line option to get more information about
|
35
|
+
how to use\n the command.\n"
|
36
|
+
rdoc_options: []
|
37
|
+
require_paths:
|
38
|
+
- lib
|
39
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
40
|
+
none: false
|
41
|
+
requirements:
|
42
|
+
- - ! '>='
|
43
|
+
- !ruby/object:Gem::Version
|
44
|
+
version: '0'
|
45
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
46
|
+
none: false
|
47
|
+
requirements:
|
48
|
+
- - ! '>='
|
49
|
+
- !ruby/object:Gem::Version
|
50
|
+
version: '0'
|
51
|
+
requirements: []
|
52
|
+
rubyforge_project:
|
53
|
+
rubygems_version: 1.8.15
|
54
|
+
signing_key:
|
55
|
+
specification_version: 3
|
56
|
+
summary: CharsetMove - filename transcoder
|
57
|
+
test_files: []
|