I've recently been moving some files with non-ASCII characters in their
names between computers. The usual culprits are my Björk MP3s, but it's
guaranteed that every time I do this, some filenames end up mangled
along the way.
Which leads me to the question: is it possible to give a Linux system
a unified "filename encoding" such that tools like rsync, etc. will
DTRT when copying across systems? The current copy, which isn't working
right, started life like this:
- Björk CDs with filenames encoded in UTF-8 in a .localized directory
on an OS X laptop.
- Used the rsync distributed with OS X to copy to a Linux 2.6 machine's
ext2 partition.
- Used a Win32 build of rsync to copy from the Linux 2.6 ext2 partition to
an NTFS partition.
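To make the possible mismatches in the chain above concrete, here is a
minimal Python sketch (it is not part of any tool mentioned here) that
prints the bytes of the same name under a few encodings. The choice of
encodings is my assumption about what each hop might use: as far as I
know OS X hands back decomposed (NFD) UTF-8, Linux tools usually write
precomposed (NFC) UTF-8 or a legacy 8-bit charset, and NTFS stores
UTF-16.

```python
import unicodedata

# "Björk" written with an escape so the source file's own encoding
# cannot affect the result.
name = "Bj\u00f6rk"

nfc = unicodedata.normalize("NFC", name)  # precomposed: one code point for ö
nfd = unicodedata.normalize("NFD", name)  # decomposed: o + combining diaeresis

# Illustrative candidates only; which one each system really uses is an
# assumption, not something established in this thread.
forms = {
    "UTF-8 (NFC)": nfc.encode("utf-8"),
    "UTF-8 (NFD)": nfd.encode("utf-8"),
    "ISO-8859-1": nfc.encode("iso-8859-1"),
    "UTF-16-LE": nfc.encode("utf-16-le"),
}

for label, raw in forms.items():
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    print(f"{label:12} {hex_bytes}")
```

All four byte strings display as the same name when decoded correctly,
but none of them compare equal, so a tool that copies bytes verbatim
between systems with different conventions can end up with a name the
destination shows as garbage.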
I think the filenames got messed up when copying to NTFS, even though
NTFS is a Unicode-aware filesystem (it stores filenames natively as
UTF-16, AFAIK). Even if this is a problem with the Win32 rsync build I
used, I still don't know where to find out how rsync should be
handling filenames returned by the Linux machine.
The man pages, POSIX, etc., are silent on the issue. In fact there seems
to be little in the way of authoritative docs concerning this. Is this
perhaps somehow dependent on LC_*/LANG settings?
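My understanding is that the Linux kernel treats a filename on ext2 as an
opaque string of bytes, and that it is the userland tools that decide
how to interpret those bytes, typically from the locale. The following
Python sketch illustrates that idea only; the helper name show_as is
hypothetical, and the assumption is that the name was written to disk as
UTF-8.

```python
# Bytes as they would sit on disk, assuming the name was written as UTF-8.
raw = "Bj\u00f6rk".encode("utf-8")


def show_as(raw_name: bytes, encoding: str) -> str:
    """Decode raw filename bytes the way a tool assuming `encoding` would.

    Hypothetical helper for illustration; undecodable bytes are replaced
    rather than raising.
    """
    return raw_name.decode(encoding, errors="replace")


print(show_as(raw, "utf-8"))       # the name as intended
print(show_as(raw, "iso-8859-1"))  # two junk characters where the ö was
```

The bytes on disk never change here. Only the interpretation does,
which would explain why the same directory can look fine in one session
and mangled in another.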
I just want my copies to not mangle my filenames. I guess my question
is really: what encoding should I be using on ext2/"native UNIX
format" partitions on Linux such that most tools will Do The Right
Thing when it comes to filename encoding?
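One way to at least find out what is on the partition now is to scan it
for names that are not valid UTF-8. This is a rough Python sketch, and
it assumes UTF-8 is the convention I would want to settle on, which is
exactly the part I am unsure about. The function names is_utf8 and
find_non_utf8 are made up for this example.

```python
import os


def is_utf8(raw_name: bytes) -> bool:
    """Return True if the raw filename bytes decode cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8(root: bytes):
    """Yield paths (as bytes) whose final component is not valid UTF-8.

    Passing a bytes root makes os.walk return raw bytes names, so nothing
    is decoded through the current locale before we look at it.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for entry in dirnames + filenames:
            if not is_utf8(entry):
                yield os.path.join(dirpath, entry)


# Example use (the path is a placeholder):
#   for path in find_non_utf8(b"/path/to/music"):
#       print(repr(path))
```

A name that passes this check may still be in decomposed rather than
precomposed form, so a clean scan does not prove that two systems will
agree on the name.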
Excuse the roundabout message, I'm just not sure which question I
should be asking. :)