| Date: Fri, 16 Feb 2007 16:31:43 +0000
| From: "Aine Douglas" <aine.douglas at gmail.com>
|
| Can anyone recommend a commandline text editor that is capable of
| editing unicode text files?
|
| I've got some webpages to edit which contain chinese script, and when
| I open them in vi i get long strings of @@@@@^^^???@@ etc, and its a
| pain downloading them for really small edits.
I don't quite grok what it is you want to do.
First, “Unicode” is ambiguous to the point of meaninglessness;
what matters is the encoding, not what is encoded.
( Briefly: Every character is in the UCS (Universal
Character Set, ISO-10646, also called “Unicode”†).
A character's binary representation is an encoding.
US-ASCII, e.g., is the first 128 characters of the UCS;
ISO-8859-1 is the first 256; ISO-8859-15 is a slightly
different set of 256; UTF-8 is all two billion; and
there are many other encodings. )
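To make the character/encoding distinction concrete, here is a minimal Python sketch (not part of the original question; the helper name `encoded_forms` and the sample characters are purely illustrative). It encodes the same character several ways and reports which encodings cannot represent it at all:

```python
# One character, several encodings: the code point stays the same,
# the bytes on disk do not. Escapes are used so this file is pure ASCII.
SAMPLES = {
    "LATIN CAPITAL A": "A",        # U+0041, representable in US-ASCII
    "EURO SIGN": "\u20ac",         # U+20AC, in ISO-8859-15 but not ISO-8859-1
    "CJK 'middle'": "\u4e2d",      # U+4E2D, needs a CJK or UCS-wide encoding
}

ENCODINGS = ("ascii", "iso-8859-1", "iso-8859-15", "utf-8")


def encoded_forms(char, encodings=ENCODINGS):
    """Return {encoding: bytes, or None if the encoding cannot represent char}."""
    forms = {}
    for enc in encodings:
        try:
            forms[enc] = char.encode(enc)
        except UnicodeEncodeError:
            forms[enc] = None
    return forms


for name, char in SAMPLES.items():
    print(f"{name} (U+{ord(char):04X})")
    for enc, raw in encoded_forms(char).items():
        shown = raw.hex(" ") if raw is not None else "not representable"
        print(f"  {enc:12} {shown}")
```

Only UTF-8 produces bytes for all three samples, which is why the encoding, rather than the word “Unicode”, is the thing to pin down.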
Second, how will the editor be used without downloading
the files in question?
And third, by “command line” do you mean something like
sed(1), or just an editor you can launch from the shell
(like the vi(1) mentioned)?
Editors that can handle the full UCS/Unicode in a variety
of encodings include vim(1), mined, and yudit. Some other
editors, such as joe(1), handle UTF-8 but not necessarily
an arbitrary encoding.
I've only used `vim' in anger (in several senses! ;-) ):
`vim', at least, will autodetect the file's encoding and
map it to yer locale's, and hence you can use `vim' to
edit a SJIS file on a UTF-8 system. The file is saved
in its original encoding. Almost needless to say, this
mapping works best if the system/locale uses UTF-8 (on
Linux), since UTF-8 round-trips the full UCS. Result is,
provided you are displaying UTF-8 correctly (mostly a
matter of fonts), `vim' works quite well (albeit keying
in non-keyboard characters can be a pain: I tend to use
gucharmap(1) and copy-and-paste).
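For the curious, the decode/edit/re-encode round trip described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how `vim' is implemented; the function names and the candidate list are assumptions made up for the example (the candidate list loosely mirrors the way `vim' tries the entries of its 'fileencodings' option in order):

```python
# Illustrative sketch: guess a file's encoding, edit the text as Unicode,
# then write it back in the encoding it arrived in.

# Hypothetical candidate order; strict encodings first, since ISO-8859-1
# accepts any byte sequence and would otherwise always "win".
CANDIDATES = ("utf-8", "shift_jis", "iso-8859-1")


def guess_encoding(raw, candidates=CANDIDATES):
    """Return the first candidate that decodes raw without error."""
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding fits")


def edit_keeping_encoding(raw, edit):
    """Decode raw, apply edit (a str -> str function), re-encode as before."""
    enc = guess_encoding(raw)
    text = raw.decode(enc)           # bytes on disk -> Unicode text
    return edit(text).encode(enc)    # Unicode text -> original encoding


# A three-character Japanese string stored as SJIS, as in the example above.
sjis_bytes = "\u65e5\u672c\u8a9e".encode("shift_jis")
result = edit_keeping_encoding(sjis_bytes, lambda t: t + "!")

print(guess_encoding(sjis_bytes))    # which candidate was chosen
print(result.hex(" "))               # still SJIS bytes, with "!" appended
```

Guessing by trial decoding is a heuristic: a short or unlucky byte sequence can decode cleanly under the wrong candidate, so when the encoding is known it is safer to state it explicitly.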
cheers!
-blf-
† Pedantically, “Unicode” means three different things,
and is not a synonym for the UCS.
--
Experienced (>25 yrs) kernel/software Eng: | Brian Foster Montpellier,
• Unix, embedded, &tc; • Linux; • doc; | blf at utvinternet.ie FRANCE
• IDL, automated testing, process, &tc. | Stop E$$o (ExxonMobile)!
Résumé (CV) http://www.blf.utvinternet.ie | http://www.stopesso.com