| Date: Sat, 17 Feb 2007 15:22:40 +0000
| From: Kae Verens <kae at verens.com>
|
| Aine Douglas wrote:
| >[ Brian Foster wrote ]:
| >> Editors that can handle the full UCS/Unicode in a variety
| >> of encodings include vim(1), mined, and yudit. Some other
| >> editors, such as joe(1), handle UTF-8 but not necessarily
| >> an arbitrary encoding.
| >
| > On my shell account, I have vim and joe, both render garbage.
| > Will get mined and yudit later and test.
|
| vim is usually quite good about that - as long as the console can
| display the characters, it should work okay. I just opened up the
| Russian language file for KFM in vi and vim, and both worked fine.
| This was in Konsole; KDE's terminal emulator. I have had trouble
| with charsets in xterm and many other terms, so make sure that's
| not a problem first.
The first trick to using any X terminal is to ensure
the font is adequate; broadly, this (seems to) mean
an ISO-10646 font.
And then, for xterm(1) specifically, ensure it is in
UTF-8 mode.
In addition to KDE konsole and xterm (both work fine
for me), there is also mlterm(8), and I believe recent
versions of rxvt(1) are also UTF-8 capable.
I concur with Kae's point here: Until you can simply
cat(1) the file and see what you _should_ see, things
are not set up correctly.
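
One way to run that check, as a minimal sketch only: the
function name utf8_sample is made up for illustration, and
it assumes a POSIX shell with printf(1) available.

```shell
# Hypothetical helper: emit a short, known UTF-8 string ("cafe" with e-acute).
# \303\251 is the two-byte UTF-8 encoding of U+00E9.
utf8_sample() {
    printf 'caf\303\251\n'
}
```

`locale charmap' should report UTF-8, and calling
utf8_sample should display a four-letter word ending in
an accented e. If two junk characters show up instead of
the accented letter, the terminal or its font is not yet
set up for UTF-8, and an editor is unlikely to do better.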
| Also, UTF-8 files, which I presume you're talking about, usually
| start with a single marker character to distinguish them from
| otherwise-plain-text files. If that marker character is missing,
| vim may not be figuring out the charset correctly.
NO (and yes): Micro$oft UTF-8 files do tend to start
with a BOMb, but no-one else's do. The BOMb is
never needed, not even on Windross (for UTF-8).
In any case, if vim(1) is confused, simply set the
fileencoding (`:help fileencoding' for details).
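
To see whether a given file actually carries that marker,
something like the following should do. It is a sketch, not
a definitive tool: has_utf8_bom is an invented name, and it
only looks for the UTF-8 form of the BOM (bytes EF BB BF).

```shell
# Hypothetical helper: succeed if FILE starts with EF BB BF, the UTF-8 BOM.
has_utf8_bom() {
    # od -N3 dumps at most the first three bytes as hex; tr strips the padding.
    [ "$(od -An -tx1 -N3 "$1" | tr -d ' \n')" = "efbbbf" ]
}
```

For example (page.html being a stand-in file name):
    has_utf8_bom page.html && echo "BOM present"
Inside vim, `:set fileencoding?' shows what vim decided,
and `:e ++enc=utf-8' rereads the file with the encoding
forced.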
Having said that, I understand the (HTML) files in
question were written by an M$ thingie on Windross
as “Unicode” — which very probably means they are
encoded as UTF-16LE.
I was, just now, able to edit a UTF-16LE version of
this reply using:
    vim --cmd 'set fileencodings=utf-16le' ...
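
An alternative, if the editor will not cooperate, is to
convert the file to UTF-8 first and edit the copy. A
minimal sketch, assuming iconv(1) is installed and the
file really is UTF-16LE; the function name is invented
for illustration:

```shell
# Hypothetical helper: convert a UTF-16LE file (arg 1) into a UTF-8 file (arg 2).
utf16le_to_utf8() {
    iconv -f UTF-16LE -t UTF-8 < "$1" > "$2"
}
```

With `-f UTF-16LE' a leading BOM in the input should be
carried through to the output as an ordinary character;
with GNU iconv, `-f UTF-16' instead consumes the BOM and
takes the byte order from it.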
cheers!
-blf-
--
Experienced (>25 yrs) kernel/software Eng: | Brian Foster Montpellier,
• Unix, embedded, &tc; • Linux; • doc; | blf at utvinternet.ie FRANCE
• IDL, automated testing, process, &tc. | Stop E$$o (ExxonMobile)!
Résumé (CV) http://www.blf.utvinternet.ie | http://www.stopesso.com