LINUX.IE, website of the Irish Linux Users' Group
[ILUG] Editing unicode text files.

Francis Daly francisdaly at gmail.com
Fri Feb 16 16:57:19 GMT 2007


On 16/02/07, Aine Douglas <aine.douglas at gmail.com> wrote:

> Can anyone recommend a commandline text editor that is capable of
> editing unicode text files?

vi? Or sed?

> I've got some webpages to edit which contain chinese script, and when
> I open them in vi i get long strings of @@@@@^^^???@@ etc, and its a
> pain downloading them for really small edits.

How the characters are displayed on-screen depends on the fonts you
have available and on your terminal and locale settings, and possibly
on which version of vi you have.
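
As a rough way to check the locale side of that, here is a small Python
sketch. It assumes the usual POSIX precedence (LC_ALL overrides LC_CTYPE,
which overrides LANG); the function names are made up for illustration,
and the codeset check only looks at the locale name, so treat it as a
hint rather than proof of what your terminal or vi will do.

```python
import os

def effective_ctype(env=None):
    """Return (variable, value) for the locale setting that governs
    character handling. Empty values count as unset."""
    env = os.environ if env is None else env
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    # Nothing set: the default "C" locale applies.
    return None, "C"

def looks_utf8(locale_value):
    """Rough check: does the locale name carry a utf-8 codeset suffix?"""
    # Locale names look like language_TERRITORY.codeset@modifier
    codeset = locale_value.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"

if __name__ == "__main__":
    name, value = effective_ctype()
    print(name, value, "utf-8" if looks_utf8(value) else "not utf-8")
```

If that reports something other than a utf-8 locale, that is one likely
reason for the odd display.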

There's a 15 kB utf-8 file available at

http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt

which you may be able to use for testing.

Using the vi I have here, it displays as a series of characters like

\xe2\x80\xbe\xe2

but the ascii parts are visible (and searchable, and editable), and
after I make changes the file still displays as expected (using cat or
less in a utf8-aware terminal).
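
The same kind of edit can be sketched in Python, working only on bytes.
This is an illustration, not what vi does internally; replace_ascii and
the sample text are invented for the example. It is safe for utf-8
because every byte of a multi-byte utf-8 sequence has its high bit set,
so an ascii pattern can never match in the middle of a character.

```python
def replace_ascii(data: bytes, old: str, new: str) -> bytes:
    """Replace one ascii string with another, working purely on bytes."""
    # .encode("ascii") raises UnicodeEncodeError if either string
    # contains anything outside ascii.
    old_b = old.encode("ascii")
    new_b = new.encode("ascii")
    return data.replace(old_b, new_b)

# Sample utf-8 content: two Chinese characters (U+4F60, U+597D)
# surrounded by ascii text.
sample = "title: \u4f60\u597d world\n".encode("utf-8")

edited = replace_ascii(sample, "world", "ILUG")

# The non-ascii bytes are untouched, so the result still decodes cleanly.
print(edited.decode("utf-8"))
```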

All you're doing is adjusting a byte stream, and so long as the file
is utf-8 and your editor only adds ascii characters, it should Just Work.

Unless you're trying to add non-ascii characters, in which case the
editor has to know to use the right encoding of unicode -- utf-8,
utf-16, or whatever the rest of the file is. That's something you'll
have to tell the editor by some external means, probably the locale.
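
To see why the encoding matters, here is a small Python sketch (the
append_text helper is made up for the example). The same character
becomes different bytes under utf-8 and utf-16, and appending the wrong
ones to a utf-8 file need not cause an error; it can just quietly put
different text there.

```python
def append_text(data: bytes, text: str, encoding: str) -> bytes:
    """Append text to a byte stream using the given encoding."""
    return data + text.encode(encoding)

char = "\u4e2d"  # a single CJK character, U+4E2D

print(char.encode("utf-8"))      # three bytes: b'\xe4\xb8\xad'
print(char.encode("utf-16-le"))  # two bytes:   b'-N'

existing = b"abc "  # stands in for the existing utf-8 file content

# Matching encoding: the file reads back with the intended character.
good = append_text(existing, char, "utf-8")

# Mismatched encoding: the bytes 0x2d 0x4e are valid ascii, so a utf-8
# reader shows "-N" instead of the character, with no error at all.
bad = append_text(existing, char, "utf-16-le")

print(good.decode("utf-8") == "abc \u4e2d")
print(bad.decode("utf-8"))
```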

Good luck,

	f


