LINUX.IE, website of the Irish Linux Users' Group
[ILUG] ñ ñ ç ç


Seán Mac Suibhne smacsuibhne1 at eircom.net
Sat Aug 20 17:20:14 IST 2005


Hola!


On Sat 20 Aug 2005 16:27, greg wm wrote:
> hi folks,
>
> feels rather like i've ventured into uncharted territory, but somebody
> out there somewhere must know the way..
>
> i used wget to copy the entire http://nonviolentpeaceforce.org site to
> http://nvpf.org/np.  the former is asp pages, the latter captured as html.
>
> for example, http://nonviolentpeaceforce.org/spanish/welcome.asp was
> captured to http://nvpf.org/np/spanish/welcome.asp.html
>
> as you can see, the capture is mostly fine, including spanish characters
> in the text (eg año), however the spanish characters in the menus didn't
> do quite so well (eg Misi?n)
>
> in the file año appears as a&ntilde;o which is apparently "good", but
> Misi?n appears as Misión, which is apparently "bad".
>
> first question:  why is that bad?
>
> if i tell galeon, instead of automatic encoding, use western iso-8859-1,
> or any of many others, presto, the page appears nicely.  but i don't
> have to do that to see the original, nor do i have to do that for
> anybody else's pages, and of course i can't expect our audience to go
> and fiddle with that in their browsers.
>
> but really now, why isn't an ó an ó?  right after the title the file
> says <meta http-equiv="Content-Type" content="text/html;
> charset=iso-8859-1">.  why isn't that good enough?  do i need to change
> some directive or setting in apache?

In Firefox the page is being displayed as UTF-8; when you set the encoding to
ISO-8859-1 the accents are displayed correctly.
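
That behaviour is what you would expect from a lone 0xF3 byte (ó in ISO-8859-1), which is not valid UTF-8 on its own. A minimal Python sketch, assuming the menu text was saved as raw ISO-8859-1 bytes and the body text as HTML entities; the sample byte strings are illustrative and not copied from the captured file:

```python
# Assumed sample data: a raw ISO-8859-1 byte for the menu text,
# and an ASCII-only HTML entity for the body text.
menu = b"Misi\xf3n"
body = b"a&ntilde;o"


def decode_both(raw: bytes) -> tuple:
    """Decode the same bytes as UTF-8 (lossily) and as ISO-8859-1."""
    as_utf8 = raw.decode("utf-8", errors="replace")  # invalid bytes become U+FFFD
    as_latin1 = raw.decode("iso-8859-1")             # every byte maps to a character
    return as_utf8, as_latin1


if __name__ == "__main__":
    for raw in (menu, body):
        print(decode_both(raw))
```

The entity form is pure ASCII, so it survives either decoding; the raw byte only survives when the browser uses ISO-8859-1.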

To solve the problem, either:
1	find out why the page is being displayed as UTF-8, or
2	change the accented characters to entity format (e.g. &ntilde;).
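
For option 1, one thing worth checking is whether the server sends its own charset in the HTTP Content-Type header, because browsers give that header precedence over the <meta> tag in the page. With Apache, the AddDefaultCharset directive is one possible source of such a header; whether that is what is happening on nvpf.org is an assumption, not something confirmed in this thread. The sketch below shows the precedence rule only. The function names are hypothetical, and the header value would have to be copied by hand from the real response:

```python
import re
from typing import Optional


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    if not content_type:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html: bytes) -> Optional[str]:
    """Return the charset declared in a <meta> tag near the top of the page."""
    head = html[:2048].decode("ascii", errors="replace")
    match = re.search(
        r'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', head, re.IGNORECASE
    )
    return match.group(1).lower() if match else None


def effective_charset(content_type: Optional[str], html: bytes) -> Optional[str]:
    """The HTTP header wins; the <meta> tag is only a fallback."""
    return charset_from_header(content_type) or charset_from_meta(html)
```

If the header says UTF-8, the iso-8859-1 <meta> line in the file is ignored, which would match what Firefox and Galeon are doing.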


>
> second question:  it looks like wget was inconsistent!  why?
>
> likely hint:  the menus were rendered out of some .asp database or
> whatever, differently than the rest of the text of the page.
>
> but so what?  why didn't wget capture something identical to what my
> browser shows?  the command i ran was
> wget -ENKkrl19 -nH -w2 -owget.log http://nonviolentpeaceforce.org
>
> so anyway i sez hey no problem, i'll just find and replace.  well ha.
> couldn't get either egrep nor sed to find an ó that was right under
> their noses.
>
> third question:  what's the trick to find and replace these buggers?
> vim can find them, in interactive mode, so.. should i be trying to
> figger out how to use vim as a grep replacement.. uhh.. ..?

I use KWrite. Highlight the ñ and away you go.
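
For a whole site, a scripted version of the same replacement may be easier. egrep and sed probably miss the character because, in a UTF-8 locale, a lone ISO-8859-1 byte is not a valid character to them; that is a guess about the setup, not something tested here. The sketch below sidesteps the issue by decoding the bytes as ISO-8859-1 and rewriting every non-ASCII character as an HTML entity. The function name is hypothetical, and it assumes the captured files really are ISO-8859-1:

```python
from html.entities import codepoint2name


def latin1_to_entities(raw: bytes) -> str:
    """Decode ISO-8859-1 bytes and replace non-ASCII characters with entities."""
    text = raw.decode("iso-8859-1")  # never fails: every byte is a character
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)                               # plain ASCII stays as-is
        elif code in codepoint2name:
            out.append("&" + codepoint2name[code] + ";")  # e.g. &oacute;
        else:
            out.append("&#%d;" % code)                   # numeric fallback
    return "".join(out)


if __name__ == "__main__":
    print(latin1_to_entities(b"Misi\xf3n, a\xf1o"))
```

To apply it to a file, read it with open(path, "rb"), pass the bytes through the function, and write the result back as ASCII. Keep a backup of the capture first.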

>
> fourth question:  where should i be asking these questions, or, where do
> i look for the mysterical solution, and will i recognize it when i see it?
>

When you find out, tell the list!

At least in Irish I only have to worry about 10 accented characters. It just
goes to show there is always someone worse off than yourself!

Slán

Seán
Lord and master of WWW.IONAD.ORG


