On 28 Jan 2009, at 11:19, Pádraig Brady wrote:
> Niall O Broin wrote:
>> Do you have a setting for default_charset in your php.ini? Your
>> HTTP headers say UTF-8 but you have no charset specified in meta tags.
>>
>> Mind you, the corruption you referenced is a bit odd, as it's not
>> happening with non ASCII characters but with random ASCII characters.
> It's not random.
It is. The characters being corrupted in the output are random, i.e. it
is not the more common case one comes across, where non-ASCII characters
are incorrectly handled. You obviously looked more deeply at the problem
than I did and came up with this:
> For every non ascii character, there is a corresponding �
> at some variable offset from it.
> I.E. whatever is processing the multibyte chars is messing up
> and putting dodgy characters further on in the buffer.
> Note 2 to 3 chars are consumed for each �, so it looks like these
> dodgy characters are being interpreted again as UTF8 and
> converted to the "replacement character" \uFFFD.
which starts to explain what is happening, but doesn't affect the
fact that the corrupted characters are random (because they are
AFTER the multi-byte character in the output stream, so they could be
A tad pedantic, perhaps, but this is ILUG :-)
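
To make the quoted observation concrete, here is a minimal Python sketch. It is not a reconstruction of whatever the site actually runs. It assumes that a stray UTF-8 lead byte has overwritten one ASCII byte further on in the buffer, and that the decoder trusts the sequence length announced by each lead byte. The names `expected_length` and `naive_decode` and the sample string are hypothetical, made up for illustration.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, the Unicode replacement character


def expected_length(lead: int) -> int:
    """Sequence length announced by a UTF-8 lead byte (1 for anything else)."""
    if lead >= 0xF0:
        return 4
    if lead >= 0xE0:
        return 3
    if lead >= 0xC0:
        return 2
    return 1


def naive_decode(data: bytes) -> str:
    """Hypothetical lenient decoder: it consumes as many bytes as the lead
    byte announces and emits one U+FFFD if that chunk is not valid UTF-8."""
    out = []
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        chunk = data[i:i + n]
        try:
            out.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            out.append(REPLACEMENT)
        i += n
    return "".join(out)


# Assumed corruption: one multi-byte character early in the text, and a
# stray 3-byte lead byte (0xE2) written over an ASCII byte later on.
good = "na\u00efve text here".encode("utf-8")
buf = bytearray(good)
buf[10] = 0xE2  # overwrites the final "t" of "text"

naive_result = naive_decode(bytes(buf))
builtin_result = bytes(buf).decode("utf-8", errors="replace")

print(repr(naive_result))
print(repr(builtin_result))
```

Under these assumptions the naive decoder swallows three bytes for a single U+FFFD, whereas Python's built-in decoder replaces only the stray byte and keeps the ASCII that follows it. Which ASCII characters get eaten depends purely on where the stray byte lands.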