LINUX.IE, website of the Irish Linux Users' Group

[ILUG] Problem with UTF-8 encoded data


Niall O Broin niall at linux.ie
Wed May 13 21:41:28 IST 2009


I have a MySQL DB with text encoded supposedly in UTF-8. This DB is  
used by a web application written in PHP. The php.ini file has this

default_charset = "utf-8"

and the text displays as it should in web browsers. All well and good
so far.

However, it has to be transferred to another system which expects the  
text to be in latin1. Changing the other system is not possible. The  
data is exported from MySQL with
SELECT INTO $FILE and is converted to latin1 with recode. Or rather,  
WAS converted. This worked last year, but this year, recode fails to  
do the conversion.

I have created a simple test field in the database via the web UI  
which contains just

abc Ä Ö Ü ä ö ü 123

and then exported that into a text file. Trying to convert it I get:

% recode UTF8..ISO_8859-15 < /tmp/umlaut
abc Ã" Ã- ý Ãrecode: Invalid input in step `UTF-8..ISO-8859-15'

% iconv -f utf-8 -t latin1 /tmp/umlaut
abc Ãiconv: illegal input sequence at position 6

hexdump -C of the file follows:

00000000  61 62 63 20 c3 83 e2 80  9e 20 c3 83 e2 80 93 20  |abc ..... ..... |
00000010  c3 83 c5 93 20 c3 83 c2  a4 20 c3 83 c2 b6 20 c3  |.... .... .... .|
00000020  83 c2 bc 20 31 32 33 0a                           |... 123.|


Can anyone suggest what kind of encoding is being used here, and how I
can convert the files to latin1 / ISO-8859-1?

I have used the MySQL SET NAMES command with latin1, utf8, and binary -
in every case the output file is identical.
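
One reading that fits the hexdump: the text looks like UTF-8 that was read as Windows-1252 and then encoded to UTF-8 a second time. For example, Ä is c3 84 in UTF-8; taken as Windows-1252 those two bytes are "Ã" and "„", which re-encode to c3 83 e2 80 9e, exactly what the dump shows. That would also explain the recode and iconv failures, since "„" has no latin1 equivalent. The Python sketch below assumes this reading is correct and simply reverses the extra step; the names raw and undo_double_encoding are made up for illustration.

```python
# Bytes copied from the hexdump of the exported test field.
raw = bytes.fromhex(
    "61626320c383e2809e20c383e2809320"
    "c383c59320c383c2a420c383c2b620c3"
    "83c2bc203132330a"
)


def undo_double_encoding(data: bytes) -> str:
    """Reverse one round of 'UTF-8 read as Windows-1252, re-encoded as UTF-8'.

    Assumption: the whole input was double-encoded in exactly this way.
    """
    mojibake = data.decode("utf-8")           # e.g. "Ã„" instead of "Ä"
    original_utf8 = mojibake.encode("cp1252")  # back to the original UTF-8 bytes
    return original_utf8.decode("utf-8")       # the intended text


text = undo_double_encoding(raw)
latin1 = text.encode("iso-8859-1")  # raises UnicodeEncodeError if a character has no latin1 form

print(text, end="")
```

Python's cp1252 codec leaves five byte values (0x81, 0x8d, 0x8f, 0x90, 0x9d) undefined, so text containing characters such as Á (c3 81 in UTF-8) may need extra handling beyond this sketch.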


This is an urgent and serious problem. There will be plentiful beer at  
the next POTD for a solution.



Niall




