MySQL UTF8 issue
I was asked to track down an issue between our PHP web form and our MySQL DB. The web form is in Japanese and encoded as UTF-8, and every database character set is utf8 with the collation set to utf8_unicode_ci. However, after we enter Japanese text in the fields, the data stored in MySQL shows up as junk characters.
The weird thing is that the data displays correctly when you read it back onto the web page.
For example, this is what the stored data looks like. Some values are in the correct encoding, but some are not.
日本語テスト 北海é“ã¯å¯’ã„ã。 北海é“ã¯å¯’ã„ã。 ç€ä»˜ã‹ãªã„ 思い出す。 弱点 弱点 弱点 使いません。
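A likely explanation for both symptoms is double encoding: the form sends UTF-8 bytes, but the connection is declared as latin1, so MySQL treats each byte as a separate latin1 character and converts it to UTF-8 a second time. Reading back through the same connection reverses the conversion, which is why the page looks fine. The sketch below only simulates that round trip in PHP. It assumes the mbstring extension is loaded, uses ISO-8859-1 as a simplified stand-in for MySQL's latin1, and the function names are made up for illustration.

```php
<?php
// Simulation only: no database involved. Assumes the mbstring extension.

// Treat each byte of a UTF-8 string as one ISO-8859-1 character and
// encode it as UTF-8 again (roughly what a latin1 connection does on insert).
function double_encode(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

// Reverse the extra conversion (roughly what happens on read through
// the same latin1 connection).
function undo_double_encode(string $garbled): string
{
    return mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
}

$original = '北海道';
$garbled  = double_encode($original);      // what ends up stored in the table
$restored = undo_double_encode($garbled);  // what the web page receives

var_dump($garbled === $original);   // the stored value differs from the input
var_dump($restored === $original);  // the round trip gives the input back
```

Any tool that reads the table with a proper UTF-8 connection sees the stored (garbled) value, while the web page, which reads through the same misdeclared connection, sees the restored one.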
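After checking the encoding of the database and the web page, and even running “SHOW VARIABLES LIKE 'char%';”, everything seemed to be OK. Note that the client and connection variables in that output describe the session that runs the query, so they say nothing about the character set of the PHP connection. The solution turned out to be simple: insert the following PHP code right after selecting the DB. If your PHP is 5.2.3 or above, the first line below works for you; otherwise you will have to use the alternative.
PHP 5.2.3 and above: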
mysql_set_charset('utf8');
Older PHP:
mysql_query("SET NAMES 'utf8'");
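The mysql_* functions above were removed in PHP 7.0, so on current PHP the same fix has to go through mysqli or PDO. The sketch below shows one way to do that. The helper name, host, database name and credentials are placeholders I made up, and the actual connection lines are commented out because they need a reachable server. PDO honours the charset part of the DSN from PHP 5.3.6 onwards.

```php
<?php
// Hypothetical helper: builds a PDO DSN that declares the connection charset.
function build_mysql_dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = build_mysql_dsn('localhost', 'example_db');
echo $dsn, PHP_EOL; // prints "mysql:host=localhost;dbname=example_db;charset=utf8"

// With a reachable server and real credentials (placeholders here):
// $pdo = new PDO($dsn, 'example_user', 'example_password');

// The mysqli counterpart of mysql_set_charset('utf8'):
// $mysqli = new mysqli('localhost', 'example_user', 'example_password', 'example_db');
// $mysqli->set_charset('utf8');
```

Either way, the point is the same as in the original fix: declare the connection character set before any data is sent.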
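That’s it. One thing I noticed, though, was that the CSV export still showed junk characters. CSV is plain text and can hold UTF-8 perfectly well, so the junk most likely comes from the program that opens the file: Excel, for example, tends to assume a legacy code page unless the file starts with a UTF-8 byte order mark. At the time I simply told my colleagues to export as XLS or XML instead.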
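If you would rather keep the CSV export, a commonly used workaround is to write a UTF-8 byte order mark (BOM) at the start of the file so that spreadsheet programs detect the encoding. This is a minimal sketch, not the export code we actually use; the helper name and sample rows are made up for illustration.

```php
<?php
// Hypothetical helper: returns CSV text prefixed with a UTF-8 byte order mark.
function rows_to_csv_with_bom(array $rows): string
{
    $handle = fopen('php://memory', 'w+');
    fwrite($handle, "\xEF\xBB\xBF"); // UTF-8 BOM
    foreach ($rows as $row) {
        // Delimiter, enclosure and escape are passed explicitly.
        fputcsv($handle, $row, ',', '"', '\\');
    }
    rewind($handle);
    $csv = stream_get_contents($handle);
    fclose($handle);
    return $csv;
}

$csv = rows_to_csv_with_bom([
    ['id', 'comment'],
    [1, '北海道は寒い。'],
]);

// To save it, for example: file_put_contents('export.csv', $csv);
```

This only helps if the data coming out of the database is already correct UTF-8, so the connection charset fix has to be in place first.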