[ic] Non-US keys = UTF-8 issue?
Stefan Hornburg (Racke)
racke at linuxia.de
Fri Feb 8 09:05:43 EST 2008
Grant wrote:
>>> I have a Swedish address stored in mysql and it is full of strange
>>> characters. Some stars, some paragraph symbols, etc. Should I be
>>> declaring UTF-8 somehow in dbconf/mysql/orders.mysql to avoid this?
>>> Is there any way to recover the correct data at this point?
>>>
>> Well, first you need to isolate whether or not it's correct in the
>> database, and not through Interchange.
>>
>> MySQL has two settings with regard to characters: character sets
>> and collation. Character sets define the encoding for the bytes, whereas
>> collation defines how to manipulate said data through sorting and other
>> means. See http://dev.mysql.com/doc/refman/5.0/en/charset-general.html
>> for more information.
>>
>> As far as Interchange goes, I'm not entirely sure what the state of UTF-8
>> support is, but from what I understand, it's less than ideal. It may not
>> be functional at all.
>>
>> If the data encoded in UTF-8 Unicode was saved in MySQL under a different
>> character set, it may not be recoverable whatsoever. Your first tactic
>> will be to attempt to change the character set for that field (or table)
>> alone, and hope MySQL has not changed it. Something like:
>>
>> ALTER TABLE foo CHARACTER SET utf8 COLLATE utf8_swedish_ci;
>>
>> should suffice. Note also that your client's encoding will need to be
>> utf8 (SET NAMES 'utf8';), as well as any terminal emulator you may be
>> using, and any software such as screen or ssh that will transmit the data.
>> I'm actually not sure that ssh needs specific instructions to encode
>> characters, but your terminal emulator (PuTTY, xterm, rxvt, etc.) will
>> definitely need proper configuration. Oh, and two more things: MySQL
>> will need to be compiled with support for utf8, and you will need to have
>> that locale available (assuming Linux).
>>
>> Gee, isn't Unicode fun?! (To be fair, it's the software support for it
>> that is lacking. Don't believe me? Try to edit a utf8 encoded file in
>> vim in screen on OpenBSD. Just *try* to.)
>
> Thanks a lot Jordan. I have confirmed that my browser does display
> UTF-8 characters properly, but the Swedish characters showed up
> incorrectly on the initial email order receipt sent to me. I checked
> and the receipt is built with [value] tags so the characters must be
> messed up as soon as they hit IC right? Because of this I don't think
> tinkering with mysql will fix the problem. Please correct me if I'm
> wrong.
The email order receipt is not sent with a UTF-8 charset, so it's quite
plausible that the Swedish characters get messed up there. Proper UTF-8
support is still under development.
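To illustrate the kind of damage involved, here is a minimal Python sketch
(not Interchange code; the sample address is made up). It assumes the text
was UTF-8 and got displayed or stored as Latin-1, which is only one possible
cause. In that case each Swedish letter turns into two characters, including
a currency sign and a paragraph symbol, which may be the "stars" and
"paragraph symbols" you describe.

```python
# Sketch: UTF-8 bytes misread as Latin-1, and when that can be undone.
# The sample address is hypothetical.
original = "Västerås, Göteborg"

# What the browser submits when the form is UTF-8.
utf8_bytes = original.encode("utf-8")

# A mail client or table that assumes Latin-1 decodes every byte separately,
# so each of ä, å, ö becomes two characters (for example ö shows up as Ã¶).
garbled = utf8_bytes.decode("latin-1")

# As long as no byte was dropped or replaced, the mistake is reversible:
# turn the characters back into the same bytes and decode them as UTF-8.
recovered = garbled.encode("latin-1").decode("utf-8")

# If a step replaced the unknown characters (here with "?"), the original
# bytes are gone and nothing can bring them back.
lossy = original.encode("ascii", errors="replace").decode("ascii")

print(garbled)
print(recovered == original)
print(lossy)
```

If the stored data looks like the first printed line, there is a good chance
it can be repaired; if it looks like the last one, the information is lost.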
Regards
Racke
--
LinuXia Systems => http://www.linuxia.de/
Expert Interchange Consulting and System Administration
ICDEVGROUP => http://www.icdevgroup.org/
Interchange Development Team