[interchange-bugs] [rt.icdevgroup.org #339] [import] mangles UTF-8 characters

Stefan Hornburg via RT interchange at rt.icdevgroup.org
Sat Apr 17 13:16:02 UTC 2010


Sat Apr 17 13:16:01 2010: Request 339 was acted upon.
Transaction: Ticket created by racke
       Queue: Interchange
     Subject: [import] mangles UTF-8 characters
       Owner: Nobody
  Requestors: racke at linuxia.de
      Status: new
 Ticket <URL: http://rt.icdevgroup.org/Ticket/Display.html?id=339 >


Hello,

We are in the process of migrating two projects from a patched
5.7.1 installation to 5.7.6.

These are the relevant settings in catalog.cfg:

Variable MV_HTTP_CHARSET UTF-8
DatabaseDefault PG_ENABLE_UTF8 1

In etc/log_transaction this code creates the orderline entries:

[import table=orderline type=LINE continue=NOTES]
...
[/import]

This fails if items carry UTF-8 characters in their name/description,
e.g.:

import into orderline failed: DBD::Pg::st execute failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xf1612028
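
To make the error message easier to read, here is a small standalone sketch
(plain Perl with the core Encode module, nothing Interchange-specific) that
inspects the four bytes PostgreSQL complains about. It only looks at the bytes
quoted above; the guess that they were meant to be Latin-1 text is an
assumption, not something the log confirms.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The four bytes quoted in the PostgreSQL error message.
my $bytes = pack('H*', 'f1612028');

# Strict UTF-8 decoding should fail: 0xF1 announces a four-byte sequence,
# but 0x61 is not a continuation byte. FB_CROAK may modify its input,
# so decode a copy.
my $copy  = $bytes;
my $valid = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;

# Assumption: read as ISO-8859-1 instead, the bytes are
# U+00F1 (n with tilde), "a", space and "(".
my $text       = decode('ISO-8859-1', $bytes);
my $codepoints = join ' ', map { sprintf 'U+%04X', ord($_) } split //, $text;

# The same characters encoded as UTF-8, which is what PostgreSQL expects.
my $utf8_hex = unpack('H*', encode('UTF-8', $text));

print "valid UTF-8:  ", ($valid ? 'yes' : 'no'), "\n";
print "as Latin-1:   $codepoints\n";
print "UTF-8 bytes:  $utf8_hex\n";
```

If that reading is right, the item text reaches the database as single-byte
characters rather than as UTF-8.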

Similar code using [query] instead of [import] works.

This was discussed on IRC yesterday, and we found that the following
two code parts probably need to be examined in order to fix this bug:

Vend::Data, import_text, 303ff (writing temporary file)

if($options->{file}) {
	$fn = $options->{file};
	Vend::File::allowed_file($fn)
		or die ::errmsg("No absolute file names like '%s' allowed.\n", $fn);
}
else {
	Vend::Util::writefile($fn, $text)
		or die ("Cannot write temporary import file $fn: $!\n");
}

Vend::Table::Common, 1637ff (reading temporary file)

sub new_filehandle {
	my $fh = shift;
	binmode($fh, ":utf8") if $::Variable->{MV_UTF8};
	return $fh;
}
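
One possible mechanism, sketched in plain Perl: if the temporary file is
written through a handle without an encoding layer, characters below U+0100
end up on disk as single bytes, and a reader that expects UTF-8 then sees
invalid data. Whether Vend::Util::writefile really behaves this way is an
assumption that still needs to be checked; bytes_written below is a
hypothetical helper and not part of Interchange.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of Interchange): write $text through the
# given PerlIO layer and return the bytes that ended up on disk, as hex.
sub bytes_written {
    my ($text, $layer) = @_;
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;

    open(my $out, ">$layer", $path) or die "Cannot write $path: $!\n";
    print {$out} $text;
    close $out or die "Cannot close $path: $!\n";

    # Read the file back byte for byte, without any decoding.
    open(my $in, '<:raw', $path) or die "Cannot read $path: $!\n";
    local $/;
    my $raw = <$in>;
    close $in;
    return unpack('H*', $raw);
}

# Character string: n with tilde (U+00F1), "a", space, "(".
my $text = "\x{f1}a (";

# ':raw' stands in for a handle that has no encoding layer.
print "no encoding layer: ", bytes_written($text, ':raw'), "\n";
# With an explicit UTF-8 layer the writer matches a UTF-8 reader.
print "UTF-8 layer:       ", bytes_written($text, ':encoding(UTF-8)'), "\n";
```

The first line of output reproduces the byte sequence from the error message,
which at least fits the idea that the writer and the reader disagree about
the encoding of the temporary file.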

As a side note, avoiding the temporary file altogether in this case would be
a good idea as well :-).

Regards
         Racke

-- 
LinuXia Systems => http://www.linuxia.de/
Expert Interchange Consulting and System Administration
ICDEVGROUP => http://www.icdevgroup.org/
Interchange Development Team





