[ic] Call for testers

Gert van der Spoel gert at 3edge.com
Mon Jun 22 13:28:58 UTC 2009


> -----Original Message-----
> From: interchange-users-bounces at icdevgroup.org [mailto:interchange-
> users-bounces at icdevgroup.org] On Behalf Of David Christensen
> Sent: Monday, June 22, 2009 3:44 PM
> To: interchange-users at icdevgroup.org
> Subject: Re: [ic] Call for testers
> 
> 
> On Jun 22, 2009, at 7:27 AM, Gert van der Spoel wrote:
> 
> >> -----Original Message-----
> >> From: interchange-users-bounces at icdevgroup.org [mailto:interchange-
> >> users-bounces at icdevgroup.org] On Behalf Of David Christensen
> >> Sent: Monday, June 22, 2009 3:14 PM
> >> To: interchange-users at icdevgroup.org
> >> Subject: Re: [ic] Call for testers
> >>
> >>> In the original, unpatched versions on David's tree (so the files
> >>> that are in the repository), things are actually going fine for
> >>> files such as ctry_lang.txt ... but not for locale.txt ...
> >>
> >>
> >> In my old tree I'd had a commit which updated the encodings in
> >> locale.txt; the issue IIRC was that each line would be in its own
> >> character encoding, so the file was really an amalgamation of
> >> individual lines in different encodings.
> >
> > I suppose you mean each line could contain multiple character
> > encodings, for example:
> >
> > en_US   gr_GR      cz_CN
> > latin1  iso8859-7  gb2312
> >
> > I can imagine this to be a complete conversion hell :)
>
> Yes, that's exactly the issue.  And you're correct, it was... :-)  I
> had to write a special program to break things up and reencode
> correctly.
> 
> > And I also think that that should not be needed. If someone decides
> > that the time has come to use UTF8, then a prerequisite is that the
> > .txt files are in UTF8 ... If one needs suggestions how to convert
> > them, those could be provided.
> 
> +1.  I also converted the individual locale files to utf8, so I may
> end up pushing those fixes as well (although those included the
> declared charset, so should hypothetically have been able to decode
> successfully).
> 
> >> My commit updated the encoding of each line (and hence the whole
> >> file) to UTF8.  I hadn't thought that this was a necessary action,
> >> but it seems like it may be, so I'll go ahead and cherry-pick the
> >> commit to my tree and push it out.
> >
> > I'd be interested to see this patch, yes. Perhaps this does not
> > strictly need to be done, but it might explain why things break for
> > locale.txt and not for other .txt -> .gdbm files.
> 
> It's been pushed to my tree; I actually squashed two related patches.
> Said conversion occurred quite some time ago (October of 2008), but I
> didn't see any additional modifications to the file since, so I'm
> thinking it'll be good.

Which are the updated file(s) in the tree? How can I get that tree?
I'm not too sure about all this git stuff yet ... and currently I do not
seem to be seeing any differences.

I have been testing and checking around some more and found something that
is different for ctry_lang.txt and locale.txt ... 

--------------------
In case you'd like to reproduce, here you can find the 2 files I am
currently using:
http://dev.allcarmodels.com/utf8.zip

the dbconf files are:
locale.dbm
Database locale locale.txt TAB
Database locale GDBM_ENABLE_UTF8 1

ctry_lang.dbm
Database ctry_lang ctry_lang.txt TAB
Database ctry_lang GDBM_ENABLE_UTF8 1

catalog.cfg contains:
# UTF8 Variables
Variable MV_HTTP_CHARSET UTF-8
Variable MV_UTF8 1
DatabaseDefault GDBM_ENABLE_UTF8 1

# Internationalization
LocaleDatabase locale

Test page:
[setlocale gr_GR]
[msg]BUY NOW![/msg]
[msg]ΑΓΟΡΑ![/msg]
<br/><br/>
[data table=ctry_lang column='country' key='3']
--------------------

We know that the standard difference is that LocaleDatabase is processed
when Interchange is restarted, whereas files such as ctry_lang.txt are
processed on page access.

I've been trying to trace where the process becomes different, and so far
they seem to be following pretty much the same route from the point where
the files are read. However, when you check in lib/Vend/Table/Common.pm
in the following sub:
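sub stuff {
    my ($val) = @_;
    # Escape tab and percent characters via the @Hex_string lookup table.
    $val =~ s,([\t\%]),$Hex_string[ord($1)],eg;
    # Debug line: log 1 if Perl's utf8 flag is set on the value, else 0.
    ::logDebug(is_utf8($val) ? 1:0);
    return $val;
}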

I notice that when Interchange restarts and loads locale.txt, a test of
whether the utf8 flag is set returns 0 ... when you load the page and it
kicks in the loading of ctry_lang.txt, the utf8 flag check returns 1.
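
To illustrate what a 0 versus a 1 from that check can mean, here is a
minimal standalone sketch, outside Interchange. It assumes the difference
comes from how the file is read: the same UTF-8 bytes come back without
the flag when read raw, and with the flag when read through an encoding
layer. The scratch file and the read_line helper are made up for the
example; whether Interchange's two code paths really differ in this way
is not something I have confirmed.

```perl
use strict;
use warnings;
use Encode ();
use File::Temp qw(tempfile);

# Write one tab-separated line of UTF-8 bytes to a scratch file.
# The Greek word is given as code points to keep this source ASCII-only.
my ($fh, $path) = tempfile();
binmode $fh;
print {$fh} Encode::encode('UTF-8',
    "BUY NOW!\t\x{391}\x{393}\x{39F}\x{3A1}\x{391}!\n");
close $fh;

# Hypothetical helper: read the first line through the given PerlIO layer.
sub read_line {
    my ($layer) = @_;
    open my $in, "<$layer", $path or die "open: $!";
    my $line = <$in>;
    close $in;
    chomp $line;
    return $line;
}

my $raw     = read_line(':raw');              # octets, nothing decoded
my $decoded = read_line(':encoding(UTF-8)');  # decoded into characters

printf "raw:     flag=%d length=%d\n",
    (utf8::is_utf8($raw) ? 1 : 0), length $raw;
printf "decoded: flag=%d length=%d\n",
    (utf8::is_utf8($decoded) ? 1 : 0), length $decoded;
```

The raw read counts bytes and the decoded read counts characters, so the
two lengths differ as soon as the line contains anything outside ASCII.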

Does this have to do with the two different ways you go through the
Interchange code, where the utf8 flag does not get set properly on
restart?
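
One way to test that idea might be to decode explicitly whenever the flag
is missing and see whether the locale data then behaves like
ctry_lang.txt. The sketch below is only that, a diagnostic: ensure_decoded
is a hypothetical helper, not something in Interchange, and it assumes
that every unflagged string really holds UTF-8 octets, which would be
wrong for data that is genuinely latin1.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of Interchange: return a character string
# whether the input arrived as raw UTF-8 octets or was already decoded.
sub ensure_decoded {
    my ($val) = @_;
    return $val if utf8::is_utf8($val);    # already characters, leave alone
    # Croak on malformed input rather than substituting characters;
    # LEAVE_SRC keeps decode() from modifying $val in place.
    return Encode::decode('UTF-8', $val,
        Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# UTF-8 bytes of the Greek test string from the test page.
my $octets = "\xCE\x91\xCE\x93\xCE\x9F\xCE\xA1\xCE\x91!";
my $chars  = ensure_decoded($octets);

printf "before: flag=%d length=%d\n",
    (utf8::is_utf8($octets) ? 1 : 0), length $octets;
printf "after:  flag=%d length=%d\n",
    (utf8::is_utf8($chars) ? 1 : 0), length $chars;
```

Calling it a second time on the result returns the string unchanged, so
it should be safe on values that were already decoded.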

Will look at it more later, but does anybody have any ideas? ;)

CU,

Gert