[ic] UTF-8 Advantages/Disadvantages

Gert van der Spoel gert at 3edge.com
Sat May 29 13:26:18 UTC 2010


> On May 28, 2010, at 11:00 AM, Chris Keane wrote:
> 
> >
> > Hey all,
> >
> > What are the advantages of running with UTF-8 on an english language
> site? The main reason I ask is that we're seeing truly significant
> performance hits on 5.7.2 with UTF enabled vs disabled.
> >
> > Our test:
> > 	• we run our entire system in the Amazon Compute Cloud
> > 	• The main production IC layer server is configured as a CPU-
> heavy machine (2.5 cores at 2.5GHz each)
> > 	• Some of the pages, especially those with nested loops are
> atrocious. The same code on one of our older servers running 5.4 runs a
> factor of 10x faster.
> > 	• For testing purposes and to keep the same environment, I booted
> a new instance of the exact same server as the production server. It
> uses the exact same DB backend server, the catalogs were cloned,
> for a race event, broken down by entries, classes and produces some
> nice graphs. This page (and some of our others) use multiple subloops,
> which I know introduces performance issues.
> > 	• Test system (no UTF-8): 5 seconds, Production system (UTF-8):
> 49 seconds
> >
> > As you can see, 5 seconds vs 49 seconds is significant and I'm sure
> you appreciate that it's the difference between a quietly happy
> customer and a wildly dissatisfied one. So we'll be disabling UTF-8 on
> the production server today in time for this weekend's race events.
> >
> > Perl 5.10.0
> > Encode 2.39
> > IC 5.7.2
> >
> > To help us make a good decision going forward can someone explain the
> relative merits of UTF-8 in an english-only site? Alternately, any
> updates into how to fix the horrible slowness of the UTF-8 enabled
> site, preferably through config changes or updates rather than by
> rewriting all the loops ;)
>
> I'm very interested in this discussion also. I have not been able to
> run any 5.7.x versions of Interchange in production. I see similar
> increases in load time compared with 5.4 which we currently run. I'm
> not sure if we've narrowed it down to a UTF-8 issue. We make websites
> for wine merchants and have words like "Château" all over the place.
> This make me think we need UTF-8 enabled but I could be mistaken. The
> whole UTF-8 thing is very confusing to me.

â is part of ISO-8859-1, so you do not need to worry about characters like
that. If the only characters you use are in this list, they are covered
without the need for UTF-8:
http://htmlhelp.com/reference/charset/latin1.gif
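
If you want to check your own catalog text rather than read the chart, something
along these lines should do it. This is only a rough sketch in Python, not
anything from Interchange itself; the function name fits_latin1 and the sample
strings are made up for illustration.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical sample strings; substitute text from your own product data.
samples = ["Château", "Rosé", "Gewürztraminer", "€15 Merlot"]

for sample in samples:
    status = "fits" if fits_latin1(sample) else "needs more than Latin-1"
    print(f"{sample}: {status}")
```

Anything that fails the check (the euro sign, for example, is not in
ISO-8859-1) is a character you would have to replace or entity-encode if you
stay away from UTF-8.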

I do not think that you *have* to use UTF-8. Sites like yours were working
fine before it existed.

I am afraid that, at the moment, the key to speeding up the loops is to
rewrite them.
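
I have not seen your pages, so take this only as a general illustration of what
such a rewrite tends to look like: fetch the rows once and group them in
memory, instead of running a subloop (and its query) for every row of the outer
loop. The sketch is in Python rather than Interchange tags, and the data,
column layout and the name group_entries are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, as if returned by ONE query: (class_name, entry_name).
rows = [
    ("Junior", "Alice"),
    ("Senior", "Bob"),
    ("Junior", "Carol"),
]


def group_entries(rows):
    """Group entries by class in a single pass over the rows."""
    grouped = defaultdict(list)
    for class_name, entry_name in rows:
        grouped[class_name].append(entry_name)
    return dict(grouped)


# One pass to group, one pass to report; no per-class subquery.
for class_name, entries in sorted(group_entries(rows).items()):
    print(class_name, len(entries), ", ".join(entries))
```

Whether this maps cleanly onto your nested loops depends on how the page is
built, but the idea of one query plus grouping usually carries over.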

CU,

Gert





