[docs] xmldocs - docelic modified 2 files

Sat Sep 3 13:09:48 EDT 2005

User:      docelic
Date:      2005-09-03 17:09:47 GMT
Modified:  .        Makefile
Added:     guides   optimization.xml
Log:
- Add 'optimization' guide. Basically it's the content from
  "Optimizing lists" section in old icfaq.

Revision  Changes    Path
1.71      +1 -1      xmldocs/Makefile


rev 1.71, prev_rev 1.70
Index: Makefile
===================================================================
RCS file: /var/cvs/xmldocs/Makefile,v
retrieving revision 1.70
retrieving revision 1.71
diff -u -r1.70 -r1.71

--- Makefile	3 Sep 2005 14:01:23 -0000	1.70
+++ Makefile	3 Sep 2005 17:09:47 -0000	1.71
@@ -13,7 +13,7 @@
 #############################################################
 # Base definitions
 SYMBOL_TYPES= pragmas vars tags confs filters
-GUIDES      = iccattut programming-style upgrade faq index
+GUIDES      = iccattut programming-style upgrade faq index optimization
 HOWTOS      = howtos
 GLOSSARY    = glossary
 ALL_DOCS    = $(GLOSSARY) $(HOWTOS) $(GUIDES) $(SYMBOL_TYPES)



1.1                  xmldocs/guides/optimization.xml


rev 1.1, prev_rev 1.0
Index: optimization.xml
===================================================================
<?xml version="1.0" standalone="no"?>

<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook-Interchange XML V4.2//EN"
	"../docbook/docbookxi.dtd">

<article id='optimization'>

<articleinfo>
	<title>Interchange Guides: Optimization</title>
	<titleabbrev>optimization</titleabbrev>

	<copyright>
		<year>2003</year><year>2004</year><year>2005</year>
		<holder>Interchange Development Group</holder>
	</copyright>
	<copyright>
		<year>2002</year>
		<holder>Red Hat, Inc.</holder>
	</copyright>

	<authorgroup>
		<author>
			<firstname>Davor</firstname><surname>Ocelic</surname>
			<email>docelic at icdevgroup.org</email>
		</author>
		<author>
			<firstname>Mike</firstname><surname>Heins</surname>
			<email>mike at perusion.com</email>
		</author>
	</authorgroup>


	<legalnotice>
		<para>
		This documentation is free; you can redistribute it and/or modify
		it under the terms of the &GNU; General Public License as published by
		the Free Software Foundation; either version 2 of the License, or
		(at your option) any later version.
		</para>
		<para>
		It is distributed in the hope that it will be useful,
		but WITHOUT ANY WARRANTY; without even the implied warranty of
		MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
		GNU General Public License for more details.
		</para>
	</legalnotice>

	<abstract>
		<para>
		</para>
	</abstract>

</articleinfo>


<sect1>
	<title>Software optimizations</title>

	<sect2>
		<title>Interchange</title>

		<sect3>
			<title>General-purpose benchmarking</title>

			<para>
			One of the most simple and straightforward methods to check whether
			the code is able to complete a task within a reasonable time is
			<emphasis>benchmarking</emphasis>. For the purpose, &IC; offers
			the &tag-benchmark; tag which is found in the 
			<filename class='directory'>eg/usertag/</filename> directory of 
			the Interchange &glos-tarball; distribution.
			<!-- TODO my benchmark variant -->
			</para><para>
			The &tag-benchmark; reference page contains all the relevant 
			installation and usage notes.
			</para>
		</sect3>

		<sect3>
			<title>Optimizing lists</title>

			<para>
			&IC; has powerful capabilities (such as searching) that allow
			you to produce lists of items for use in category lists, product lists,
			indexes, and other navigation tools or data reports.
			</para><para>
			These are a two-edged sword, though. Lists of hundreds or thousands of
			entries can be returned, and techniques that work well displaying only
			a few items may slow to a crawl when a large list is returned.
			</para><para>
			In general, when you are displaying only one item (such as on a
			&glos-flypage;) or a small list (such as shopping cart contents),
			you can be pretty carefree in your use of &glos-ITL; tags.
			When there are thousands of items, though, you cannot; each
			&glos-ITL; tag requires parsing and argument building, and all
			complex tests or embedded &PERL; blocks cause the
			<classname>Safe</classname> module to evaluate code.
			</para><para>
			The <classname>Safe</classname> module is pretty fast considering
			what it does, but it can only generate a few thousand instances per
			second even on a fast system. And the &glos-ITL; tag
			parser can likewise only parse thousands of tags per CPU second.
			</para><para>
			What to do? You want to provide complex conditional tests but you
			don't want your system to slow to a crawl. Luckily, there are
			techniques which can speed up complex lists by orders of magnitude.
			</para>

			<sect4>
				<title>[PREFIX-tag]</title>
				<para>
				<code>[<replaceable>PREFIX</replaceable>-tag]</code> constructs
				are the fastest way to retrieve loop data. Let's say we want to 
				find all our products (search in all &conf-ProductFiles; databases)
				and display descriptions of all the products found:
<programlisting><![CDATA[
[loop prefix=foo search="ra=yes"]

  [foo-data products description]
  [comment]is slightly faster than   [/comment]

  [foo-field description]
  [comment]which is MUCH faster than [/comment]

  [data products description [foo-code]]
  [comment]which is faster than      [/comment]

  [data table=products column=description key="[foo-code]"]

[/loop]
]]></programlisting>

				The loop tags are interpreted by means of fast regular expression
				scans of the loop container text, and fetch an entire row of
				data in one query.
				</para><para>
				The &tag-data; ITL tag interpretation is
				delayed until after the loop is finished, whereby the &glos-ITL; tag
				parser must find the tag, build a parameter list, and then fetch the
				data with a separate query.
				</para><para>
				If there are repeated references to the same field in the loop,
				the speedup can be 10x or more.
				</para>
			</sect4>

			<sect4>
				<title>Pre-fetch data</title>
				<!-- TODO diff between loop-field and loop-param ? -->

				<para>
				The <mv>mv_return_fields</mv> variable (otherwise known as the
				"<literal>rf</literal>" parameter in one-click terminology) defines
				a comma-separated list of fields you want returned from a search.
				This, in effect, kind of pre-fetches the data you want to use within
				a loop.
				</para><para>
				Once the records are returned, the fields can be accessed using the
				<code>[<replaceable>PREFIX</replaceable>-param <replaceable>field</replaceable>]</code> syntax.
				The fields can also be referenced using <code>[<replaceable>PREFIX</replaceable>-pos <replaceable>N</replaceable>]</code>,
				where the <replaceable>N</replaceable> represents the ordinal position
				(starting from <literal>0</literal>) in the field list.
				</para><para>
				That said, the following are equivalent in effect but the 
				<emphasis role='bold'>second variant is much, much faster</emphasis>:
<programlisting><![CDATA[
<pre>

Benchmark loop-field list: [benchmark start=1]
  <!-- [loop search="ra=yes/st=db"]
       [loop-code] price: [loop-field price] [/loop] -->
       TIME: [benchmark]

Benchmark loop-param list: [benchmark start=1]
  <!-- [loop search="ra=yes/st=db/rf=sku,price"]
       [loop-code] price: [loop-param price] [/loop] -->
       TIME: [benchmark]

</pre>
]]></programlisting>
				</para>
			</sect4>

			<sect4>
				<title>Row counting and display</title>

				<para>
				<code>[<replaceable>PREFIX</replaceable>-alternate <replaceable>N</replaceable>]</code> can be used for row counting and display.
				</para><para>
				A common need when building tables is to conditionally close the table
				row or data containers. I see a lot of code that manually inserts
				new rows every three columns:
<programlisting><![CDATA[
[loop search="ra=yes"]
  [calc] return '<tr>' if [loop-increment] == 1; return[/calc]
  [calc] return '' if [loop-increment] % 3; return '</tr>' [/calc]
[/loop]
]]></programlisting>

				Much faster, by a few orders of magnitude than the above, is:

<programlisting><![CDATA[
[loop search="ra=yes"]
  [loop-change 1][condition]1[/condition]<tr>[/loop-change 1]
  [loop-alternate 3]</tr>[/loop-alternate]
[/loop]
]]></programlisting>

				If you think you need to close the final row by checking the
				final count, look at this complete example done the right way:

<programlisting><![CDATA[
[loop search="ra=yes"]
  [on-match]
    <table>
    <tr>
  [/on-match]

  [list]
    <td>[loop-code]</td>
    [loop-alternate 3]</tr><tr>[/loop-alternate]
  [/list]

  [on-match]
    </tr>
    </table>
  [/on-match]

  [no-match]
    No match, sorry.
  [/no-match]
[/loop]
]]></programlisting>

				The above is a hundred times faster than anything you can build with
				multiple &tag-calc; tags.
				</para>
			</sect4>

			<sect4>
				<title>Use [PREFIX-calc] instead of [calc] or [perl]</title>

				<para>
				Using <code>[<replaceable>PREFIX</replaceable>-calc]</code>, you
				can execute the same code as with &tag-calc;, but with two benefits: 
				you will not trigger &glos-ITL; parsing, and the code will be 
				executed <emphasis>during</emphasis> the loop instead of
				after it.
				</para><para>
				The <code>[<replaceable>PREFIX</replaceable>-calc]</code> object
				has complete access to all normal embedded &PERL; objects like
				<varname>$Values</varname>, <varname>$Carts</varname>,
				<varname>$Tag</varname>, and such. If you want to access data tables
				from within the loop (such as  <database>products</database> or
				<database>pricing</database>), just call the following
				<emphasis>above</emphasis> the loop:

<programlisting><![CDATA[
[perl tables="products pricing" /]
]]></programlisting>

<!--
			<programlisting><![CDATA[
			[loop search="ra=yes"]
				[loop-calc]
					$desc = $Tag->data('products', 'description', '[loop-code]');
					$link = $Tag->page('[loop-code]');
					return "$link $desc </A>";
				[/loop-calc] <br>
			[/loop]
			]]></programlisting>
-->

				</para>
			</sect4>

			<sect4>
				<title>ADVANCED: Precompile and execute</title>

				<para>
				For repetitive routines, you can achieve a considerable savings
				in CPU by pre-compiling your embedded &PERL; code. The precompilation
				can occur either once at &conf-catalog; configuration time,
				or once at time of list execution.
				</para><para>
				When you compile routines at the time of the list execution
				(using <code>[item-sub <replaceable>NAME</replaceable>] <replaceable> ... CODE ...</replaceable> [/item-sub]</code>), only one
				<classname>Safe</classname> evaluation will be done, and every
				time the <code>[loop-exec <replaceable>NAME</replaceable>]</code>
				is called, it will be a direct call to the routine. This can be
				10 times or more faster than separate &tag-calc; calls, or 5 times
				faster than separate
				<code>[<replaceable>PREFIX</replaceable>-calc</code> calls. Here's
				an example:

<programlisting><![CDATA[
[benchmark start=1]

loop-calc:
<!--
  [loop search="st=db/fi=country/ra=yes/ml=1000"]
    [loop-calc]
      my $code = q{[loop-code]};
      return "code '$code' reversed is " . reverse($code);
    [/loop-calc]
  [/loop]
-->

[benchmark]

<p>
[benchmark start=1]

loop-sub and loop-exec:
<!--
  [loop search="st=db/fi=country/ra=yes/ml=1000"]
    [loop-sub country_compare]
      my $code = shift;
      return "code '$code' reversed is " . reverse($code);
    [/loop-sub]
    [loop-exec country_compare][loop-code][/loop-exec]
  [/loop]
-->

[benchmark]
]]></programlisting>
				</para>
			</sect4>

			<sect4>
				<title>ADVANCED: Execute and save with [query ...]</title>
				<para>
				You can run <code>[query arrayref=<replaceable>KEYNAME</replaceable> sql="<replaceable>... SQL ...</replaceable>"]</code>, which saves the
				results of the search/query in a &PERL; reference. It is then
				available in
				<varname>$Tmp->{<replaceable>KEYNAME</replaceable>}</varname>.
				</para><para>
				This is the fastest possible method to display a list. Observe:

<programlisting><![CDATA[
[set waiting_for]os28004[/set]

[benchmark start=1] Query plus embedded Perl
<!--
  [query arrayref=myref sql="select sku,price,description from products" /]
  
  [perl]
    # Get the query results, has multiple fields
    my $ary = $Tmp->{myref};
    my $out = '';
    
    foreach $line (@$ary) {
      my ($sku, $price, $desc) = @$line;
      if($sku eq $Scratch->{waiting_for}) {
        $out .= "We were waiting for this one!!!!\n";
      }
      $out .= "sku: $sku price: $price description: $desc\n";
    }
    return $out;
  [/perl]
-->

TIME: [benchmark]
<p>

[benchmark start=1] Just query

<!--
  [query list=1 sql="select sku,price,description from products"]
  
  [if scratch waiting_for eq '[sql-code]']
    We were waiting for this one!!!!
  [/if]
  
  sku: [sql-code]
  price: [sql-param price]
  desc: [sql-param description]  
  [/query]
-->

TIME: [benchmark]
]]></programlisting>
				</para>
			</sect4>
		</sect3>

		<sect3>
			<title>Take advantage of "implicit" TRUE and FALSE values</title>

			<para>
			Consider these two snippets:

<programlisting><![CDATA[
[if scratch KEY]
... do something ...
[/if]
]]></programlisting>

			and:

<programlisting><![CDATA[
[if scratch KEY == '1']
... do something ...
[/if]
]]></programlisting>

			The first variant does not require &PERL; evaluation. It simply checks
			to see if the value is blank or <literal>0</literal>, and assumes
			TRUE if it is anything but.
			</para><para>
			Of course, this requires your code to return blank or value
			<literal>0</literal> for FALSE results (instead of say,
			"<literal>No</literal>" or " "), but then we can talk about a
			20-35% speed-up.
			</para><para>
			Here's a sample program to time the results:

<programlisting><![CDATA[
Overhead: 

[benchmark start=1]

<!--
	[loop search="ra=yes"]
		[set cert][loop-field gift_cert][/set]
	[/loop]
-->

[benchmark]
<p>


"if scratch cert": 
[benchmark start=1]

<!--
[loop search="ra=yes"]
	[set cert][loop-field gift_cert][/set]
	[loop-code] [if scratch cert] YES [else] NO [/else][/if]
	[loop-code] [if scratch cert] YES [else] NO [/else][/if]
	[loop-code] [if scratch cert] YES [else] NO [/else][/if]
	[loop-code] [if scratch cert] YES [else] NO [/else][/if]
	[loop-code] [if scratch cert] YES [else] NO [/else][/if]
[/loop]
-->

[benchmark]
<p>


"if scratch cert == 1": 
[benchmark start=1]

<!--
[loop search="ra=yes"]
[set cert][loop-field gift_cert][/set]
[loop-code] [if scratch cert == 1] YES [else] NO [/else][/if]
[loop-code] [if scratch cert == 1] YES [else] NO [/else][/if]
[loop-code] [if scratch cert == 1] YES [else] NO [/else][/if]
[loop-code] [if scratch cert == 1] YES [else] NO [/else][/if]
[loop-code] [if scratch cert == 1] YES [else] NO [/else][/if]
[/loop]
-->

[benchmark]
<p>

[page @@MV_PAGE@@]Run again</a>
]]></programlisting>
			</para>
		</sect3>

		<sect3>
			<title>Interpolation and reparsing</title>
			<para>
			Avoid <literal>interpolate=1</literal> and 
			<literal>reparse=1</literal> whenever possible. A separate tag parser
			must be spawned every time you do this. Many times people use this
			without needing it.
			</para>
		</sect3>

		<sect3>
			<title>Session variables</title>

			<para>
			Avoid saving large values to &glos-scratch; space, as these will be
			written to the users session. If you need them only for the current
			page, use &tag-tmpn; and &tag-tmp; instead of &tag-set; and &tag-seti;
			(temporary variables are automatically deleted at the end of current 
			page processing - before the user's session is saved).
			</para><para>
			You can also retrieve values using <code>[scratchd <replaceable>VARIABLE_NAME</replaceable>]</code>
			to return the contents and delete them from the session at the same
			time.
			</para>
		</sect3>
	</sect2>
</sect1>

</article>