[Date Prev][Date Next][Thread Prev][Thread Next][Minivend by date ][Minivend by thread ]

Re: Hardware and Performance

To: minivend-users@minivend.com
Subject: Re: Hardware and Performance
From: John Edstrom <edstrom@Poopsie.hmsc.orst.edu>
Date: Wed, 16 Jun 1999 11:49:40 -0700 (PDT)
Acknowledge-To: edstrom@slugo.hmsc.orst.edu
In-Reply-To: <19990615173127.A6808@bill.iac.net> from "mikeh@minivend.com" at Jun 15, 1999 05:31:27 PM

mikeh@minivend.com
> 
> ******    message to minivend-users from mikeh@minivend.com     ******
> 
> Quoting Colin Mitchell (colin@5points.net):
> > 
> > There is a lot of overhead involved in establishing the connection between
> > the SQL server and minivend.  Sockets need to be opened, etc.  In my
...
> > keep a pool of already opened connections to the server, and you use those
> > whenever possible instead of opening a new one.
> > 
> 
> That part is obvious; I can probably reduce connections by doing only
> one connection per separate database rather than one per table in a
> single catalog, hashing the connection parameters and saving handles. My
> understanding is that DBI already does this, though.

I don't think so.  As of DBI 1.10 I didn't see anything about caching
connections.  I see where there is a prepare_cached method to save
statements, but the CacheKids entry in the docs says "... statement
handles created by the (not yet implemented) connect_cached method."


So, I did a test with this code in either an Apache::DBI module run
through a browser, or with the code in a stand-alone perl program run
from the command line:

sub do_one {
	my $db = DBI->connect("dbi:mysql:test:host=localhost",
                              '', '') or die $DBI::errstr;
        my $sth = $db->prepare('select * from x');
        $sth->execute;
        $sth->finish;
        $db->disconnect;
}

for (1..5) {
    last if &do_one < 0;
}

output from the mysql log I get, running twice under Apache::DBI:  

990616  9:29:39      23 Connect    www@localhost as anonymous on test
                     23 Query      select * from x
                     23 Query      select * from x
                     23 Query      select * from x
                     23 Query      select * from x
                     23 Query      select * from x
990616  9:29:44      15 Query      select * from x
                     15 Query      select * from x
                     15 Query      select * from x
                     15 Query      select * from x
                     15 Query      select * from x


running perl script once from the command line:

990616  9:31:14      24 Connect    edstrom@localhost as anonymous on
test
                     24 Query      select * from x
                     24 Quit
                     25 Connect    edstrom@localhost as anonymous on
test
                     25 Query      select * from x
                     25 Quit
                     26 Connect    edstrom@localhost as anonymous on
test
                     26 Query      select * from x
                     26 Quit
                     27 Connect    edstrom@localhost as anonymous on
test
                     27 Query      select * from x
                     27 Quit
                     28 Connect    edstrom@localhost as anonymous on
test
                     28 Query      select * from x
                     28 Quit

DBI doesn't seem to be caching connections.

I wasn't aware that MV make one connection per table.  The one
connection per database per server that you mentioned above would help
significantly if somebody uses several tables from each database.
Connecting and disconnecting can have a significant impact.  Here is a
simple benchmark that sort of measures the overhead of establishing a
connection-per-table approach:

use strict;
use DBI;
use Benchmark;

my ($db, $sth);

timethese(200, {
    in_loop => sub {
        for (1..5) {
            $db = DBI->connect("dbi:mysql:test:host=localhost", 
                               '',  '') or die $DBI::errstr;
            $sth = $db->prepare('select * from x');
            $sth->execute;
            $sth->finish;
            $db->disconnect;
        }
    }, 
    out_loop => sub {
        $db = DBI->connect("dbi:mysql:test:host=localhost", 
                           '',  '') or die $DBI::errstr;
        for (1..5) {
            $sth = $db->prepare('select * from x');
            $sth->execute;
            $sth->finish;
        }
        $db->disconnect;
    },
    out_cached => sub { #look at prepare() overhead for a lark
        $db = DBI->connect("dbi:mysql:test:host=localhost", 
                           '',  '') or die $DBI::errstr;
        for (1..5) {
            $sth = $db->prepare_cached('select * from x');
            $sth->execute;
            $sth->finish;
        }
        $db->disconnect;
    }
});

the output: 
Benchmark: timing 200 iterations of in_loop, out_cached, out_loop...
   in_loop: 11 wallclock secs ( 5.71 usr +  1.97 sys =  7.68 CPU)
out_cached:  4 wallclock secs ( 1.83 usr +  0.48 sys =  2.31 CPU)
  out_loop:  4 wallclock secs ( 2.47 usr +  0.58 sys =  3.05 CPU)

The difference is 4.63 seconds/200 reps or 23 ms per rep.  The
difference between in_loop and out_loop is 4 connect/disconnect pairs
per rep, so the final cost is 23/4 or 5.8 ms per connect/disconnect
pair on a P120.

> 
> The part that is not obvious is how you do anything more in a forked
> server setup. Essentially, you would have to pass data through a shared
> memory space. Unless it is a threaded, and not a forked, implementation
> things are not obvious. The overhead of this could easily kill any
> savings.

too, too true.  I think that this is one or Perl's weakest points. 

The only thing I can think of is to have MV spawn a DBI server
(DBI::Proxy?).  Some overhead might be saved by having MV establish
'cheap' connections to the proxy which would establish and then
maintain persistent 'expensive' connections to the db server.  But
that just defers the problem to the proxy; how do you get the proxy to
keep more than one ball in the air at a time?

I don't see a simple solution to this within perl either.

...
> -- 
> Mike Heins                          http://www.minivend.com/  ___ 


-- 
 John Edstrom | edstrom @ slugo.hmsc.orst.edu

Follow-Ups:
- Re: Hardware and Performance
  - From: mikeh@minivend.com

References:
- Re: Hardware and Performance
  - From: mikeh@minivend.com

Prev by Date: Re: ECML
Next by Date: RE: coordinated search
Prev by thread: Re: Hardware and Performance
Next by thread: Re: Hardware and Performance
Index(es):
- Date
- Thread