[ic] Restart and stop problems (multiple processes)

Mike Heins interchange-users@icdevgroup.org
Thu Feb 6 13:47:00 2003


Quoting Daniel Hutchison (jdhutchison11@attbi.com):
> > > FWIW, I did try to look into this issue a little more in depth. However,
> > > I haven't had a whole lot of time. I haven't found anything definite yet,
> > > just some suspicions.  
> > > 
> > > From what I can tell, the problem seems to be in the locking of the pid
> > > file.  Interchange attempts to lock the pid file when it starts up.
> > > If it can't lock the pid file, it assumes another Interchange process is
> > > running.  What I suspect is that Interchange locks the pid file before
> > > it forks, and that on Solaris, locks created with flock() aren't inherited
> > > across forks.  As a result, when the parent process exits, the pid file
> > > becomes unlocked.  When Interchange is then run with the shutdown
> > > command, it detects that the pid file is unlocked and thinks that there
> > > isn't a running Interchange process.
> > > 
> > > What I have done is verify that the default install of interchange on my
> > > solaris box uses the flock() function to lock the pid file.  I've also
> > > created a mini perl program that just locks files based off the code in
> > > interchange.  The file locking works fine until I throw a fork() in
> > > it...
> > > 
> > > Anyway, I hope this helps a bit.  
> > 
> > Turns out the files lock fine, but LOCK_NB is not working, at least on the
> > Solaris server I tested on (thanks Dorothy). It doesn't work no matter the
> > state of fork. In any case, grab_pid happens in the context of the last
> > fork, as I thought.
> > 
> > I could add a -badlock option at the commandline, but it would seem
> > to make sense to just fix Perl on the affected systems. 
> 
> Looks like we were both wrong...  I was able to track the problem down
> to the read_pid() that was being called from the server_start_message()
> in lib/Vend/Server.pm.  It appears that read_pid() is releasing the lock
> when it opens and closes the pid file.
> 
> Anyway, here is a simple patch that fixes the problem on Solaris:
> 
> diff -rNu ../interchange-4.9.7-200301240658-orig/lib/Vend/Server.pm ./lib/Vend/Server.pm
> --- ../interchange-4.9.7-200301240658-orig/lib/Vend/Server.pm   Fri Dec 13 07:11:35 2002
> +++ ./lib/Vend/Server.pm        Thu Feb  6 10:40:21 2003
> @@ -1200,7 +1200,7 @@
>         push (@types, 'SOAP') if $Global::SOAP;
>         push (@types, 'mod_perl') if $Global::mod_perl;
>         my $server_type = join(" and ", @types);
> -       my $pid = read_pidfile();
> +       my $pid = $$;
>         my @args = $reverse ? ($server_type, $pid) : ($pid, $server_type);
>         return ::errmsg ($fmt , @args );
>  }
> 

No, I don't believe that is any good. We can't be sure it is the same
process in all situations. It is still a bug in Perl or Solaris; there
is no way that opening and reading a file through a completely different
file handle should release the lock.
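
For anyone who wants to check their own platform, here is a rough sketch
of a test. It is not code from Interchange; the sub name
lock_survives_reopen and the temp file are made up for illustration. It
takes a lock, reads the file through a second handle, then forks a child
to probe whether the lock is still in force. One possible cause, not
confirmed for the Solaris box in question: perlfunc notes that flock()
may be emulated with fcntl(2) locking where flock(2) is unavailable, and
POSIX fcntl locks are dropped when the process closes any descriptor for
the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper, not part of Interchange.
# Returns 1 if the lock is still held after a second handle on the same
# file has been opened, read and closed; returns 0 if the lock was lost.
sub lock_survives_reopen {
    my ($path) = @_;

    # Take the lock the way a daemon would on its pid file.
    open(my $held, '+<', $path) or die "open $path: $!";
    flock($held, LOCK_EX | LOCK_NB) or die "initial lock failed: $!";

    # Mimic reading the pid file through an unrelated handle.
    open(my $peek, '<', $path) or die "reopen $path: $!";
    my $line = <$peek>;
    close($peek);

    # Probe from a child process through a fresh handle.
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        open(my $probe, '+<', $path) or POSIX::_exit(2);
        # Exit 1 if the child could take the lock (parent's lock is gone),
        # exit 0 if the attempt was refused (parent's lock is intact).
        POSIX::_exit(flock($probe, LOCK_EX | LOCK_NB) ? 1 : 0);
    }
    waitpid($pid, 0);
    my $status = $? >> 8;
    die "probe could not open $path" if $status == 2;

    close($held);
    return $status == 0 ? 1 : 0;
}

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "$$\n";
close($fh);

print lock_survives_reopen($path)
    ? "lock still held after reopen\n"
    : "lock was released by the reopen\n";
```

If this reports that the lock was released, the platform shows the same
behavior Daniel tracked down, and the MV_BAD_LOCK setting is probably
worth trying.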

I could go for this:

	my $pid = $Global::Variable->{MV_BAD_LOCK} ? $$ : read_pidfile();

At that point, you just need to set 

	Variable MV_BAD_LOCK 1 

in interchange.cfg.
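
To make the effect of the switch concrete, here is a stand-alone sketch
of the same decision outside of Interchange. The subs startup_pid and
read_pidfile_sketch are invented stand-ins (the real read_pidfile() in
lib/Vend/Server.pm is not reproduced here), and the plain hash takes the
place of $Global::Variable.

```perl
use strict;
use warnings;

# Hypothetical stand-in for read_pidfile(): returns the first line of
# the given file, or nothing if the file cannot be read or is empty.
sub read_pidfile_sketch {
    my ($file) = @_;
    open(my $fh, '<', $file) or return;
    my $pid = <$fh>;
    close($fh);
    return unless defined $pid;
    chomp $pid;
    return $pid;
}

# With MV_BAD_LOCK set, report the current process ID and never touch
# the pid file; otherwise read the pid back from the file as before.
sub startup_pid {
    my ($vars, $pidfile) = @_;
    return $vars->{MV_BAD_LOCK} ? $$ : read_pidfile_sketch($pidfile);
}

# The file name is only an example; it is not opened when the flag is set.
print startup_pid({ MV_BAD_LOCK => 1 }, 'interchange.pid'), "\n";
```

The point of the flag is that the pid file is never reopened on the
affected systems, so the existing lock is left alone, while everyone
else keeps the safer read from the file.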

I have made that change in CVS.

-- 
Mike Heins
Perusion -- Expert Interchange Consulting    http://www.perusion.com/
phone +1.513.523.7621      <mike@perusion.com>

Being against torture ought to be sort of a bipartisan thing.
-- Karl Lehenbauer