[ic] [interchange] Refactor robot testing into an internal routine

David Christensen david at endpoint.com
Wed Jan 18 04:35:01 UTC 2017


> On Jan 17, 2017, at 10:04 PM, Peter <peter at pajamian.dhs.org> wrote:
> 
> On 18/01/17 11:18, David Christensen wrote:
>> +	$Vend::Robot = check_is_robot();
>> +
>> +	$CGI::values{mv_tmp_session} ||= 1 if $Vend::Robot;
>> +}
>> +
>> +
>> +sub check_is_robot {
>> +    my $ret = 1;
>> +
>> #::logDebug("Check robot UA=$Global::RobotUA IP=$Global::RobotIP");
>> 	if ($Global::RobotIP and $CGI::remote_addr =~ $Global::RobotIP) {
>> #::logDebug("It is a robot by IP!");
>> -		$Vend::Robot = 1;
>> +		$ret = 1;
>> 	}
>> 	elsif ($Global::HostnameLookups && $Global::RobotHost) {
>> 		if (!$CGI::remote_host && $CGI::remote_addr) {
>> @@ -291,7 +300,7 @@ EOF
>> 		}
>> 		if ($CGI::remote_host && $CGI::remote_host =~ $Global::RobotHost) {
>> #::logDebug("It is a robot by host!");
>> -			$Vend::Robot = 1;
>> +			$ret = 1;
>> 		}
>> 	}
>> 	unless ($Vend::Robot) { 
>> @@ -300,11 +309,10 @@ EOF
>> 		}
>> 		elsif ($Global::RobotUA and $CGI::useragent =~ $Global::RobotUA) {
>> #::logDebug("It is a robot by UA!");
>> -			$Vend::Robot = 1;
>> +			$ret = 1;
>> 		}
>> 	}
>> -
>> -	$CGI::values{mv_tmp_session} ||= 1 if $Vend::Robot;
>> +	return $ret;
>> }
> 
> I see some issues with this:
> 
> 1.  check_is_robot() will always return 1, I think you meant to
> initialize $ret to 0, not 1 at the top.
> 
> 2.  The previous code would not explicitly set $Vend::Robot to 0, only
> to 1, so there is an incompatible case where $Vend::Robot is already set
> to 1 here but check_is_robot() would return 0.  You can fix this by
> changing:
> 
> $Vend::Robot = check_is_robot();
> 
> ...to...
> 
> $Vend::Robot ||= check_is_robot();

Thanks for the feedback; I do know it’s a little wonky currently, thanks to some locally failing unit tests (yay!).

I tested a naïve implementation of the above suggestions and didn’t get things working as expected.  I also noticed an additional place where I referenced $Vend::Robot *inside* this block, which was unintentional.  I am going to revisit with a clear head tomorrow.

The intention is for check_is_robot() to just examine the relevant UA, etc, only and return a boolean determination, so $Vend::Robot should be =, not ||=; the only other places that refer to $Vend::Robot check for the flag, not having anything to do with it; the ||= comes into play when considering whether to set the mv_tmp_session, which we do respect the previous setting of.

Thanks again,

David
--
David Christensen
End Point Corporation
david at endpoint.com
785-727-1171






More information about the interchange-users mailing list