[interchange] Add new full-page caching features

Jon Jensen interchange-cvs at icdevgroup.org
Thu Oct 31 19:30:26 UTC 2013


commit 2d76b919f9edb0bc17dc2aed6dd7d6b55f290928
Author: Jon Jensen <jon at endpoint.com>
Date:   Thu Oct 31 20:14:05 2013 +0100

    Add new full-page caching features
    
    These features make it much easier to emit pages that are fully cacheable
    and can be served to any user, along with the HTTP Cache-Control response
    header to inform intermediate proxies and browser caches about the
    cacheability and cache lifetime.
    
    To start using the new features:
    
    Add to catalog.cfg:
    
    SuppressCachedCookies yes
    
    Add to pages or templates you'd like to cache (for 2 hours, in this
    example), maybe starting just with pages/index.html:
    
    [if-not-volatile][tag pragma cache_control]max-age=7200[/tag][/if-not-volatile]
    
    That's all you really need for a basic setup.
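    
    As a rough illustration of how these two settings interact, the
    pragma branch of the check added to Vend::Server::respond in the diff
    below can be paraphrased as a standalone Perl function. The helper
    name and its named arguments are hypothetical and not part of
    Interchange. The real check also inspects $Vend::StatusLine, which
    this sketch leaves out.
    
    ```perl
    use strict;
    use warnings;
    
    # Hypothetical helper (not an Interchange API): returns 1 when cookies
    # and the session write would be suppressed for a response, else 0.
    sub would_suppress_cookies {
        my (%arg) = @_;
    
        # POST requests are never treated as cacheable.
        return 0 if ($arg{method} // '') =~ /POST/i;
    
        # The catalog must opt in with "SuppressCachedCookies yes".
        return 0 unless $arg{suppress_cached_cookies};
    
        # The page must set a cache_control pragma other than no-cache.
        my $cc = $arg{cache_control};
        return 0 unless defined $cc;
        return $cc !~ /no-cache/i ? 1 : 0;
    }
    
    for my $case (
        { method => 'GET',  suppress_cached_cookies => 1, cache_control => 'max-age=7200' },
        { method => 'POST', suppress_cached_cookies => 1, cache_control => 'max-age=7200' },
        { method => 'GET',  suppress_cached_cookies => 1, cache_control => 'no-cache' },
    ) {
        printf "%-4s %-14s => %d\n",
            $case->{method}, $case->{cache_control}, would_suppress_cookies(%$case);
    }
    ```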
    
    If you'd like to have a cookie contain the current cart or login
    state, you can write a catalog Sub or a GlobalSub (named, say,
    cookie_state_update) and have it run after every relevant event. Add
    to catalog.cfg:
    
    OutputCookieHook cookie_state_update
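    
    A minimal sketch of such a hook, written as a catalog Sub. It assumes
    the cart is the default one named "main" and that the [set-cookie]
    tag is reachable as $Tag->set_cookie from catalog Perl. The cookie
    names cart_count and logged_in are made up for illustration; treat
    this as a starting point rather than a tested implementation.
    
    ```perl
    # catalog.cfg (illustrative sketch only)
    Sub <<EOS
    sub cookie_state_update {
        # Sum the item quantities in the main cart.
        my $count = 0;
        $count += $_->{quantity} || 0 for @{ $Carts->{main} || [] };
    
        # Hypothetical cookie names, meant to be read by client-side code.
        $Tag->set_cookie({ name => 'cart_count', value => $count });
        $Tag->set_cookie({
            name  => 'logged_in',
            value => ($Session->{logged_in} ? 1 : 0),
        });
        return;
    }
    EOS
    ```
    
    Because create_cookie returns before calling the hook when cookies
    are suppressed, the hook only runs on responses that are allowed to
    set cookies, so cached pages stay cookie-free.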
    
    A detailed write-up of all these features is here:
    
    http://blog.endpoint.com/2013/10/full-page-caching-in-interchange-5.html
    
    Mark Johnson's talk at the Ecommerce Innovation 2013 conference is
    available as slides and video:
    
    http://www.icdevgroup.org/slides/eic2013/full_page_caching.pdf
    http://www.youtube.com/watch?v=n_FbzT_g_RM
    
    Documentation in the xmldocs system is still needed.
    
    Modified squashed commit of the following:
    
    commit ec7147fdc8126e59df482911c7d242221e8b7720
    Author: David Christensen <david at endpoint.com>
    Date:   Tue May 7 14:44:12 2013 -0500
    
        add if_not_volatile usertag to conditionally return data if the current request is not Volatile
    
    commit b02a76d058362f805585c811203572ddc3b75f07
    Author: David Christensen <david at endpoint.com>
    Date:   Fri Apr 26 16:12:22 2013 -0500
    
        Suppress cookies for 400-level errors
    
    commit a2e646528c5b6a1945229ed62bb77d67af24b10b
    Author: David Christensen <david at endpoint.com>
    Date:   Wed Jun 8 14:48:27 2011 -0500
    
        clear the cookie jar when suppressing cookies
    
    commit c54f02b2c4aa619821a9009e3922f7147d60d76f
    Author: David Christensen <david at endpoint.com>
    Date:   Fri Apr 26 16:09:59 2013 -0500
    
        Mark specific routines as always Volatile
    
    commit c0161345066a2f042f808037cf9391a1dbbfff89
    Author: David Christensen <david at endpoint.com>
    Date:   Thu Jun 2 00:00:02 2011 -0500
    
        suppress session write when cookies are suppressed
    
    commit 487511ce0627cd96e3469148603f82ce2a09d970
    Author: David Christensen <david at endpoint.com>
    Date:   Fri Apr 26 16:04:53 2013 -0500
    
        add an OutputCookieHook sub to be run when cookies are being created
    
    commit f0d10bd7b7f4d477c8d71c56277c13b65ced2e35
    Author: David Christensen <david at endpoint.com>
    Date:   Mon Apr 4 15:30:51 2011 -0500
    
        Do not set cookies on pages marked as cacheable
    
        This applies to all cookies currently, with particular attention placed on the MV_SESSION_ID.  This
        change is intended to alleviate the "first page hit" cache issue which leads to IC assigning a session
        id even for pages which would be fully cacheable and then never actually generating the cacheable
        content in question until a subsequent hit from a UA with the provided MV_SESSION_ID.
    
        This is of particular concern when considering the case of a DDoS in which a huge number of connections
        are made from unique instances.  Since the intent of the nginx cache is to prevent such traffic from
        ever reaching the backend, the existing behavior would result in a counter-intuitive effect of all bot
        traffic being passed directly to the "protected" backend when there was no document already in the nginx
        cache, with the result that each backend-handled request would be itself uncacheable, having a Set-Cookie
        header returned with the response.
    
        Currently this fix will affect all cookies being generated from Interchange on a page which is marked
        cacheable (specifically one which has a Pragma cache_control value set to something other than no-cache).
        We consider this acceptable as we are also moving the setting of the existing cookies to client-side code
        where possible; if this ends up being untenable, we will see about changing this specific code to just
        affect the MV_SESSION_ID cookie.

 code/UserTag/if_not_volatile.tag |   10 +++++++
 lib/Vend/Config.pm               |    6 +++-
 lib/Vend/Dispatch.pm             |    5 ++-
 lib/Vend/Order.pm                |    6 +++-
 lib/Vend/Server.pm               |   50 +++++++++++++++++++++++++++++++++++--
 lib/Vend/Session.pm              |   12 ++++++---
 6 files changed, 76 insertions(+), 13 deletions(-)
---
diff --git a/code/UserTag/if_not_volatile.tag b/code/UserTag/if_not_volatile.tag
new file mode 100644
index 0000000..8e2e034
--- /dev/null
+++ b/code/UserTag/if_not_volatile.tag
@@ -0,0 +1,10 @@
+UserTag if_not_volatile HasEndTag 1
+UserTag if_not_volatile Interpolate 0
+UserTag if_not_volatile NoReparse 0
+UserTag if_not_volatile Routine <<EOF
+sub {
+    my $body = shift;
+    return $body unless $::Instance->{Volatile};
+    return '';
+}
+EOF
diff --git a/lib/Vend/Config.pm b/lib/Vend/Config.pm
index 72eae5f..3a0c10c 100644
--- a/lib/Vend/Config.pm
+++ b/lib/Vend/Config.pm
@@ -1,6 +1,6 @@
 # Vend::Config - Configure Interchange
 #
-# Copyright (C) 2002-2011 Interchange Development Group
+# Copyright (C) 2002-2013 Interchange Development Group
 # Copyright (C) 1996-2002 Red Hat, Inc.
 #
 # This program was originally based on Vend 0.2 and 0.3
@@ -53,7 +53,7 @@ use Vend::Data;
 use Vend::Cron;
 use Vend::CharSet ();
 
-$VERSION = '2.247';
+$VERSION = '2.248';
 
 my %CDname;
 my %CPname;
@@ -722,6 +722,8 @@ sub catalog_directives {
 	['SessionHashLevels', 'integer',         2],
 	['SourcePriority', 'array_complete', 'mv_pc mv_source'],
 	['SourceCookie', sub { &parse_ordered_attributes(@_, [qw(name expire domain path secure)]) }, '' ],
+	['SuppressCachedCookies', 'yesno',       'no'],
+	['OutputCookieHook', undef,              ''],
 
 	];
 
diff --git a/lib/Vend/Dispatch.pm b/lib/Vend/Dispatch.pm
index d944803..5df06ff 100644
--- a/lib/Vend/Dispatch.pm
+++ b/lib/Vend/Dispatch.pm
@@ -1,6 +1,6 @@
 # Vend::Dispatch - Handle Interchange page requests
 #
-# Copyright (C) 2002-2009 Interchange Development Group
+# Copyright (C) 2002-2013 Interchange Development Group
 # Copyright (C) 2002 Mike Heins <mike at perusion.net>
 #
 # This program was originally based on Vend 0.2 and 0.3
@@ -24,7 +24,7 @@
 package Vend::Dispatch;
 
 use vars qw($VERSION);
-$VERSION = '1.113';
+$VERSION = '1.114';
 
 use POSIX qw(strftime);
 use Vend::Util;
@@ -581,6 +581,7 @@ $form_action{go} = $form_action{return};
 # Process the completed order or search page.
 
 sub do_process {
+	$::Instance->{Volatile} = 1 if ! defined $::Instance->{Volatile}; # Allow non-volatility if previously defined
 
 	# Prevent using keys operation more than once
     my @cgikeys = keys %CGI::values;
diff --git a/lib/Vend/Order.pm b/lib/Vend/Order.pm
index 4f9f0a4..b002787 100644
--- a/lib/Vend/Order.pm
+++ b/lib/Vend/Order.pm
@@ -1,6 +1,6 @@
 # Vend::Order - Interchange order routing routines
 #
-# Copyright (C) 2002-2009 Interchange Development Group
+# Copyright (C) 2002-2013 Interchange Development Group
 # Copyright (C) 1996-2002 Red Hat, Inc.
 #
 # This program was originally based on Vend 0.2 and 0.3
@@ -24,7 +24,7 @@
 package Vend::Order;
 require Exporter;
 
-$VERSION = '2.109';
+$VERSION = '2.110';
 
 @ISA = qw(Exporter);
 
@@ -2012,6 +2012,8 @@ sub route_order {
 
 # Order an item
 sub do_order {
+	$::Instance->{Volatile} = 1 if ! defined $::Instance->{Volatile}; # Allow non-volatility if previously defined
+
     my($path) = @_;
 	my $code        = $CGI::values{mv_arg};
 #::logDebug("do_order: path=$path");
diff --git a/lib/Vend/Server.pm b/lib/Vend/Server.pm
index 3a8437e..575400b 100644
--- a/lib/Vend/Server.pm
+++ b/lib/Vend/Server.pm
@@ -1,6 +1,6 @@
 # Vend::Server - Listen for Interchange CGI requests as a background server
 #
-# Copyright (C) 2002-2009 Interchange Development Group
+# Copyright (C) 2002-2013 Interchange Development Group
 # Copyright (C) 1996-2002 Red Hat, Inc.
 #
 # This program was originally based on Vend 0.2 and 0.3
@@ -24,7 +24,7 @@
 package Vend::Server;
 
 use vars qw($VERSION);
-$VERSION = '2.106';
+$VERSION = '2.107';
 
 use Cwd;
 use POSIX qw(setsid strftime);
@@ -314,6 +314,7 @@ sub parse_cgi {
 
 	my $request_method = "\U$CGI::request_method";
 	if ($request_method eq 'POST') {
+        $::Instance->{Volatile} = 1;
 #::logDebug("content type header: " . $CGI::content_type);
 		## check for valid content type
 		if ($CGI::content_type =~ m{^(?:multipart/form-data|application/x-www-form-urlencoded|application/xml|application/json)\b}i) {
@@ -522,7 +523,20 @@ sub parse_multipart {
 sub create_cookie {
 	my($domain,$path) = @_;
 	my  $out;
-	return '' if $Vend::tmp_session;
+
+	if ($Vend::suppress_cookies) {
+#::logDebug('explicitly clearing the cookie jar (nom nom nom)');
+		undef $::Instance->{Cookies};
+	}
+
+	return '' if $Vend::tmp_session || $Vend::suppress_cookies;
+
+	if (my $sub = $Vend::Cfg->{Sub}{$Vend::Cfg->{OutputCookieHook}}
+				  || $Global::GlobalSub->{$Vend::Cfg->{OutputCookieHook}}
+	) {
+		$sub->();
+	}
+
 	my @jar;
 	push @jar, [
 				($::Instance->{CookieName} || 'MV_SESSION_ID'),
@@ -721,13 +735,43 @@ sub respond {
 		else { print $fh "HTTP/1.0 $status\r\n"; }
 	}
 
+	# Here we decide if we are going to suppress cookie output for the
+	# page; note that this is more-or-less equivalent to saying that
+	# this content is cacheable, and thus we expect (and enforce) that
+	# the effect of hitting this page is the same both with and without
+	# a session (i.e., cache miss or cache hit).  We enforce this by
+	# ensuring that a cacheable page does not set cookies (even if it
+	# tries), and by additionally preventing a session write.
+
+	# The rationale here is that since a user with a session who
+	# fetches from the cache would not have their session altered at
+	# all, we should ensure that the same (lack of) effect will befall
+	# the user who happens to hit the page itself.
+
+	# We ensure that POSTs are never suppressed (i.e., cacheable), and
+	# we also allow this option to be configured per catalog, as not
+	# every catalog may be set up to properly handle these
+	# assumptions and effects.
+
+	$Vend::suppress_cookies =
+		$CGI::request_method !~ /POST/i &&
+		$Vend::Cfg->{SuppressCachedCookies} &&
+		(
+			(defined $::Pragma->{cache_control} && ($::Pragma->{cache_control} !~ /no-cache/i)) ||
+			($Vend::StatusLine =~ /^Cache-Control:\s+(?!no-cache)\s*$/im)
+		)
+	;
+
 	if ( ! $Vend::tmp_session
 		and (
 			! $Vend::CookieID && ! $::Instance->{CookiesSet}
 			or defined $Vend::Expire
 			or defined $::Instance->{Cookies}
+			or $Vend::Cfg->{OutputCookieHook}
 		  )
 			and $Vend::Cfg->{Cookies}
+			and !$Vend::suppress_cookies
+			and $status !~ /^4\d\d/
 		)
 	{
 		my @domains;
diff --git a/lib/Vend/Session.pm b/lib/Vend/Session.pm
index 96ca780..a70387a 100644
--- a/lib/Vend/Session.pm
+++ b/lib/Vend/Session.pm
@@ -1,8 +1,6 @@
 # Vend::Session - Interchange session routines
 #
-# $Id: Session.pm,v 2.31 2007-08-20 18:29:10 kwalsh Exp $
-# 
-# Copyright (C) 2002-2007 Interchange Development Group
+# Copyright (C) 2002-2013 Interchange Development Group
 # Copyright (C) 1996-2002 Red Hat, Inc.
 #
 # This program was originally based on Vend 0.2 and 0.3
@@ -27,7 +25,7 @@ package Vend::Session;
 require Exporter;
 
 use vars qw($VERSION);
-$VERSION = substr(q$Revision: 2.31 $, 10);
+$VERSION = '2.32';
 
 @ISA = qw(Exporter);
 
@@ -348,6 +346,12 @@ sub close_session {
 }
 
 sub write_session {
+
+    if ($Vend::suppress_cookies) {
+#::logDebug('Skipping session write on cacheable resource');
+        return $Vend::Session->{'user'};
+    }
+
     my($s);
 #::logDebug ("write session id=$Vend::SessionID  name=$Vend::SessionName\n");
 	my $time = time;


