[ic] /special_pages/missing.html

Frank Reitzenstein frank at goldissue.com
Tue May 22 10:00:51 EDT 2007


Hello Carl,

I found time to try out your suggestion. I added these two lines to the
very top of /special_pages/missing.html:

[tag op=header]Status: 404 missing[/tag]
[tag op=header]Content-type: text/html[/tag]

and sure enough, when I tried to visit the page /jjjjjjjjjjjjjjjj on my
shopping cart, the Apache access log showed the following:

192.168.1.2 - - [22/May/2007:21:47:45 +0800] "GET /jjjjjjjjjjjjjjjj
HTTP/1.1" 404 7242 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
rv:1.7.8) Gecko/20050511"
192.168.1.2 - - [22/May/2007:21:47:45 +0800] "GET
/images/foundation/topleft.jpg HTTP/1.1" 304 -
"http://www.aussievitamin.com/jjjjjjjjjjjjjjjj" "Mozilla/5.0 (Windows;
U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511"
192.168.1.2 - - [22/May/2007:21:47:45 +0800] "GET
/images/foundation/ssl.gif HTTP/1.1" 304 -
"http://www.aussievitamin.com/jjjjjjjjjjjjjjjj" "Mozilla/5.0 (Windows;
U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511"
192.168.1.2 - - [22/May/2007:21:47:45 +0800] "GET
/images/foundation/clear.gif HTTP/1.1" 304 -
"http://www.aussievitamin.com/jjjjjjjjjjjjjjjj" "Mozilla/5.0 (Windows;
U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511"
192.168.1.2 - - [22/May/2007:21:47:45 +0800] "GET
/images/foundation/go.gif HTTP/1.1" 304 -
"http://www.aussievitamin.com/jjjjjjjjjjjjjjjj" "Mozilla/5.0 (Windows;
U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511"
...and the missing page displays as it should.
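
For anyone who wants to check the response headers directly rather than
reading the Apache log, a minimal Python sketch along these lines should
work. The function names are invented for illustration and are not part
of Interchange; the sketch uses only the standard library and assumes the
site is reachable over plain HTTP.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return (status_code, content_type) for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.headers.get("Content-Type")
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the exception
        # still carries the status code and the response headers.
        return err.code, err.headers.get("Content-Type")


def looks_like_proper_404(url):
    """True if the URL answers with status 404 and declares an HTML body."""
    status, content_type = fetch_status(url)
    return status == 404 and (content_type or "").lower().startswith("text/html")
```

Calling looks_like_proper_404() with the address of any nonexistent page
on the shop, such as the /jjjjjjjjjjjjjjjj URL above, should return True
once both header lines are present in missing.html.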

I would be interested to hear what others have to say about this. At
first glance there appears to be an advantage in not letting the spiders
know that a page is missing. It is said that Google punishes sites with
broken links.

However, as the complexity of what I am doing increases, I am finding
that Google is spidering pages which no longer exist, and with
"supplemental results" they are piling up in its index. Of course, if
you have never renamed or deleted a page on your site, this doesn't
really apply.
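
To see which nonexistent pages the spiders keep requesting, the access
log can be tallied by path. The following is a rough Python sketch, not
a finished tool. It assumes the log is in Apache's combined format with
one entry per line (the entries quoted above are wrapped by the mail
client), and the names LOG_LINE and count_404_paths are made up for this
example.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "METHOD path PROTOCOL" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def count_404_paths(lines):
    """Count how often each requested path was answered with a 404."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not parse are skipped rather than treated as errors.
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts
```

Passing an open log file to count_404_paths() and looking at
most_common() on the result should give a list of renamed or deleted
pages that are still being requested. The location of the log file
depends on the Apache setup.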

I intend to use this form of missing.html from now on.

Regards,

Frank Reitzenstein




Carl Bailey wrote:

> On May 20, 2007, at 12:01 PM, Frank Reitzenstein wrote:
>
>> Subject: [ic] /special_pages/missing.html
>> To: interchange-users at icdevgroup.org
>>
>> Hello,
>>
>> I am running Interchange 5.2 on Fedora Core 5.
>>
>> I have searched for posts regarding missing pages because I am coming to
>> the conclusion that if the server doesn't return 404 for a page which is
>> no longer on the site, the search engines go on indexing many more
>> pages than are visible on the site, for a very long time.
>>
>> We are told that Google Page Rank gets distributed amongst our pages,
>> and hence in this case to pages which don't exist. The danger is that,
>> with PR so diluted, the vast majority of pages end up in
>> "Supplemental Results".
>>
>> My page /special_pages/missing.html now consists of only:
>>
>> [tag op=header]Status: 404 missing[/tag]
>>
>> and I am pleased that finally the search engines will record that the
>> missing pages no longer exist. It will also help me when I use the new
>> Google webmaster tool to remove pages. However at the moment the user
>> sees a blank page instead of a standard 404 message.
>>
>> I notice that a few people have discussed this, and some have found a
>> solution. I don't use Mod::Interchange as far as I am aware. From memory
>> there used to be only a private version for Debian users.
>>
>> I could spend hours trying to hack something. I am hoping that someone
>> has an easy answer ;)
>>
>> Regards,
>>
>> Frank Reitzenstein
>
>
>
> We tried this out, and found that we could get the "normal" missing
> page to display if we also added:
>
> [tag op=header]Content-type: text/html[/tag]
>
> Then Apache would serve the page with the correct status code and the
> page displayed exactly as expected.  Before adding the content-type,
> some browsers wanted to download the file returned by the server and
> save it to disk because they received a content type of
> application/x-executable.  The only unexpected thing was that the 404
> does not show up in the Apache error log file, only in the access
> log.  This makes sense insofar as Apache successfully finds and
> returns a page.  The 404 is not truly an Apache error, hence no entry
> in the error log.  This might confuse some log analyzer statistics
> packages.
>
> We are using IC 5.4.1 with Apache 1.3.33 and no mod_interchange.
>
> Regards,
> Carl
> + - - - - - - - - - - - - - +
> | Carl Bailey
> | Triangle Research, Inc.
> + - - - - - - - - - - - - - +
>

