[ic] search engine indexing scan/ MM=0f73bb47ac44f4e422.....
Jon Jensen
jon at endpoint.com
Fri Jun 16 18:48:09 EDT 2006
On Fri, 16 Jun 2006, Jon wrote:
>> I just noticed that Google is reindexing our site after the upgrade to IC 5.4
>>
>> Among the normal results are some of these:
>>
>> www.mrlock.com/eshop/locks/scan/
>> MM=0f73bb47ac44f4e422fab7057f73d0c0:250:299:50.html?mv_more_ip=1&mv_n...
>> - 58k -
>> <http://64.233.187.104/search?q=cache:1wYAQPUy_g4J:www.mrlock.com/eshop/locks/scan/MM%3D0f73bb47ac44f4e422fab7057f73d0c0:250:299:50.html%3Fmv_more_ip%3D1%26mv_nextpage%3Dresults%26mv_arg%3D+cat+60+lock&hl=en&gl=us&ct=clnk&cd=3-
>> <http://www.google.com//search?hl=en&lr=&q=related:www.mrlock.com/eshop/locks/scan/MM%3D0f73bb47ac44f4e422fab7057f73d0c0:250:299:50.html%3Fmv_more_ip%3D1%26mv_nextpage%3Dresults%26mv_arg%3D>Similar
>> pages
>>
>> If I click on the link on the Google site, it returns nothing, but
>> if I click on their cached page it does show the result the spider
>> obtained originally.
>
> Those appear to be the timed-build pages, I think. I see the
> same on my site where there is page forward/back via the [more-list]
> tag; some magic under there somewhere. When Google crawls and picks
> up those pages they exist, but when you click the links later the
> page is gone because, I assume, it has expired and needs to be created
> again on the fly. How to circumvent this for Google in particular I do
> not know, but I wish I did, since I've got the same problem. I think this
> was discussed and explained some time ago, but I've not been able to
> find it in the archives.
I haven't done this before, but it should work:

Set up your RobotUA etc. to detect Googlebot (it is on by default). That
sets the CGI variable mv_tmp_session when the visitor is a robot.

On the page where you're using [more-list], set the matchlimit to a very
big number so that all the results fit on one page, e.g. ml=10000. Then
when a search engine indexes the page, it will get all the content at
once, and no more-list pages that stop working later.
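
Roughly, the two pieces might look like the sketch below. It is as untested
as the suggestion itself. The RobotUA list is abbreviated and only
illustrative, and the search parameters (a "products" file, a "category"
field, the term "lock") and the link text are made-up placeholders; only
ml=10000 comes from this thread.

```
# interchange.cfg (global configuration): user agents treated as robots.
# Illustrative, shortened list; keep whatever your installed config has.
RobotUA <<EOR
    Googlebot, Slurp, msnbot
EOR

<!-- On the catalog page: a hypothetical search link.
     fi = search file, sf = search field, se = search term (placeholders),
     ml = matchlimit, set high so every result lands on a single page. -->
<a href="[area search="fi=products/sf=category/se=lock/ml=10000"]">All locks</a>
```

With a matchlimit larger than the number of matches, the results page should
have no next/previous links for the spider to follow, so nothing that
expires later gets indexed.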
Jon
--
Jon Jensen
End Point Corporation
http://www.endpoint.com/