[ic] Froogle.google.com anyone using this yet?

Philip S. Hempel interchange-users@icdevgroup.org
Fri Dec 20 00:31:02 2002


Has anyone started using Froogle?
http://froogle.google.com

I have an account and am about to create a feed in the format
described in the PDF that Google sent me.

One of the requirements is that the descriptions in the data upload
must not contain any HTML.

This poses a problem, since many of the descriptions we use
contain some form of HTML, and many exceed the required text length
for descriptions.

Here is the description of the requirements:

Basic File Format
  The basic file format has the following required parameters:
  Tab-delimited text file
  First line of the file is the header – must contain field names, all lower-case
  Use the field names from the table below, and in the same column order
  One line per item (use a newline or carriage return to terminate the line)
  File encoding is LATIN1 (ASCII is fine, as it is a subset of LATIN1)
  The following field elements are forbidden as part of the basic format. If you 
want to include them, you must use the extended format. If you accidentally 
include them as part of the basic format, products that contain errors will be 
dropped from the feed.
  Tabs, carriage returns, or newline characters may not be included inside any
field, including the description.
  Exactly one tab must separate each field. If there are extra tabs inserted
between fields in a line, or at the end of a line, that product will be dropped.
  HTML tags, comments, and escape sequences may not be included –
description must be plain text.
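
As a rough pre-upload check against the rules quoted above, something like
the following Python sketch could be run over the finished file. This is
only my reading of the quoted rules, not Google's own validator. The name
check_feed and the HTML_LIKE pattern are made up for illustration, and the
sketch does not check field names or column order, since that table is not
reproduced here.

```python
import re

# Crude pattern for anything that looks like a tag, comment, or entity.
# It can produce false positives (e.g. "a < b > c"); treat hits as warnings.
HTML_LIKE = re.compile(r"<[^>]+>|&#?\w+;")


def check_feed(text):
    """Return a list of human-readable problems found in a feed string."""
    lines = text.splitlines()
    if not lines:
        return ["feed is empty"]

    problems = []
    header = lines[0].split("\t")
    if lines[0] != lines[0].lower():
        problems.append("line 1: header is not all lower-case")

    for num, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        # An extra tab between fields or at the end of a line changes
        # the field count, so comparing with the header catches it.
        if len(fields) != len(header):
            problems.append(
                f"line {num}: expected {len(header)} fields, found {len(fields)}"
            )
        if HTML_LIKE.search(line):
            problems.append(f"line {num}: looks like HTML or an escape sequence")

    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        problems.append("feed contains characters outside LATIN1")
    return problems
```

An empty list means none of these particular checks fired; it does not
guarantee that the feed will be accepted.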

I am considering a couple of ways to do this and need some suggestions.

1. Do a SQL dump of the fields I need and run a script over the data
to clean out HTML and any other characters that are not allowed.

2. Work this out within IC and have IC produce the clean descriptions.
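
For option 1, the cleanup script might look roughly like this Python
sketch. It uses only the standard library. The function name
clean_description, the BLOCK_TAGS set, and the max_len default of 250 are
my own placeholders; the real length limit has to come from the Froogle
PDF.

```python
import re
from html.parser import HTMLParser

# Tags that should leave a space behind so adjacent words do not fuse.
BLOCK_TAGS = {"br", "p", "div", "li", "tr", "td", "h1", "h2", "h3"}


class _TextExtractor(HTMLParser):
    """Keeps text nodes; tags and comments are dropped."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def clean_description(raw, max_len=250):
    """Return plain text without tags, entities, tabs, or line breaks.

    max_len is a placeholder, not Froogle's documented limit.
    """
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    text = "".join(parser.parts)
    # Tabs, carriage returns, and newlines all collapse to single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Characters that LATIN1 cannot represent become "?".
    text = text.encode("latin-1", "replace").decode("latin-1")
    if len(text) > max_len:
        cut = text[:max_len + 1]
        # Prefer cutting at the last word boundary that still fits.
        text = cut.rsplit(" ", 1)[0] if " " in cut else text[:max_len]
    return text
```

Each cleaned field could then be joined with "\t".join(...) and written to
a file opened with encoding="latin-1". Cutting at a word boundary is a
design choice on my part; a hard cut at max_len would also satisfy a pure
length limit.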

I would almost think that it would be easier with IC, given its many ways
of filtering content.

Just one suggestion for the Dev group: if a tag supporting Froogle could be
built into IC 5.0, it would be a great selling point for IC.

* "IC is Froogle friendly: automatic output of the data structure required
by Google's Froogle product search engine, with support for both the basic
and extended formats used for Froogle uploads." *

I would like some comments on which direction would be the easiest way to
start, and on what caveats I could run into with either one.

Suggestions would be appreciated.

-- 
Philip S. Hempel