 Spiral Scripts Support Forum :: Virtuemart Extensions
 Subject :GoogleBaseXML bug on xml and txt.. 04-11-2010 20:58:52 
webmastergreg
Fresher
Joined: 04-11-2010 19:50:45
Posts: 5
Location
Hello

Congrats for this excellent and simple component.

We just have one issue with the feed (and so with the txt file too): the content inside the feeds is not enclosed in CDATA tags.

And HTML entities are not parsed correctly.

I can send you two distinct feeds one ok (with another extension) and yours, for comparison.

Thanks
 Subject :Re:GoogleBaseXML bug on xml and txt.. 06-11-2010 08:03:06 
boggler
Spiral Scripts Support
Joined: 18-08-2009 10:14:13
Posts: 211
Location
Hi, I don't believe you are correct that the information should be contained in CDATA tags; see the Google data feed specification at http://www.google.com/support/merchants/bin/answer.py?answer=188494

Could you give me a specific example where you are having problems? It sounds as if there is a problem somewhere with the entity encoding, which I would think can be fixed.

Can you give me the url of your product feed? That might help.
 Subject :Re:GoogleBaseXML bug on xml and txt.. 06-11-2010 13:41:19 
boggler
Spiral Scripts Support
Joined: 18-08-2009 10:14:13
Posts: 211
Location
To expand upon my previous answer, I am sure that it would be incorrect to enclose the data in cdata tags. The cdata tag is basically an xml version of the html comment tag <!-- a comment -->; it means that any text inside the tags will be ignored.

It is true that if you enclose any data that contains html in cdata tags then it will prevent validation errors - but this is not how you want to handle html tags in the data.

The correct way to do this is to use html entity encoding, which is what we do. For example, < will become &lt;

This is the method that Google recommend in their documentation, and is the standard for RSS.
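
As a minimal sketch of the entity encoding described above, assuming a description string that contains HTML markup (the sample text and the variable names are hypothetical, not taken from the component):

```php
<?php
// Hypothetical product description containing HTML markup.
$desc = '<p>Banner stand & carry bag</p>';

// htmlspecialchars() replaces the characters that are special in
// XML/HTML (&, <, > and quotation marks) with entity references.
$encoded = htmlspecialchars($desc, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";
// &lt;p&gt;Banner stand &amp; carry bag&lt;/p&gt;
```

The encoded string can then be placed inside an XML element without the markup being read as part of the feed's own structure.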

You need to remember what the product feed is for - to generate a product listing on Google product search. If you enclose the data in cdata tags, then Google will see this as an empty listing, because that is what the cdata tag implies.

I have to say that if you have an example of a data feed where the data is enclosed in cdata tags then that is incorrectly formatted, not ours.

If you think that there is a problem with the product feed generated by our component, though, do let me know the URL. I will then be able to check whether something is going wrong with the entity encoding.
Last Edited On: 06-11-2010 13:43:09 By boggler for the Reason
 Subject :Re:GoogleBaseXML bug on xml and txt.. 06-11-2010 17:06:43 
webmastergreg
Fresher
Joined: 04-11-2010 19:50:45
Posts: 5
Location
Hello

Thanks for your reply.

The CDATA point is fine; I only mentioned it just in case (I should have added a question mark to that one, sorry).

My real issue is the HTML entities.

Look at our feed with your extension:
http://www.materiel-grand-format.fr/index2.php?option=com_googlebasexml&format=xml&product_currency=EUR&product_country=FR

And here is the feed from another extension for Google Base (which we don't want to use, as yours is better):
http://www.materiel-grand-format.fr/plugins/system/vmrss/all_products.xml

So there is just something wrong with the HTML entities; all the product descriptions without HTML tags are fine.

Thanks
Last Edited On: 06-11-2010 17:08:33 By webmastergreg for the Reason
 Subject :Re:GoogleBaseXML bug on xml and txt.. 07-11-2010 18:00:19 
webmastergreg
Fresher
Joined: 04-11-2010 19:50:45
Posts: 5
Location
Hello, I've managed to fix this issue as follows.

The file:
com_googlebasexml/models/googlebasexml.php

I've replaced (line 573):
Code:
$items[$i]->description = htmlspecialchars($desc);

By:
Code:
$items[$i]->description = $desc;

However, this is not a good fix, because some special characters are then not encoded.

The real issue here is that the ampersand is encoded twice, so the & of every HTML entity is wrongly encoded as &amp;.
But I'll let you look into this more deeply.
Last Edited On: 07-11-2010 19:52:55 By webmastergreg for the Reason
 Subject :Re:GoogleBaseXML bug on xml and txt.. 07-11-2010 20:35:41 
webmastergreg
Fresher
Joined: 04-11-2010 19:50:45
Posts: 5
Location
OK.

In the end I just added this:
Code:
$items[$i]->description = str_replace('&amp;', '&', $desc);

This replaces all the doubly encoded ampersands, and now the feed is OK.
I will test that on Google Merchant.

So htmlspecialchars is perhaps not the perfect way to retrieve the description.
A preg_replace with arrays could work too, in my opinion.

Just tell me what you think
 Subject :Re:GoogleBaseXML bug on xml and txt.. 08-11-2010 10:32:38 
boggler
Spiral Scripts Support
Joined: 18-08-2009 10:14:13
Posts: 211
Location
I know it seems weird, but the double encoding of the '&' is actually correct! Remember that when the description is eventually displayed as a product listing it will be displayed as HTML, after one round of decoding, so

&amp;amp;

will become

&amp;

which will display correctly as an ampersand character when viewed as HTML.

So there really is no reason to worry about this, although I can well understand why you are confused!
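
To illustrate the round trip, here is a small sketch. The sample description is hypothetical, and htmlspecialchars_decode() only stands in for the single round of decoding that happens when the XML feed is read:

```php
<?php
// Hypothetical description that already contains HTML entities.
$desc = 'Caf&eacute; &amp; bar';

// Encoding for the XML feed escapes every ampersand a second time.
$inFeed = htmlspecialchars($desc, ENT_QUOTES, 'UTF-8');
// $inFeed is 'Caf&amp;eacute; &amp;amp; bar'

// One round of decoding restores the original HTML, entities intact.
$asHtml = htmlspecialchars_decode($inFeed, ENT_QUOTES);
// $asHtml is 'Caf&eacute; &amp; bar'

var_dump($asHtml === $desc); // bool(true)
```

Viewed directly in a browser's feed preview the doubly encoded text looks wrong, but after that one decoding step it is ordinary HTML again.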

However, since you have drawn my attention to it, I have noticed that there is in fact a bug in the character encoding, as

$items[$i]->description = htmlspecialchars($desc);

should be
$items[$i]->description = htmlspecialchars($desc, ENT_QUOTES);

to ensure that the single quotation mark is correctly encoded.

I will make a new release which fixes this problem.
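
The difference the flag makes can be sketched as follows. The sample string is hypothetical. ENT_COMPAT is passed explicitly here because it was the default before PHP 8.1, so it reproduces what the one-argument call did at the time:

```php
<?php
// Hypothetical description containing both kinds of quotation mark.
$desc = 'It\'s a 2" clip';

// ENT_COMPAT encodes double quotes only; the single quote is left as-is.
$compat = htmlspecialchars($desc, ENT_COMPAT, 'UTF-8');
// $compat is: It's a 2&quot; clip

// ENT_QUOTES encodes single quotes as well.
$quotes = htmlspecialchars($desc, ENT_QUOTES, 'UTF-8');
// $quotes is: It&#039;s a 2&quot; clip

echo $compat, "\n", $quotes, "\n";
```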
 Subject :Re:GoogleBaseXML bug on xml and txt.. 08-11-2010 14:14:53 
boggler
Spiral Scripts Support
Joined: 18-08-2009 10:14:13
Posts: 211
Location
There is a new release, version 1.0.3, which deals with the htmlspecialchars ENT_QUOTES problem mentioned above.

You can update by downloading again using your existing download link, then uploading and installing with the Joomla installer; there is no need to uninstall first.

Glad you like the component, we've worked hard to make this a useful product.
Last Edited On: 08-11-2010 14:16:24 By boggler for the Reason
 Subject :Re:GoogleBaseXML bug on xml and txt.. 08-11-2010 17:12:17 
webmastergreg
Fresher
Joined: 04-11-2010 19:50:45
Posts: 5
Location
Hi

Yes, you were right: the &amp; is OK. I only noticed that after the submission was accepted by Google Merchant.
I was obsessed with the Firefox feed rendering (I've had so many issues with feeds in the past).

So that's fine. I will test the new version and get back to you.

I must say that at this point (so with the first version) 21 of my 74 products were rejected for bad characters.
I'll tell you more about that after my tests.

Thanks a lot
 Subject :Re:GoogleBaseXML bug on xml and txt.. 09-11-2010 11:19:02 
boggler
Spiral Scripts Support
Joined: 18-08-2009 10:14:13
Posts: 211
Location
The bad characters may arise if you have pasted the product descriptions from a word processor and they contain special characters; these can result in invalid HTML.

Usually they will display OK on a web page, as modern web browsers are very forgiving of validation problems, but they will cause problems with XML, whose validation rules must be strictly followed. If you are having these problems, you could try running the offending items through an HTML validator.
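
One possible way to find the offending descriptions is sketched below, under the assumption that the bad characters are either bytes that are not valid UTF-8 or control characters that XML 1.0 forbids. The function name is_xml_safe and the sample strings are hypothetical and not part of the component:

```php
<?php
// Returns true when $text is valid UTF-8 and contains only characters
// that XML 1.0 allows in a document.
function is_xml_safe(string $text): bool
{
    // With the /u modifier preg_match() returns false on invalid UTF-8.
    if (preg_match('//u', $text) !== 1) {
        return false;
    }
    // Reject anything outside the XML 1.0 character ranges
    // (tab, line feed, carriage return, and the printable ranges).
    $invalid = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    return preg_match($invalid, $text) === 0;
}

// Plain text passes.
var_dump(is_xml_safe('Roll-up banner, 85 x 200 cm'));
// Windows-1252 "smart quote" bytes are not valid UTF-8, so this fails.
var_dump(is_xml_safe("\x93quoted\x94"));
// A vertical tab is valid UTF-8 but not allowed in XML 1.0.
var_dump(is_xml_safe("bad\x0Bchar"));
```

Running each description through a check like this would narrow the rejected products down to the ones that need re-typing or cleaning.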