CDATA Output Sanitization not working
-
Hello,
I need to create a product feed for Trovaprezzi in XML. The Merchant requires all the verbose tags to be encoded as XML entities, using the CDATA attribute.
Although I have correctly selected CDATA as the Output Sanitization, all the special characters in the resulting XML feed are encoded as HTML entities, including the “<” and “>” of the CDATA attribute itself!
I have included the link to the XML feed above.
Can you please help me?
Thanks,
Salvatore
The page I need help with: [log in to see the link]
-
Hi, we are looking into it right away.
Also, can you send us a copy of the instructions and a sample template of the Trovaprezzi feed? Then we can compare them and work out a solution accordingly.
Thanks.
Hi Sultan,
thanks for getting back.
Unfortunately, Trovaprezzi does not provide a sample feed and the technical requirements are only in Italian.
However, the issue is not with the feed format, but with the CDATA attribute converting the text into HTML entities (which it should not do).
This is the output I get when I select CDATA as the Output Sanitization:
<link><![CDATA[%3c!%5bCDATA%20%5bhttps://www.joiejolie.it/gioielli/anello-trilogy-diamanti/?attribute_pa_lega-metallo=argento-925&attribute_pa_colore-metallo=bianco&attribute_pa_pietre=zircone-bianco%5d%5d%3e]]></link>
<brand><![CDATA [JOIE JOLIE]]></brand>
<description><![CDATA [Disponibile in Argento 925 con zirconi e in Oro 18Kt/750 con zirconi o diamanti.
Pietre num. 3 – Pietra centrale 0,17 carati – Pietre laterali 0,05 carati cad. Tot. 0,27 carati
Caratteristiche Diamante: G/HColor – Taglio Brillante
Il gioiello è corredato di scatola e certificato di garanzia.]]></description>
And this is, instead, how it should look:
<link><![CDATA[https://www.joiejolie.it/gioielli/anello-trilogy-diamanti/?attribute_pa_lega-metallo=argento-925&attribute_pa_colore-metallo=bianco&attribute_pa_pietre=zircone-bianco]]></link>
<brand><![CDATA [JOIE JOLIE]]></brand>
<description><![CDATA [Disponibile in Argento 925 con zirconi e in Oro 18Kt/750 con zirconi o diamanti.
Pietre num. 3 – Pietra centrale 0,17 carati – Pietre laterali 0,05 carati cad. Tot. 0,27 carati
Caratteristiche Diamante: G/HColor – Taglio Brillante
Il gioiello è corredato di scatola e certificato di garanzia.]]></description>
I reckon (and hope) it’s a quick fix.
Thanks,
Salvatore
Hi Sultan, do you have any answers for me?
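Editor's note: the gap between the two outputs quoted above is consistent with the CDATA wrapper being added first and an escaping step running over it afterwards. The following is only a minimal Python sketch of that ordering problem, not the plugin's actual code; the helper names `wrap_cdata` and `element` and the `example.com` URL are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_cdata(text: str) -> str:
    """Wrap text in a CDATA section.

    A literal ']]>' inside the text is split across two sections so it
    cannot close the section early.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def element(tag: str, text: str, use_cdata: bool) -> str:
    """Build one element: either CDATA-wrapped or entity-escaped, never both."""
    body = wrap_cdata(text) if use_cdata else escape(text)
    return f"<{tag}>{body}</{tag}>"


# Hypothetical product URL containing '&', which is legal as-is inside CDATA.
url = "https://example.com/p/?a=1&b=2"

# Intended result: the wrapper is the last step, the content is untouched.
good = element("link", url, use_cdata=True)

# Faulty ordering: escaping runs after wrapping, so the wrapper's own
# '<' and '>' and the '&' in the URL are all turned into entities.
bad = "<link>" + escape(wrap_cdata(url)) + "</link>"

# An XML parser returns the original text from the intended form.
parsed = ET.fromstring(good).text
```

With this ordering `good` keeps the URL verbatim inside the section, while `bad` starts with `<link>&lt;![CDATA[`, which matches the kind of symptom described in the post.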
Hi @joiejolie , sorry for the delayed response. We are working on a fix as soon as possible.
Please give us a couple of days.
Hope you understand.
Thanks for being patient.
Because of you. I’m sure you will find the solution.
Good job. I’m waiting.
Hi @joiejolie , we have now fixed the CDATA sanitization in the latest update.
By default, there should be no CDATA added for any attribute (except for Google Shopping and Facebook Ads feed).
So for Trovaprezzi, if you generate the feed, then there will be no CDATA.
If you need to add CDATA, then you have to use the output sanitization for it.
So update the plugin and test it out.
Do let us know if your issue is solved.
Thanks.
Hi @sultan00rextheme,
there has been an improvement, but the bug is not completely fixed yet.
Let’s start with the fact that I MUST use CDATA, because some fields (Product Name and Description, for example) contain special characters, such as accented vowels or apostrophes, which are treated as HTML entities and encoded if I don’t wrap them in CDATA.
As for your comment “If you need to add CDATA, then you have to use the output sanitization for it.”, that is exactly what I have been trying to do and that’s when I discovered the bug.
I am sharing 4 screenshots via DropBox:
https://www.dropbox.com/sh/kkpusb783utv599/AACCuJ0yblokvyJyyAzjvetPa?dl=0
showing the configuration and the respective XML output with and without the CDATA sanitization.
As you can see:
when I don’t use the CDATA, the special chars are HTML encoded always (as expected)
when I use the CDATA, not only are the special chars still HTML encoded, but the <>[] chars of the tag itself are HTML encoded too!
Can you please advise?
Thanks
Salvatore
Hi, can you send us these screenshots via email to [email protected]?
Then we can give faster support.
Also include the feed link for the feed where this error is happening.
Thanks.
Sent
Hi @joiejolie , we checked the feed links. If you open the XML files directly on your browser, preferably Google Chrome, you will see the format is correct.
I believe you opened the file with a code editor.
Most code editors, when you try to edit XML, will convert special characters into HTML, but it’s different when you open it with an XML viewer, which does not alter the format.
Also, after you download the XML file, do not open it with Notepad or without a proper XML viewer, or it will break its format. (Besides the browser, you can use Sublime Text for viewing XML files on a PC.)
If you are generating this feed for a particular marketplace, then try submitting the XML file you downloaded without editing it, and get feedback from them. If they really accept XML, then the one generated with CDATA should not have its special characters changed.
If they do get changed to HTML after you submit the feed to the market place in question, then please collect their instructions on feed data specifications and give it to us. We will find out what is going wrong.
Do let us know. Thanks.
Hi Sultan,
I’m not sure why you suggest using an XML viewer that, in fact, applies active decoding, defeating the purpose of using CDATA to treat the text as XML entities instead of HTML entities.
I use Notepad++ to view the files (which treats the stream as pure text, as it should) and I can see the text is encoded in HTML entities.
I have already submitted the feed to the merchant, and the complaint comes from them. They can’t upload the feed as it contains HTML entities.
Let me put it differently:
If my product description contains the following text:
Anello Fascia Riviera realizzato in oro bianco 9kt,
con 12 Diamanti naturali Tot. 0.12 carati, Misura 13.
Corredato di scatola gioielleria e certificato di garanzia.
In my XML, using CDATA, this is what I am expecting:
<description><![CDATA[Anello Fascia Riviera realizzato in oro bianco 9kt, con 12 Diamanti naturali Tot. 0.12 carati, Misura 13. Corredato di scatola gioielleria e certificato di garanzia.]]></description>
and, instead, what I get from your plug in is:
<description><![CDATA[Anello Fascia Riviera realizzato in oro bianco 9kt,
con 12 Diamanti naturali Tot. 0.12 carati, Misura 13.
Corredato di scatola gioielleria e certificato di garanzia.]]></description>
Can you fix this or not?
Thanks,
Salvatore
Sorry: and, instead, what I get from your plug in is:
<description><![CDATA[Anello Fascia Riviera realizzato in oro bianco 9kt,
con 12 Diamanti naturali Tot. 0.12 carati, Misura 13.
Corredato di scatola gioielleria e certificato di garanzia.]]></description>
and, instead, what I get from your plug in is:
>description><![CDATA[Anello Fascia Riviera realizzato in oro bianco 9kt,
con 12 Diamanti naturali Tot. 0.12 carati, Misura 13.
Corredato di scatola gioielleria e certificato di garanzia.]]></description>
and, instead, what I get from your plug in is:
<description><!
[CDATA[Anello Fascia Riviera realizzato in oro bianco 9kt,
con 12 Diamanti naturali Tot. 0.12 carati, Misura 13.
Corredato di scatola gioielleria e certificato di garanzia.]]
></description>
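Editor's note: since the forum display decodes entities, the symptom reported above is easier to confirm on the raw feed text than in a rendered view. The following Python sketch scans raw text for an escaped CDATA opener and for entity references inside CDATA sections. The function name `find_cdata_problems` and the sample strings are hypothetical, and an entity-looking sequence inside CDATA can occasionally be intended literal text, so treat hits as hints rather than proof.

```python
import re

# Matches one CDATA section and captures its content (non-greedy).
CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)

# Matches named, decimal and hexadecimal entity references such as
# &egrave; &#232; or &#xE8; (the trailing ';' is required).
ENTITY_RE = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);")


def find_cdata_problems(raw: str) -> list[str]:
    """Return human-readable findings for a raw (undecoded) XML feed string."""
    problems = []
    # The wrapper itself was escaped, so it is no longer a CDATA section.
    if "&lt;![CDATA[" in raw:
        problems.append("escaped CDATA opener found")
    # Entity references inside a real CDATA section are not decoded by
    # XML parsers, so they reach the consumer as literal text.
    for match in CDATA_RE.finditer(raw):
        entities = sorted(set(ENTITY_RE.findall(match.group(1))))
        if entities:
            problems.append("entities inside CDATA: " + ", ".join(entities))
    return problems


# Hypothetical samples modelled on the description field in this thread.
raw_ok = "<description><![CDATA[Il gioiello è corredato di scatola.]]></description>"
raw_bad = "<description><![CDATA[Il gioiello &egrave; corredato di scatola.]]></description>"
raw_escaped = "<brand>&lt;![CDATA[JOIE JOLIE]]&gt;</brand>"

ok_findings = find_cdata_problems(raw_ok)
bad_findings = find_cdata_problems(raw_bad)
escaped_findings = find_cdata_problems(raw_escaped)
```

To use it on a downloaded feed, read the file as text (for example with `open(path, encoding="utf-8").read()`) and pass the result in; a plain `&` inside a URL is not flagged because no `;` terminates it.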
Hi @joiejolie , we sent you a copy of the plugin with the fix. Please let us know if that solves the issue. Thanks.
- The topic ‘CDATA Output Sanitization not working’ is closed to new replies.