some characters escaped in xml, others not?

lauren.fraser

30 Mar, 2015 05:05 PM

Any idea why some of the special characters entered in a text field are escaped in the xml and others are not?

entered in text field in data definition:

<a href ="http://citizensmemorial.adam.com/content.aspx?productId=117&isArticleLink=false&pid=60&gid=000759">Evaluation & treatment of cardiovascular disease</a>

xml:

<interest>&lt;a href ="http://citizensmemorial.adam.com/content.aspx?productId=117&amp;isArticleLink=false&amp;pid=60&amp;gid=000759">Evaluation &amp; treatment of cardiovascular disease&lt;/a></interest>
  1. Support Staff 1 Posted by Tim on 30 Mar, 2015 05:38 PM

    Hi Lauren,

    Can you tell me how you are outputting the value you pasted for the XML? For example, is this coming from an XSLT Format or is it from a Velocity Format?

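    XML predefines escapes for the five characters below. Strictly, only & and < must always be escaped in text content; >, " and ' need escaping only in certain contexts (for example, a quote inside an attribute value delimited by the same quote), so a serializer may leave them unescaped: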

    &
    <
    >
    "
    '
    
    So, if the data you entered into a text field is being indexed by an Index Block, for example, the Index Block must escape the entire field to make sure that the Block is valid XML. I believe an XSLT Format would also automatically escape the entire value here because XSL must contain valid XML as well.
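As a rough illustration of why some characters end up escaped and others do not, here is a minimal sketch using Python's standard library. It is not Cascade's serializer, and the sample string is made up; it only shows that a typical escaping routine treats quotes differently from &, < and >.

```python
from xml.sax.saxutils import escape

# Hypothetical sample text, not taken from the thread.
raw = 'Tom & Jerry <b>"quoted"</b>'

# Default behaviour: only &, < and > are escaped; quotes are left alone.
text_escaped = escape(raw)

# Quotes can be escaped as well by passing extra entities,
# which is what you would want inside an attribute value.
fully_escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})

print(text_escaped)
print(fully_escaped)
```

Which characters a given tool escapes in text content is a choice made by that tool, so output with a mix of escaped and unescaped characters can still be valid XML.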
  2. 2 Posted by lauren.fraser on 31 Mar, 2015 10:08 AM

    Hi Tim,
    I don't have a format applied, just the default xml template:

    <xml> 
    <system-region name= "DEFAULT" />
    </xml>
    

    The data is currently just being entered into a data definition that is applied to a page (we're using web services to update the page).

    Is there a simple "default" velocity (or xslt) format that could be applied to all configuration sets?

  3. Support Staff 3 Posted by Tim on 01 Apr, 2015 02:47 PM

    Gotcha. In that case that is going to be the expected outcome.

    Is there a simple "default" velocity (or xslt) format that could be applied to all configuration sets?

    Not that I'm aware of. What would you be looking for this particular Format to do?

  4. 4 Posted by lauren.fraser on 01 Apr, 2015 06:41 PM

    Tim,

    We have a txt file that we export out of our credentialing software and we’re going to use it to feed the new provider pages on our redesigned site via web services.

     

    We tried using bricks to match the providers' areas of interest with links to our health content (so a cardiologist with a treadmill stress test as an area of interest would link to a page with info about treadmill stress testing), but we had too many distinct links that needed to be built.

     

    Ryan suggested using external links and an index block, which is doable, but we wondered about the time and performance cost of going through more than 350 links to find the 25 or fewer that match for a specific provider. We have an area we can store the links in and export it to our software. I could pull the area of interest text and the link separately, then combine them in the page format, but I was hoping to avoid that and just pass the needed HTML code to the data definition.

     

    I had tested with one area and used a replace function in Velocity to convert the escaped characters back to <, >, &, etc., and was able to get it to work. But when I added the complete set for my test provider, the HTML showed differently (some characters escaped, some not) and I got this error message:

    An error occurred: Could not transform with Script format "_internal/formats/providers/provider-listing-links": Error on line 76: The reference to entity "isArticleLink" must end with the ';' delimiter.

    You may choose to retry the operation. If the problem persists, please contact a system administrator.

    The error has been logged to the system console.

     

    Here is the xml link sample:

    &lt;a href ="http://citizensmemorial.adam.com/content.aspx?productId=117&amp;isArticleLink=false&amp;pid=60&amp;gid=000759">Evaluation &amp; treatment of cardiovascular disease&lt;/a>

     

    And a link to the xml file we’re reading:

    https://www.citizensmemorial.com/Temp/providerWebDataLinks.xml

    Which doesn’t have the characters escaped in it.

     

    Thanks,

    Lauren

  5. Support Staff 5 Posted by Tim on 01 Apr, 2015 09:07 PM

    Hey Lauren,

    Thanks for the additional information.

    The error you mentioned:

    An error occurred: Could not transform with Script format "_internal/formats/providers/provider-listing-links": Error on line 76: The reference to entity "isArticleLink" must end with the ';' delimiter.
    
    is because the ampersand wasn't being escaped properly.
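The same kind of failure can be reproduced outside Cascade. This is a minimal sketch with Python's standard XML parser, assuming a shortened version of the link from the thread; the helper name `is_well_formed` is made up for illustration, and the exact error wording differs from the one Cascade reports.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Bare ampersand: the parser reads "&isArticleLink" as the start of an
# entity reference and then fails because no ';' follows.
bad = "<interest>content.aspx?productId=117&isArticleLink=false</interest>"

# The same text with the ampersand escaped as &amp;.
good = "<interest>content.aspx?productId=117&amp;isArticleLink=false</interest>"

print(is_well_formed(bad))   # False
print(is_well_formed(good))  # True
```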

    https://www.citizensmemorial.com/Temp/providerWebDataLinks.xml Which doesn’t have the characters escaped in it.

    That link does appear to have the characters escaped in it. It may just be the browser you are using to view it. For example, if you're using Firefox, viewing the XML will show unescaped characters but if you actually view the source code of the frame containing the XML, you'll see escaped characters. For example:

    <PROVIDER_AREAS_OF_INTEREST_USERDEF_M12>&lt;a href="http://citizensmemorial.adam.com/content.aspx?productId=117&amp;isArticleLink=false&amp;pid=1&amp;gid=002323"&gt;Evaluation &amp; treatment of peripheral arterial disease (PAD)&lt;/a&gt;</PROVIDER_AREAS_OF_INTEREST_USERDEF_M12>
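To see the difference between the raw source and what a browser displays, the element above can be run through an XML parser. This sketch uses Python's standard library purely for illustration; it parses the escaped source line and prints the decoded text value, which is roughly what a browser's XML view shows.

```python
import xml.etree.ElementTree as ET

# The element as it appears in the page source (characters escaped).
source = (
    "<PROVIDER_AREAS_OF_INTEREST_USERDEF_M12>"
    '&lt;a href="http://citizensmemorial.adam.com/content.aspx'
    '?productId=117&amp;isArticleLink=false&amp;pid=1&amp;gid=002323"&gt;'
    "Evaluation &amp; treatment of peripheral arterial disease (PAD)&lt;/a&gt;"
    "</PROVIDER_AREAS_OF_INTEREST_USERDEF_M12>"
)

# Parsing decodes the entities, so the text value holds plain <, > and &.
value = ET.fromstring(source).text
print(value)
```

The file itself stores the escaped form; the unescaped characters only exist in the parsed value.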
    
  6. 6 Posted by lauren.fraser on 02 Apr, 2015 10:11 AM

    This is the velocity I tried using for the area of interest:

    $_SerializerTool.serialize($docInterest1, true).replaceAll("&gt;", ">").replaceAll("&lt;", "<").replaceAll("&amp;", "&") <br />

     

    I just saw that the & in the link is being outputted as &amp;amp; inside of Cascade.

    I changed my velocity to the following and it worked:

    $_SerializerTool.serialize($docInterest1, true).replaceAll("&gt;", ">").replaceAll("&lt;", "<").replaceAll("&amp;amp;", "&") <br />

     

    So my only question is: will this always output the same way? It seems weird that it is escaping the & from the original escaped &amp; but not the & in &gt; or &lt;.

     

    Thanks,

    Lauren

  7. 7 Posted by Ryan Griffith on 12 May, 2015 06:41 PM

    Hi Lauren,

    Our apologies for not responding sooner to your discussion.

    So my only question – is will this always output the same way? It seems weird that it is escaping the & from the original escaped &amp;

    It sounds to me as though serializing the content may be causing the &amp; to be double-encoded, which would explain why your adjustment is working. Based on what you have, you should be OK since the code will only replace double-encoded ampersands. For example, if your content ends up containing &, the serializer would turn that into &amp;, which won't be replaced and shouldn't throw an error since it's valid.
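The behaviour described here can be checked with a small sketch. It mirrors the replace chain from the Velocity snippet in plain Python rather than calling Cascade's `$_SerializerTool`; the function name `restore_markup` and both sample strings are made up for illustration.

```python
def restore_markup(serialized: str) -> str:
    """Apply the same three replacements as the Velocity snippet.

    The patterns contain no regex metacharacters, so plain string
    replacement behaves the same as a regex-based replaceAll here.
    """
    return (
        serialized
        .replace("&gt;", ">")
        .replace("&lt;", "<")
        .replace("&amp;amp;", "&")
    )


# Hypothetical serializer output with double-encoded ampersands.
double_encoded = '&lt;a href="page?a=1&amp;amp;b=2"&gt;A &amp;amp; B&lt;/a&gt;'

# Hypothetical content with a single-encoded ampersand.
single_encoded = "Fish &amp; chips"

print(restore_markup(double_encoded))
print(restore_markup(single_encoded))
```

In this sketch only the double-encoded ampersands are collapsed; a single `&amp;` passes through unchanged and stays valid.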

    Please let me know if you have any questions.

    Thanks!

  8. 8 Posted by lauren.fraser on 18 May, 2015 03:01 PM

    Ryan,

    Thanks for following up. It looks like the code we have in place now is working. At this point, I can’t remember if we tested the EscapeTool or not. But it’s working reliably, so we’ll go ahead and stick with it.

     

    Lauren

  9. 9 Posted by Ryan Griffith on 18 May, 2015 03:53 PM

    Thank you for following up, Lauren. We are glad to hear your updates are doing the trick.

    I'm going to go ahead and close this discussion; please feel free to comment or reply to re-open it if you have any additional questions.

    Have a great day!

  10. Ryan Griffith closed this discussion on 18 May, 2015 03:53 PM.
