Not sure how to read part of my XML feed.

webadmin

11 May, 2015 03:06 PM

I'm getting an XML feed from a WordPress site we own, and I'm having a bit of trouble. We use each post's "featured image", and I'm trying to get its link from the feed, but I'm not sure how to request it because the markup looks a bit odd. Here's the relevant part of the feed:

<media:thumbnail url="https://nobtsgatekeeper.files.wordpress.com/2015/05/sendnola.jpg?w=150"/>
<media:content medium="image" url="https://nobtsgatekeeper.files.wordpress.com/2015/05/sendnola.jpg?w=150">
    <media:title type="html">sendnola</media:title>
</media:content>
<media:content medium="image" url="https://0.gravatar.com/avatar/6772422f873982faaf6b86c97e75d1e5?s=96&amp;d=identicon&amp;r=G">
    <media:title type="html">nobtslive</media:title>
</media:content>

As an aside, I'm having a bit of trouble getting a Velocity script to remove an ampersand. I use this:

#set ($thePageTitle = $thePageTitle.replace("&", "&amp;"))

But the system still balks at the existence of the ampersand. What's the trick to make it remove the ampersand without freaking out?

  1. 1 Posted by webadmin on 11 May, 2015 03:07 PM

    Sorry, forgot to mention, I'd like to be able to read all three of those and be able to tell the difference between them. Thanks.

  2. 2 Posted by Ryan Griffith on 11 May, 2015 06:40 PM

    Hi,

    It sounds like you are having trouble with the media namespace used by the feed. There are a couple of ways you can go about accessing those elements:

    • Loop over all of the children, check the name and then output the attribute value
    • Use XPath to get the element by its local name (i.e., without the namespace prefix) and then output the attribute value. For example:
    #set ($images = $_XPathTool.selectNodes($contentRoot, "//*[local-name() = 'content' and @medium = 'image']"))
    

    > As an aside, I'm having a bit of trouble getting velocity script to remove an ampersand. I use this:

    I believe String.replace will also re-encode ampersands that are already part of &amp;, since it replaces every literal occurrence. Instead try something like the following:

    #set ($thePageTitle = $thePageTitle.replaceAll("&(?!amp;)", '&amp;'))
    

    The regex should also ensure you are not double encoding ampersands.
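
    For reference, Velocity passes these calls straight through to java.lang.String, so the difference can be checked in plain Java. The following is only a minimal sketch; the class and method names are made up for illustration. It shows that replace substitutes every literal occurrence (including the & inside an existing &amp;), while replaceAll with the negative lookahead skips ampersands that already begin &amp;.

```java
// Sketch only: mirrors the two String calls used in the Velocity snippets.
final class AmpersandEscaping {
    // Literal replacement: rewrites every '&', even one already inside "&amp;".
    static String naive(String s) {
        return s.replace("&", "&amp;");
    }

    // Regex replacement: (?!amp;) skips any '&' that is already followed by "amp;".
    static String guarded(String s) {
        return s.replaceAll("&(?!amp;)", "&amp;");
    }

    public static void main(String[] args) {
        String title = "Arts & Sciences &amp; More";
        System.out.println(naive(title));   // Arts &amp; Sciences &amp;amp; More
        System.out.println(guarded(title)); // Arts &amp; Sciences &amp; More
    }
}
```

    Note that the lookahead only protects &amp; itself. Other references such as &lt; or &#38; would still have their leading ampersand rewritten.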

    Please let me know if you have any questions.

    Thanks!

  3. 3 Posted by webadmin on 11 May, 2015 08:13 PM

    That's still not working on the ampersand replacement. I get this error when I have an ampersand in the display name and title of a page:

    An error occurred: Could not transform with Script format "_site/transformations/Breadcrumbs": Error on line 13: The entity name must immediately follow the '&' in the entity reference.

    Attached is my transformation. I tried stripping the ampersands out every time I read the name, but it's still not working.

  4. 4 Posted by webadmin on 11 May, 2015 08:52 PM

    I tried what you gave me to read the media stuff, and I'm still having trouble. I changed it slightly, because I'm looping over an item that has media tags I listed earlier. I didn't want the // on it, because that makes it give me every media tag in the page. I've changed it to this:

    #set ($images = $_XPathTool.selectNodes($item, "*[local-name() = 'content' and @medium = 'image']"))
    

    This does appear to give me all the media tags of type content in the item. What I don't quite get is why I can't type this and have it work:

    #set ($images = $_XPathTool.selectNodes($item, "/media[local-name() = 'content' and @medium = 'image']"))
    

    or

    #set ($images = $_XPathTool.selectNodes($item, "media[local-name() = 'content' and @medium = 'image']"))
    

    From the documentation I've seen, that ought to give me any tag named "media". Instead, it gives me nothing. Also, I can't seem to differentiate between the two. When I view the value of each node in the $images node, I see the text between the tags inside it ("nobtslive" and "sendnola", in this example). However, I can't do any tests on it. When I try to see if the value is equal to "nobtslive", it comes back false no matter what.

    Also, how do I read the URL out of the tag? I can't figure out the code to grab it.

  5. 5 Posted by Ryan Griffith on 12 May, 2015 03:21 PM

    Hi,

    > That's still not working on the ampersand replacement. I get this error when I have an ampersand in the display name and title of a page:

    Curious, but have you tried using the Escape Tool instead to escape entities? Perhaps try adjusting the end of your GetPageName macro to the following:

    #set ($thePageTitle = $_EscapeTool.xml($thePageTitle.value))
    

    Note: we're setting the variable to the .value now instead of the Element itself, so you will want to adjust references later in your script to no longer use $thePageTitle.value.
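
    To make it clearer what escaping needs to accomplish here, below is a rough Java sketch of XML entity escaping. It is not the Escape Tool's implementation; escapeXml is a hypothetical helper written only for illustration. It replaces the five characters that have predefined XML entities.

```java
// Sketch only: a hand-rolled stand-in for XML escaping, not the Escape Tool itself.
final class XmlEscapeSketch {
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeXml("Faith & Work <2015>")); // Faith &amp; Work &lt;2015&gt;
    }
}
```

    An escaper like this has to be applied exactly once to the raw string, because running it over text that is already escaped turns &amp; into &amp;amp;.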

    > This does appear to give me all the media tags of type content in the item. What I don't quite get is why I can't type this and have it work:

    Your XPath queries won't work because media is a namespace prefix, not a node name. A step like media[...] looks for an element literally named media, and the item has none.

    Ideally, you would use media:content for the node name; however, this will not work because there is currently no way to register custom namespaces for the XPath Tool to use (similar to what you can do with XSLT). Instead, you would receive an error stating you are using an unregistered namespace.

    > Also, I can't seem to differentiate between the two. When I view the value of each node in the $images node, I see the text between the tags inside it ("nobtslive" and "sendnola", in this example). However, I can't do any tests on it. When I try to see if the value is equal to "nobtslive", it comes back false no matter what.

    Your best bet would be to check the value of the url attribute as follows:

    #set ($images = $_XPathTool.selectNodes($contentRoot, "//*[local-name() = 'content' and @medium = 'image']"))
    
    #foreach($image in $images)
        #set ($url = $image.getAttribute("url").value)
        ## ... add logic to check $url here ...
    #end
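
    The same XPath expressions can be tried outside Cascade. Here is a minimal Java sketch using the standard JDK DOM and XPath APIs, which differ from the element objects the Velocity context exposes (in this API getAttribute returns the string directly). The item wrapper, the xmlns:media declaration, and the class and method names are assumptions added so the sample parses on its own. The isGravatar check is just one hypothetical way to tell the author avatar from the featured image.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MediaUrlSketch {
    // Sample item modeled on the feed excerpt; the <item> wrapper and the
    // xmlns:media declaration are assumed because the excerpt omits them.
    static final String ITEM =
        "<item xmlns:media=\"http://search.yahoo.com/mrss/\">"
      + "<media:thumbnail url=\"https://nobtsgatekeeper.files.wordpress.com/2015/05/sendnola.jpg?w=150\"/>"
      + "<media:content medium=\"image\" url=\"https://nobtsgatekeeper.files.wordpress.com/2015/05/sendnola.jpg?w=150\">"
      + "<media:title type=\"html\">sendnola</media:title></media:content>"
      + "<media:content medium=\"image\" url=\"https://0.gravatar.com/avatar/6772422f873982faaf6b86c97e75d1e5?s=96&amp;d=identicon&amp;r=G\">"
      + "<media:title type=\"html\">nobtslive</media:title></media:content>"
      + "</item>";

    // Evaluates expr relative to the root element and collects each match's url attribute.
    static List<String> urls(String xml, String expr) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Element root = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expr, root, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(((Element) nodes.item(i)).getAttribute("url"));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical rule for telling the avatar apart from the featured image.
    static boolean isGravatar(String url) {
        return url.contains("gravatar.com");
    }

    public static void main(String[] args) {
        for (String url : urls(ITEM, "*[local-name() = 'content' and @medium = 'image']")) {
            System.out.println((isGravatar(url) ? "avatar:   " : "featured: ") + url);
        }
        // A step named "media" matches no element, so this list is empty.
        System.out.println(urls(ITEM, "media[local-name() = 'content']").size());
    }
}
```

    Swapping 'content' for 'thumbnail' in the predicate selects the media:thumbnail element the same way, which covers the third tag in the feed.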
    

    Please let me know if you have any questions.

    Thanks!

  6. 6 Posted by webadmin on 12 May, 2015 09:43 PM

    I'm still getting the same problem with the ampersands, but the URL request worked.

  7. 7 Posted by Ryan Griffith on 13 May, 2015 12:54 PM

    Thank you for following up. I am glad to hear you were able to access the URL attribute.

    > I'm still getting the same problem with the ampersands

    Did you have a chance to try out the Escape Tool? If so, and that still doesn't work, it sounds like that &amp; in the URL might be double-encoded. Perhaps try the following to see if it helps:

    #set ($thePageTitle = $thePageTitle.replaceAll("&amp;amp;", "&amp;"))
    

    Please let me know if you have any questions.

    Thanks!

  8. 8 Posted by webadmin on 13 May, 2015 01:38 PM

    Still the same problem. This is the error I get:

    An error occurred: Could not transform with Script format "_site/transformations/Breadcrumbs": Error on line 13: The entity name must immediately follow the '&' in the entity reference.

    The problem is, there isn't anything on line 13 that should be throwing this, as far as I can see. I'm attaching the script I'm using. Line 13 is a blank line in my code.

  9. 9 Posted by Ryan Griffith on 13 May, 2015 03:30 PM

    Thank you for testing.

    Curious, but I noticed the following line appears to be incorrect: it starts with ## (which comments it out), and it references $pageTitle rather than $thePageTitle:

    ##set ($pageTitle = $pageTitle.replaceAll("&(?!amp;)", '&amp;'))
    

    Perhaps this regex might still work. Let's try correcting this to see if you still get the error:

    #macro (GetPageName $page)
        ## Reset the page title variables
        #set ($pageTitle = "")
        #set ($pageDisplayName = "")
        #set ($thePageTitle = "")
        
        ## Grab page "title"
        #set ($pageTitle = $page.getChild("title"))
        
        ## Grab page "display-name"
        #set ($pageDisplayName = $page.getChild("display-name"))
        
        #if ($pageDisplayName && $pageDisplayName.value != "")
            #set ($thePageTitle = $pageDisplayName)
        #elseif ($pageTitle && $pageTitle.value != "")
            #set ($thePageTitle = $pageTitle)
        #else
            #set ($thePageTitle = $page.getChild("name"))
        #end
        
        ## $thePageTitle holds an Element at this point, so escape its String value
        #set ($thePageTitle = $thePageTitle.value.replaceAll("&(?!amp;)", '&amp;'))
    #end
    

    > The problem is, there isn't anything on line 13 that should be throwing this, as far as I can see. I'm attaching the script I'm using. Line 13 is a blank line in my code.

    This error is referring to the transformed content and not the Format itself. In this case, the line numbers won't line up to what you see in the code editor.
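
    This kind of error comes from the XML parser reading the output, which can be reproduced with any parser. Below is a minimal Java sketch, assuming the JDK's built-in parser. The class and method names are made up for illustration. It parses one fragment with a bare ampersand and one with an escaped ampersand.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class WellFormedCheck {
    // Returns null when the XML parses, otherwise the parser's error message.
    static String parseError(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Bare ampersand in the generated markup: the parser rejects it.
        System.out.println(parseError("<li>Faith & Work</li>"));
        // Escaped ampersand: parses cleanly, so this prints null.
        System.out.println(parseError("<li>Faith &amp; Work</li>"));
    }
}
```

    The line and column the parser reports are positions in the string it was given (the generated markup), which is why they don't match line numbers in the script that produced it.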

    Please let me know if you have any questions.

    Thanks!

  10. 10 Posted by webadmin on 13 May, 2015 04:18 PM

    Same issue. As for that line of code with the error, I had it commented out for a while now. I assume that wouldn't affect anything once it's commented, right?

    Is there any way to force Cascade to process the page anyway? I feel like I could fix this if I could figure out where it's seeing an ampersand. The display name is the one with the ampersand in it. From the code, it looks like no matter where the ampersand comes from, it should be cleared out by the last line of the code in the GetPageName macro. It shouldn't be throwing an error just by having an ampersand in a variable, should it?

  11. 11 Posted by Ryan Griffith on 13 May, 2015 04:50 PM

    Bummer, thank you for trying anyway.

    > I assume that wouldn't affect anything once it's commented, right?

    Correct, it will not be executed as long as you have the line commented out.

    > Is there any way to force Cascade to process the page anyway? I feel like I could fix this if I could figure out where it's seeing an ampersand.

    Perhaps try surrounding the entire content with a CDATA section (<![CDATA[ ... ]]>) to see if you can at least render the invalid XML and see the output.
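
    As a rough illustration of why this helps with debugging, here is a small Java sketch (JDK parser assumed, names hypothetical) showing that text inside a CDATA section is not parsed as markup, so a bare ampersand no longer stops the parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CdataSketch {
    // Returns the root element's text content, or null if the XML is not well-formed.
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Inside CDATA the ampersand and the <b> tags are kept as literal text.
        System.out.println(rootText("<div><![CDATA[Faith & Work <b>bold</b>]]></div>"));
        // Without CDATA the same bare ampersand is rejected, so this prints null.
        System.out.println(rootText("<div>Faith & Work</div>"));
    }
}
```

    Because everything inside the section becomes literal text, this is mainly useful for inspecting the raw output and finding the offending ampersand, not as a permanent fix.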

    Please let me know if you have any questions.

    Thanks!

  12. Support Staff 12 Posted by Tim on 17 Jun, 2015 01:53 PM

    Hi,

    Just wanted to check in to see if you saw Ryan's last comment. Let us know if you need more help.

    Thanks

  13. Tim closed this discussion on 30 Jun, 2015 06:43 PM.
