XML feed size

mdcarter

15 Sep, 2011 09:29 PM

There is currently a project on our campus to generate an XML file of our class schedule. We are hoping to consume the XML file as a feed in Cascade and let departments in the WCMS output schedule information easily. The current plan is to create a single file that departments can then parse. The file currently looks to be about 60MB. I have some early concerns about pulling a file that size as a feed, as well as about parsing time via XSLT or Velocity. So I have a few questions for both support and the community.

Is there a maximum or recommended limit on feed size?
Is anyone else processing similarly sized files?
Do you have any optimization suggestions for delivery?

My initial thought is to explore creating smaller XML files based on department or college. We are in the early phases of this project, simply identifying the data in the feed and our XML structure. I am hoping to get some insight from others before we get too far down a path.
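
One way the per-department split could look, as a minimal sketch only: it streams the large file with Python's `xml.etree.ElementTree.iterparse` so the whole 60MB document never has to sit in memory. The `<schedule>` wrapper, the `<course>` record element, and the `department` attribute are assumed names, since the XML structure has not been settled yet. It also assumes department codes are safe to use as file names.

```python
import os
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_by_department(source_path, out_dir, record_tag="course", dept_attr="department"):
    """Stream a large XML file and write one smaller XML file per department.

    record_tag and dept_attr are hypothetical names for the schedule format.
    Returns a dict mapping department code to the number of records written.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    counts = defaultdict(int)
    try:
        context = ET.iterparse(source_path, events=("start", "end"))
        _, root = next(context)  # first event is the start of the root element
        for event, elem in context:
            if event != "end" or elem.tag != record_tag:
                continue
            dept = elem.get(dept_attr, "UNKNOWN")
            if dept not in handles:
                path = os.path.join(out_dir, f"{dept}.xml")
                handles[dept] = open(path, "w", encoding="utf-8")
                handles[dept].write('<?xml version="1.0" encoding="UTF-8"?>\n<schedule>\n')
            handles[dept].write(ET.tostring(elem, encoding="unicode").strip() + "\n")
            counts[dept] += 1
            root.clear()  # drop already-written records so memory use stays flat
    finally:
        # Close every per-department file with its wrapper end tag.
        for fh in handles.values():
            fh.write("</schedule>\n")
            fh.close()
    return dict(counts)
```

Each department would then point its feed at its own, much smaller file. The sketch keeps one open file handle per department, which is fine for a few hundred departments but would need rethinking beyond that.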

Thanks

-Matt

  1. 1 Posted by Lee Roberson (F... on 16 Sep, 2011 10:41 PM

    Lee Roberson (Function Digital LLC)'s Avatar

    Interesting ideas Matt.

    We satisfy a similar need by synchronizing these courses as page assets into Cascade via Web Services. This has some advantages, notably that it allows us to deploy a hierarchical master website that allows people to browse all courses in several terms across several schools and departments in one place. We publish a sort of JSON-file database to the web server that a frontend desktop web site reads at each folder level as necessary when the user is browsing. The courses are synchronized into folders, like 4440/WCAS/ANTHRO/111-5/11123 which is term number, school code, department code, course number, and finally a sort of unique course number (it's really just a section of the course but the asset name is not the section number). In addition to the web site, each of these folder levels and courses has a configuration that publishes a mobile friendly view for browsing that way.
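
To make the folder convention above concrete, here is a small sketch of how such a path could be built and parsed. The `CoursePath` type and its field names are illustrative assumptions, not part of the actual synchronization code.

```python
from typing import NamedTuple


class CoursePath(NamedTuple):
    """Hypothetical model of the term/school/department/course/section layout."""
    term: str        # term number, e.g. "4440"
    school: str      # school code, e.g. "WCAS"
    department: str  # department code, e.g. "ANTHRO"
    course: str      # course number, e.g. "111-5"
    section_id: str  # unique asset name for the section, e.g. "11123"

    def to_path(self) -> str:
        # Join the five components in folder order.
        return "/".join(self)

    @classmethod
    def from_path(cls, path: str) -> "CoursePath":
        parts = path.strip("/").split("/")
        if len(parts) != 5:
            raise ValueError(f"expected 5 path components, got {len(parts)}: {path!r}")
        return cls(*parts)


example = CoursePath.from_path("4440/WCAS/ANTHRO/111-5/11123")
print(example.department, example.to_path())
```

Keeping each level as its own folder is what lets a frontend read only the level the user is currently browsing.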

    We haven't pulled in many huge XML documents via Feed blocks, but I can't imagine it would work super well. One issue with Feed blocks is that if reading the file fails for some reason, such as a busy or overloaded web server or a strange character encoding error in the XML file, your publish job will continue to run with errors. In the first case, it will probably display nothing on the page(s); in the second, it will at least likely generate an error in either the publish report or the Cascade system log and not publish anything. With the approach we chose above, it was really painful to synchronize these assets into Cascade up front (i.e. keeping special characters and things like that out during asset creation), but at least we knew there wouldn't be a problem when rendering pages based on the data. Also, if there IS a problem with just one course, it is basically isolated to publishing pages about that single course (and the publish report does give us notice that there was a bad character or some malformed HTML in one of the strings).
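
If the file is generated outside Cascade, a well-formedness check before it is handed to a Feed block could catch the encoding and markup problems described above. This is only a sketch using Python's standard library; `check_well_formed` is a made-up helper name, and it checks well-formedness only, not any schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(path):
    """Return (True, None) if the file parses, else (False, message).

    The message includes the line and column reported by the parser.
    """
    try:
        for _, elem in ET.iterparse(path):
            elem.clear()  # discard parsed content; only parse errors matter here
    except ET.ParseError as exc:
        line, column = exc.position
        return False, f"line {line}, column {column}: {exc}"
    return True, None
```

Running this as the last step of the export job would let a bad file be rejected before any publish job tries to read it.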

    From a performance perspective, I suppose in the end you could generate a fake file of about that size and run some tests on it.
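
A rough sketch of how such a fake file could be produced. The element and attribute names are placeholders, not the real schedule format. The timing helper only measures Python's own parser as a sanity check; the real test would be pointing a Feed block at the generated file and watching render and publish times in Cascade.

```python
import time
import xml.etree.ElementTree as ET


def write_fake_schedule(path, target_bytes, departments=("ANTHRO", "MATH", "HIST")):
    """Write placeholder <course> records until the file reaches roughly target_bytes.

    Returns the number of records written. All content is ASCII, so the
    character count returned by write() equals the byte count.
    """
    written = 0
    n = 0
    with open(path, "w", encoding="utf-8") as fh:
        written += fh.write('<?xml version="1.0" encoding="UTF-8"?>\n<schedule>\n')
        while written < target_bytes:
            dept = departments[n % len(departments)]
            written += fh.write(
                f'  <course department="{dept}" number="{100 + n % 400}">'
                f'<title>Placeholder course {n}</title>'
                f'<section id="{10000 + n}">MWF 10:00-10:50</section></course>\n'
            )
            n += 1
        fh.write("</schedule>\n")
    return n


def time_full_parse(path):
    """Parse the whole file into memory; return (record count, seconds elapsed)."""
    start = time.perf_counter()
    root = ET.parse(path).getroot()
    elapsed = time.perf_counter() - start
    return len(root), elapsed


if __name__ == "__main__":
    records = write_fake_schedule("fake_schedule.xml", 60 * 1024 * 1024)
    count, seconds = time_full_parse("fake_schedule.xml")
    print(f"{records} records written, {count} parsed in {seconds:.1f}s")
```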

  2. 2 Posted by eporter on 30 Sep, 2011 05:18 PM

    I've been interested in using this technique as well - using a feed block to pull in a non-feed XML file, since it's potentially such an easy approach to get external XML into the system. Based on limited testing, it seems to work well, but I'm not sure if future releases might suddenly require/validate against specific feed formats (RSS/Atom) and therefore reject other XML sources.

  3. 3 Posted by Ryan Griffith on 21 Jun, 2012 05:13 PM

    Hi Matt,

    I was going through some older discussions and saw that this one was still open.

    Were you able to get your class schedule implementation working? If you did, would you be willing to share for others to have a look at what your institution did?

    Thanks!

  4. Ryan Griffith closed this discussion on 02 Jul, 2012 01:05 PM.
