Need to Create Browsable News Archive

eo362362

13 Jul, 2016 06:32 PM

Pardon me if the solution to this is obvious. I am quite new to Cascade, though I have been working in web applications in general for a while.

Like a lot of people, I have a "News Index" page, which displays news articles from several different categories. For simplicity's sake let's say these categories are "Academics" and "Athletics." The main index page displays articles from both categories, and each of the two categories also has its own index page where only those pages tagged appropriately display. This was easy to set up using the query tool.

What I need to have happen now is have articles expire from the main index, while still being browsable from an archive index page. Ideally, I would be able to filter this archive by year and category dynamically. I know there is a built in "End Date" field, but I am not so certain that it sets me up to accomplish what I need to do.

I don't want these pages to actually disappear from the live site, and they need to stay indexable, at least in the sense that I can build another page, the "Archive Index" page, that presents the user with links to browse the stories by year. So they need to expire from the main index, but I still need to be able to pull content from them (headlines, dates, and link values) in order to create a separate index for users to explore.

Is this making sense? If it isn't, how can I clarify it so that there can be a discussion? Does anyone know of any examples where this is already being done?

  1. 1 Posted by Penny on 13 Jul, 2016 08:33 PM

    Hi,

    It sounds like there are a couple of approaches you could take. Once the End Date passes, the article can no longer be published, but it is not automatically unpublished. An article past its End Date is also no longer picked up by Index Blocks, so it would not show in listings built with an Index Block and XPath. The Query Tool does have a property to include items that are not set to be indexed, but I am not sure whether it also applies to items that have hit their End Date:

        #set($query = $query.indexableOnly(false))

    If you are just using the Query Tool, I would recommend trying to add this clause for your archive and then the opposite for the active page.

    Another idea that I thought of was having a check box in the custom metadata to say Expire in so many days, Yes or No. This would rely on you having a specific number of days in mind.

    You could also add a check box to just say, Expire, and then use that to determine what should go on the Archive versus Active Listing.

    My last thought is to add a Date field to the page via the Data Definition for Expiration. Then you don't have to worry about the built in functionality of the End Date but it is harder to filter using the Query Tool.

    I hope this info helps.

  2. 2 Posted by eo362362 on 20 Jul, 2016 06:30 PM

    I have tried some variations on what you suggested, but so far each approach runs into a different problem. Let me see if I can give you an outline of how the News section is set up (I inherited this structure; my understanding is that it was built out by Hannon Hill). The folders are set up like this:

    +News
        --archive.html
        +yyyy
            --archive.html
            +mm
                --article.html
                --article.html
                --archive.html

    At the moment, I have two "End Date" fields available to me for the articles. The native metadata field, and a custom one that I added to the data definition. Both work to some extent to expire articles from the main index.

    Where I am having difficulty is in creating the interface to browse through the non-indexed, or "archived" articles.

    At the moment, I am trying to do this using two different formats. The first one, attached to the archive.html page's default region at the parent "news" folder and the "yyyy" folder level, just creates a list of links to let the user drill down to a particular month folder:

    ## Start from the folder that contains the current archive page
    #set($parentFolder = $_.locatePage($currentPagePath).parentFolder)
    #set($folders = $parentFolder.children)
    #set($sortedFolders = $_SortTool.sort($folders, "name"))

    ## Walk the sorted list backwards so the newest folder comes first
    #foreach($folderItem in [1..$sortedFolders.size()])
        #set($arrayIndex = $sortedFolders.size() - $foreach.count)
        ## Skip assets without children, such as pages
        #if($sortedFolders[$arrayIndex].children)
            #set($folder = $sortedFolders[$arrayIndex])
            #if($folder.shouldBeIndexed == true)
                #set($folderLink = $_EscapeTool.xml($folder.link))
                | <a href="${folderLink}/archive" title="$folder.metadata.title Archive">$folder.metadata.title</a>
            #end
        #end
    #end

    That's fine. It's not ideal, but it's fine. Inside the News folder this shows a list of links to 2016 | 2015 | 2014, etc . . . and when a user clicks on those they get to another list of January | February | March | etc . . . which will then lead them to the archive.html page for that month.

    That is where the other format comes in, the one which I would like to retrieve the articles in that specific folder that have expired. With the query tool, if I'm not mistaken, I can't filter my result set enough for it to be useful. I need to show only those items in the current folder. I can do that by filtering after the fact, like this:

    #set ( $query = $_.query() )
    #set ( $news = $query.byContentType("News Article").sortBy("startDate").sortDirection("desc").indexableOnly(false).maxResults(500).execute() )

    #foreach ( $n in $news )
        #if ($currentPage.parentFolder.name == $n.parentFolder.name)
            ... write files
        #end
    #end

    But the problem there is the filtering happens too late. I can pare down what the user sees, but the query is going to pull back all of the news articles first and will rapidly need to surpass the built in limit of 500 results as more and more news articles are added and expired.
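
    The difference can be sketched in Python rather than Velocity, since this only illustrates the logic and does not use the Cascade API. The `Article` class, the sample paths, and both helper functions are hypothetical. The sketch contrasts a capped site-wide query that is filtered afterwards with a lookup that only ever touches the current folder's children. It compares full folder paths, on the assumption that month folders share names across years (comparing names alone would match `07` in every year).

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    folder_path: str  # hypothetical path of the containing folder, e.g. "news/2016/07"
    expired: bool


def query_then_filter(all_articles, folder_path, max_results=500):
    """Site-wide query capped at max_results, filtered by folder afterwards."""
    capped = all_articles[:max_results]
    return [a for a in capped if a.folder_path == folder_path and a.expired]


def list_folder(articles_by_folder, folder_path):
    """Folder-scoped lookup: only the current folder's children are examined."""
    return [a for a in articles_by_folder.get(folder_path, []) if a.expired]


# 600 newer articles crowd the older one out of the capped result set.
newer = [Article(f"new {i}", "news/2016/07", False) for i in range(600)]
old = Article("old story", "news/2014/03", True)
all_articles = newer + [old]

by_folder = {}
for article in all_articles:
    by_folder.setdefault(article.folder_path, []).append(article)

print(len(query_then_filter(all_articles, "news/2014/03")))  # 0
print([a.title for a in list_folder(by_folder, "news/2014/03")])
```

    In this toy data the capped query never sees the old story at all, while the folder-scoped lookup finds it no matter how many articles exist elsewhere.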

    I suppose I can try to do this with an index block. I would have to try to use my custom date field in that case to expire the articles, so that they would still be returned to the index. I'm a little fuzzy on how to configure that index block, or how it would work, as I am still getting my bearings and have only used the query tool up until this point.

    Do you mind looking at the structure of the News site I've outlined above and letting me know if this is a logical approach?

    I can't imagine I am the only one who has had to build such an archive. Are there any pointers out there to where maybe someone else has done so?

    Please let me know if I can add more detail to make any of this clearer.

  3. 3 Posted by eo362362 on 28 Jul, 2016 05:25 PM

    Ok . . . so I am not getting anywhere with this. Perhaps I will restart the thread more generically.

    The structure I have is:

    -- News
       -- YYYY
          -- MM
             -- articles.html
             -- articles.html
          -- MM
       -- YYYY
          -- MM
             -- articles.html
    etc . . .

    Expiring the articles from the main index page is easy enough. It is creating the interface to browse the archives that is getting me. What I tried above was more trouble than it was worth. Why do I feel like this has to be easier?

  4. 4 Posted by Wing Ming Chan on 28 Jul, 2016 06:48 PM

    Hi,

    This is how I understand the problem. Correct me if I am wrong.

    What we need is to be able to filter articles by supplying a range of two long numbers: the start date and the end date. For the main index page, the start date can be the first day of a certain month, say, three months ago (today is July 28, so let's start from April 1st), or consistently 90 days ago, and the end day will be today. Any articles marked with a future day or older than the cut-off date won't come into the picture.

    For an archive index page, the start day will be the first day of a certain year, and the end day will be the last day of the same year. The year is given by the folder name.

    If we have a macro used to process an index block, then we need to index every page, sort them, and when the macro is invoked, we need to supply both timestamps.

    If we are to use the Locator Tool, then grabbing everything within a folder is easy. We don't need to worry about the start date and end date. Just every article inside the folder. For the main index page, we still need to look at the two or three newest folders of the current year, collect everything, sort them, and apply the range to cut off the old articles.

    We don't need to deal with end dates of articles in the metadata set or in a structured data node. We just need to calculate the cut-off date and ignore everything that is older than that date.
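
    The two ranges described above can be sketched as follows. This is Python rather than Velocity and only illustrates the arithmetic; the function names are hypothetical, and using midnight UTC for the long timestamps is an assumption.

```python
from datetime import date, datetime, timezone


def main_index_window(today, months_back=3):
    """From the first day of the month `months_back` months ago through today."""
    month_index = today.year * 12 + (today.month - 1) - months_back
    start = date(month_index // 12, month_index % 12 + 1, 1)
    return start, today


def archive_window(year):
    """First and last day of the given year."""
    return date(year, 1, 1), date(year, 12, 31)


def to_millis(day):
    """Midnight UTC of `day` as epoch milliseconds (the time zone is an assumption)."""
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return int(midnight.timestamp()) * 1000


def in_window(day, window):
    """True when `day` falls inside the inclusive (start, end) window."""
    start, end = window
    return to_millis(start) <= to_millis(day) <= to_millis(end)


today = date(2016, 7, 28)
print(main_index_window(today))  # (datetime.date(2016, 4, 1), datetime.date(2016, 7, 28))
print(in_window(date(2016, 3, 31), main_index_window(today)))  # False
print(in_window(date(2015, 12, 25), archive_window(2015)))     # True
```

    Working in whole months keeps the January case simple: three months before January 2016 lands on October 1st, 2015 without special handling.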

    Am I on the right track?

    Wing

  5. 5 Posted by eo362362 on 29 Jul, 2016 02:38 PM

    Hi Wing,

    Thanks so much for taking a look at my question. I have been trying to read everything I can get my hands on that you've written as I try to learn Cascade. It's a lot to take in!

    And yes, I think you are pretty accurately describing what I need to do. For the main index, I do actually need to deal with the end date in a structured data node, because there has to be the ability to arbitrarily expire articles. I think I was able to do that pretty simply by adding another field to the DD and using this code:

        #set ( $endDate = $n.getStructuredDataNode("content/endDate").textValue )
        #set ( $numDate = $_NumberTool.toNumber($endDate) )
        #set ( $today = $_NumberTool.toNumber($_DateTool.date) )

    Then

        #if ( $endDate == '' || $numDate >= $today )
            list of articles
        #end

    To trim the main index.

    That seemed to work fine, although I imagine there may be a better way.
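
    The same test can be written out as a small Python sketch (not Velocity, and not the Cascade API). It assumes the end date field's text value is a millisecond timestamp, which the code above appears to rely on when it calls `toNumber` on it. The function names and sample data are hypothetical.

```python
def is_active(end_date_text, now_millis):
    """An article stays on the main index if it has no end date or has not reached it."""
    text = (end_date_text or "").strip()
    if not text:
        return True  # no end date set: the article never expires
    try:
        return int(text) >= now_millis
    except ValueError:
        return True  # unparsable value: treated as not expired (a design choice)


def split_articles(articles, now_millis):
    """Partition (title, end_date_text) pairs into active and archived titles."""
    active, archived = [], []
    for title, end_date_text in articles:
        (active if is_active(end_date_text, now_millis) else archived).append(title)
    return active, archived


now = 1_469_750_400_000  # an arbitrary "now" in epoch milliseconds
articles = [
    ("No end date", ""),
    ("Still running", "1500000000000"),
    ("Expired", "1400000000000"),
]
active, archived = split_articles(articles, now)
print(active)    # ['No end date', 'Still running']
print(archived)  # ['Expired']
```

    The same predicate, negated, is what an archive listing would use, so both pages can share one definition of "expired".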

    Creating the interface to browse the archive is what is proving challenging.

    Ideally, when a user visited the "archive" page, there would be a hyperlinked list of all of the news folders by Year, with the articles for the most recent year displayed. Then, as the user chose to browse the navigation, they could click on any of the year links and the list would refresh with the articles for the corresponding year.

    The navigation controls should be persistent as they move from page to page, allowing them to jump to any given year at any point. I am not sure if this can be done with just one archive page or if I need several.

    If there is something I can provide that would help make this more clear, please let me know. I've attached the news DD, although it is pretty standard I think.

  6. 6 Posted by Wing Ming Chan on 29 Jul, 2016 03:07 PM

    Because I have no idea on how you want the users to navigate, I can only design something that I would create.

    1. On each archive.html page within a year folder, I want to list all articles of that year, possibly with teasers/summaries and thumbs. I probably will name such a page index.html instead.

      2015

      • Dec 31, President Smith visited the campus
      • Dec 25, Fund raising party
      • etc.
    2. On the left nav, I want to show archive.html of every year:

      2015 => 2015/index.html
      2014 => 2014/index.html
      2013 => 2013/index.html
      2012 => 2012/index.html
      

    To implement this:

    1. I will use an index block to index all articles
    2. On an archive page of a certain year, invoke a macro by passing in the year:
      #import( 'site://site_for_format/.../processArticlesFormat' )
      #processArticlesByYear( "2015" )
      

    I can also use the parent folder name to recover the year. In the format, collect all articles of that year, sort them in descending order, and display the links. At Upstate, I actually create a page containing the real contents to be included, using PHP, so that the archive page itself does not really have contents. See below for the reason why.
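
    The "recover the year from the folder, collect, sort descending" step could look roughly like this. It is a Python sketch of the logic only, not the Velocity macro itself; the path layout `news/<yyyy>/<mm>/<name>`, the dictionary keys, and the sample articles are assumptions made for illustration.

```python
def year_of(path):
    """Second path segment, assuming paths look like 'news/<yyyy>/<mm>/<name>'."""
    return path.split("/")[1]


def articles_by_year(articles, year):
    """Articles stored under the given year folder, newest first."""
    matching = [a for a in articles if year_of(a["path"]) == year]
    # ISO-formatted dates sort correctly as plain strings.
    return sorted(matching, key=lambda a: a["date"], reverse=True)


articles = [
    {"path": "news/2015/12/fund-raising-party", "date": "2015-12-25",
     "title": "Fund raising party"},
    {"path": "news/2014/05/commencement", "date": "2014-05-17",
     "title": "Commencement"},
    {"path": "news/2015/12/president-visit", "date": "2015-12-31",
     "title": "President Smith visited the campus"},
]

for article in articles_by_year(articles, "2015"):
    print(article["date"], article["title"])
```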

    1. For the main archive page, the one being a sibling of the year folders, I will use an accordion, showing just the years; and when a year is expanded, show all articles of that year. The contents can come from the same page I create above.

    2. Separating the archive pages from real contents, I do not need to republish the archive pages. Instead, I republish the pages containing real contents. All index pages will be updated accordingly without republishing. The published contents can be reused anywhere appropriate.

    3. For the left nav, it is just a simple matter of using symlinks or references to create the menu.

    4. But once I navigate into a year folder, I don't know what you want to show in the left nav.

    To add a bit more meat to this design, I will also use scheduled publishing and web services to automate the entire process.

    Did I answer your questions?

    Wing

  7. 7 Posted by eo362362 on 29 Jul, 2016 05:59 PM

    Thanks Wing. I am going to take your suggestions and see what I can do. You have been more than generous with your time and insight, and I have not been clear enough to be helpful, I realize.

    I do see what you are suggesting. I will see how far I can get. My hope for the navigation is for it to be dynamic, so that as new yearly folders are added, links to those archives and their corresponding month folders would be generated automatically along the top, something like:

    2017 | 2016 | 2015 etc . . .

    2017 Archive

    jan|feb|mar|apr|may …

    - Dec 30th - President Smith visits
    - Nov 28th - New Athletic facility complete
    - etc

    For each given year, with the years and the months being links.
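
    A rough sketch of how that navigation could be derived from the folder structure alone, so that new year and month folders show up without any manual edits. It is written in Python to show the logic, not as a Cascade format; the folder paths and function names are hypothetical.

```python
def build_nav(folder_paths):
    """Map each year to its month folders, from paths like 'news/2016/07'."""
    nav = {}
    for path in folder_paths:
        parts = path.strip("/").split("/")
        # Keep only news/<yyyy>/<mm> folders; anything else is ignored.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            nav.setdefault(parts[1], set()).add(parts[2])
    # Years newest first, months in calendar order.
    return {year: sorted(months) for year, months in sorted(nav.items(), reverse=True)}


def render_years(nav):
    """Join the years into the top-level navigation string."""
    return " | ".join(nav)


folders = ["news/2015/12", "news/2016/01", "news/2016/07", "news/2017/01", "news/images"]
nav = build_nav(folders)
print(render_years(nav))  # 2017 | 2016 | 2015
print(nav["2016"])        # ['01', '07']
```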

    I will report back when I get further along. A colleague suggested trying to write a format that would output the article content to JSON and then reading that back in. I suspect he is trying to accomplish something similar to what you have done with PHP.

    I do have access to ColdFusion here, and I am familiar with it, but I have not done much using Cascade and ColdFusion together. Is there a particular project of yours that would be worth studying to get an idea of how you are pulling these things together?

    If it is necessary to close this thread, I understand. I did not expect a complete solution. I was just hoping for some pointers to help get me unstuck. You have certainly given me plenty of ideas to work with. Thank you.

  8. 8 Posted by Wing Ming Chan on 29 Jul, 2016 06:12 PM

    We do generate the left navigation menus dynamically. But it requires that every folder have an index page. To avoid displaying all articles in a folder, I need to do something special so that they are indexed but skipped when the left nav is generated. If everything is set up systematically, then the left menus should not pose any problem.

    I have never generated JSON, so I have nothing to say here. But we do use PHP include extensively.

    We do have a prototype under development: http://www.upstate.edu/bioinbrief/index.php. I did not build this site. Many ideas were mine, but Peter (one of our team members) put in a lot more ideas and built it his own way. I taught him Velocity initially, and he added a lot of fancy things to the site since then.

    One more thing to add to the discussion. The single template we use for responsive sites has only one region (the DEFAULT). Therefore, everything you see must be accounted for without using regions at all. Peter is still trying very hard to deal with this restriction.

    Wing

  9. eo362362 closed this discussion on 26 Sep, 2016 06:59 PM.
