<p>Cascade CMS: Discussion 2016-02-26T21:56:01Z</p>
<p>Options for indexing a lot of data in one folder</p>
<p>2016-02-05T21:10:49Z</p>
<div><p>Hi Geoff,</p>
<blockquote>
<p>For context, we need reports of various data combinations from
all content blocks in that folder.</p>
</blockquote>
<p>Do these reports need to be visible within the CMS (by viewing a
Page containing the results, for example) or would the information
work for you outside of the application? The reason I ask is
because Web Services could potentially be an option, although
performing Read operations on that many assets isn't going to be
very easy on the system in terms of resource usage either.</p>
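<p>To illustrate the resource concern, here is a minimal sketch of reading many assets in small batches with a pause between them. The <code>read_asset</code> callable is a hypothetical stand-in for whatever Web Services read call is used; it is not a Cascade API, and the batch size and pause are arbitrary assumptions.</p>

```python
import time
from typing import Callable, Iterable, Iterator, List


def read_in_batches(
    asset_ids: Iterable[str],
    read_asset: Callable[[str], dict],
    batch_size: int = 100,
    pause_seconds: float = 1.0,
) -> Iterator[List[dict]]:
    """Yield lists of read results, pausing after each full batch."""
    batch: List[dict] = []
    for asset_id in asset_ids:
        # read_asset is a hypothetical placeholder for a real read operation.
        batch.append(read_asset(asset_id))
        if len(batch) == batch_size:
            yield batch
            batch = []
            # Give the server some room before starting the next batch.
            time.sleep(pause_seconds)
    if batch:
        yield batch


if __name__ == "__main__":
    def fake_read(asset_id: str) -> dict:
        # Stand-in for a network call; returns a tiny record.
        return {"id": asset_id}

    ids = [str(n) for n in range(250)]
    for group in read_in_batches(ids, fake_read, batch_size=100, pause_seconds=0):
        print(len(group))
```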
<blockquote>
<p>-- Increase RAM: Our CMS currently has 8GB and the Max. Rendered
Size is set to 40.</p>
</blockquote>
<p>That setting is already configured to the highest value I've
ever seen. Prior to that, the highest I had seen was around 20MB.
As you are probably already aware, raising that even more can put a
real strain on system resources and cause performance problems
system-wide for all users. So, I probably wouldn't recommend
increasing this any more at this point.</p>
<blockquote>
<p>-- Index using multiple index blocks: Is that even possible? For
example, can I create eight index blocks to index 1000 assets each
without overlap?</p>
</blockquote>
<p>This isn't really possible unless you divide the content into
different Folders, then create separate Index Blocks for each of
those Folders. Even so, my guess is that you'd want to aggregate
all of those Index Blocks, which would essentially lead to the same
resource usage as one large Index Block. One minor benefit to this
approach might be that the Index Block cache could work more
efficiently, but I'm not sure how much of a noticeable difference
you might see.</p>
<p>I'll wait to hear from you regarding my first question and
perhaps we can offer some recommendations based on that.</p>
<p>Thanks</p></div>
<p>Tim</p>
<p>2016-02-05T21:27:25Z</p>
<div><p>Thanks for your feedback, Tim.</p>
<p>To answer your first question, no, these particular reports
don't need to be visible within the CMS. As for Web Services, we've
never written any and have barely cracked the book on how to create
them.</p>
<p>As for RAM, are you saying that even if we doubled our RAM to
16GB, you still wouldn't recommend increasing the Max. Rendered
Size? (Somewhere on your site there's mention of a general rule of
5MB per 1GB of RAM, so I presumed that would scale up as needed...
although I realize somewhere we'd hit the point of diminishing
returns.)</p>
<p>As for aggregating multiple index blocks into one, in this
particular case, we don't need to; having eight separate reports
would work just as well.</p>
<p>Thanks again.<br>
Geoff</p></div>
<p>geoff</p>
<p>2016-02-05T21:49:22Z</p>
<div><p>Hey Geoff,</p>
<blockquote>
<p>Somewhere on your site there's mention of a general rule of 5MB
per 1GB of RAM, so I presumed that would scale up as needed...
although I realize somewhere we'd hit the point of diminishing
returns.</p>
</blockquote>
<p>Yeah, the recommendation you saw is correct, but I feel that
diminishing returns kick in any time we're talking about rendering
a document over ~15MB. The assembly of that
much data takes a lot of time, but also just making it available to
view can be problematic (for example, if you've ever tried to open
a 15MB XML document in your browser, it generally doesn't work too
well).</p>
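<p>As a rough illustration of the arithmetic, the sketch below applies the 5MB-per-1GB rule of thumb to 8GB and 16GB of RAM and compares the result with the ~15MB practical ceiling mentioned above. The function and constant names are made up for this example; the figures come from this thread, not from any official sizing tool.</p>

```python
PRACTICAL_CEILING_MB = 15.0  # the "~15MB" figure from this thread


def rule_of_thumb_max_rendered_mb(ram_gb: float, mb_per_gb: float = 5.0) -> float:
    """Max. Rendered Size suggested by the 5MB-per-1GB rule of thumb."""
    return ram_gb * mb_per_gb


def exceeds_practical_ceiling(size_mb: float, ceiling_mb: float = PRACTICAL_CEILING_MB) -> bool:
    """True when a rendered document is larger than the practical ceiling."""
    return size_mb > ceiling_mb


for ram_gb in (8, 16):
    limit = rule_of_thumb_max_rendered_mb(ram_gb)
    note = "above" if exceeds_practical_ceiling(limit) else "within"
    print(f"{ram_gb}GB RAM: rule of thumb allows {limit:.0f}MB ({note} the ~15MB ceiling)")
```

<p>Under these assumptions, both 8GB and 16GB produce a limit well above ~15MB, which is why adding RAM alone may not make very large rendered documents practical.</p>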
<blockquote>
<p>As for aggregating multiple index blocks into one, in this
particular case, we don't need to; having eight separate reports
would work just as well.</p>
</blockquote>
<p>This may be your best bet then. If you can find a way to
logically divide these blocks out into different folders, you can
then index each folder with its own Index Block.</p>
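<p>One way to plan such a split is sketched below: sort the block names and assign them to folders of at most 1000 each, so that no block lands in two folders. The folder naming scheme and the 1000-per-folder figure are assumptions for illustration; moving the blocks inside the CMS is not shown.</p>

```python
from typing import Dict, List, Sequence


def assign_to_folders(
    block_names: Sequence[str],
    max_per_folder: int = 1000,
    folder_prefix: str = "report-group-",  # hypothetical folder naming scheme
) -> Dict[str, List[str]]:
    """Split block names into non-overlapping groups, one per folder."""
    if max_per_folder < 1:
        raise ValueError("max_per_folder must be at least 1")
    ordered = sorted(block_names)
    groups: Dict[str, List[str]] = {}
    for start in range(0, len(ordered), max_per_folder):
        folder = f"{folder_prefix}{start // max_per_folder + 1:02d}"
        groups[folder] = ordered[start:start + max_per_folder]
    return groups


if __name__ == "__main__":
    names = [f"block-{n:04d}" for n in range(8000)]
    plan = assign_to_folders(names)
    for folder, members in plan.items():
        print(folder, len(members))
```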
<p>If your data were stored in Pages as opposed to Blocks, I would
have recommended publishing to a database and then creating your
reports by querying that directly. Since Blocks aren't publishable
assets, this won't work in this case.</p>
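<p>For comparison, this is roughly what querying published data could look like if it were available in a database. The table name, columns, and sample rows are invented for this sketch and use an in-memory SQLite database; they do not reflect the schema that a real database publish produces.</p>

```python
import sqlite3
from typing import Dict


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical table of published pages."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE published_pages (path TEXT PRIMARY KEY, department TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO published_pages VALUES (?, ?, ?)",
        [
            ("/bio/intro", "biology", "active"),
            ("/bio/labs", "biology", "retired"),
            ("/chem/intro", "chemistry", "active"),
        ],
    )
    conn.commit()
    return conn


def count_by(conn: sqlite3.Connection, column: str) -> Dict[str, int]:
    """Count rows grouped by one of the known metadata columns."""
    allowed = {"department", "status"}
    if column not in allowed:
        # Column names cannot be bound as parameters, so restrict them explicitly.
        raise ValueError(f"unsupported column: {column}")
    rows = conn.execute(
        f"SELECT {column}, COUNT(*) FROM published_pages GROUP BY {column} ORDER BY {column}"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    connection = build_demo_db()
    print(count_by(connection, "department"))
    print(count_by(connection, "status"))
```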
<p>Let me know what you think about splitting into different
folders and indexing each of those individually.</p>
<p>Thanks</p></div>
<p>Tim</p>
<p>2016-02-05T22:01:02Z (updated 2016-02-05T22:01:39Z)</p>
<div><p>Thanks, Tim.</p>
<p>As for Pages, that might be possible. We'd need to update the
transform (and maybe the template?) to publish the metadata we
need. Ideally that metadata wouldn't be accessible to our external
users, but maybe we can create a second format just for this
purpose and publish to our QA site. Hmm...</p>
<p>As for dividing the folder into smaller chunks, it's not
practical to do right now unless we make copies of the blocks when
we need to run the report... which isn't horrible, but it's not
ideal.</p>
<p>Cheers,<br>
Geoff</p></div>
<p>geoff</p>
<p>2016-02-06T13:56:00Z</p>
<div><p>Hi Geoff,</p>
<p>When you work with a large number of assets, your best bet is to
use web services. I strongly encourage you to check out my <a href="http://www.upstate.edu/cascade-admin/projects/web-services/courses/online-tutorials.php">online tutorials</a>, which are still ongoing. Web services are a
good fit here for the following reasons:</p>
<ol>
<li>You can run your program daily, possibly after
midnight</li>
<li>You can write any reports to a database or create XML
files</li>
<li>Generating reports using my library is relatively easy, and I
have been posting programs on <a href="https://github.com/wingmingchan/">GitHub</a></li>
</ol>
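<p>The "create XML files" option in the list above could look something like the following sketch, which turns a list of already-retrieved records into a dated XML report. It uses only the Python standard library rather than the PHP library mentioned here, and the record fields, element names, and file naming are assumptions for illustration. Scheduling the script to run after midnight would be left to cron or a similar tool.</p>

```python
import xml.etree.ElementTree as ET
from datetime import date
from pathlib import Path
from typing import Dict, Iterable


def build_report(records: Iterable[Dict[str, str]], report_date: date) -> ET.Element:
    """Build a <report> element with one <asset> child per record."""
    root = ET.Element("report", {"generated": report_date.isoformat()})
    for record in records:
        asset = ET.SubElement(root, "asset", {"path": record["path"]})
        for key, value in sorted(record.items()):
            if key != "path":
                ET.SubElement(asset, "field", {"name": key}).text = str(value)
    return root


def write_report(root: ET.Element, out_dir: Path, report_date: date) -> Path:
    """Write the report to out_dir as report-YYYY-MM-DD.xml and return the path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"report-{report_date.isoformat()}.xml"
    ET.ElementTree(root).write(str(out_path), encoding="utf-8", xml_declaration=True)
    return out_path


if __name__ == "__main__":
    # Hypothetical records; in practice these would come from web service reads.
    sample = [
        {"path": "/data/block-one", "owner": "geoff", "category": "news"},
        {"path": "/data/block-two", "owner": "tim", "category": "events"},
    ]
    report = build_report(sample, date.today())
    print(ET.tostring(report, encoding="unicode"))
```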
<p>Let me know if you need any pointers.</p>
<p>Wing</p></div>
<p>Wing Ming Chan</p>
<p>2016-02-09T18:26:26Z</p>
<div><p>Thanks, Wing. (Bumping web services up another rung on my
priorities list.)</p></div>
<p>geoff</p>
<p>2016-02-09T18:55:24Z</p>
<div><p>Hey Geoff,</p>
<blockquote>
<p>As for dividing the folder into smaller chunks, it's not
practical to do right now unless we make copies of the blocks when
we need to run the report... which isn't horrible, but it's not
ideal.</p>
</blockquote>
<p>I had forgotten about this part of it, so I'm going to go back
to my initial feeling (and Wing's) that Web Services will be the
way to go for this. Let us know how things go if you decide to take
this on!</p>
<p>Wing, thanks for sharing links to your library!</p></div>
<p>Tim</p>
<p>2016-02-10T13:57:39Z (updated 2016-02-10T14:18:28Z)</p>
<div><p>I have just posted a web service program on <a href="https://github.com/wingmingchan/php-cascade-ws-ns-examples/blob/master/xml_feed/index_block_xml.php">github/wingmingchan</a>, showing how to use web services to
retrieve index block XML and output it as a feed. A feed block can
then be used to read the feed and make the XML available to
Velocity. This approach can be modified to generate reports and
make them available to Cascade. We can also use AJAX to read the
feed directly without using a feed block.</p></div>
<p>Wing Ming Chan</p>
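<p>As a sketch of the "modified to generate reports" idea, the following reads index-block-style XML from a string and flattens it into CSV rows. This is not the linked PHP program. The element names (<code>system-block</code>, <code>name</code>, <code>path</code>, <code>dynamic-metadata</code>) are assumed for this example and should be checked against the XML your own Index Block renders; fetching the feed over HTTP is left out.</p>

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample that imitates index block output; verify against real XML.
SAMPLE_XML = """
<system-index-block name="reports" type="folder">
  <system-block id="a1">
    <name>block-one</name>
    <path>/data/block-one</path>
    <dynamic-metadata><name>category</name><value>news</value></dynamic-metadata>
  </system-block>
  <system-block id="a2">
    <name>block-two</name>
    <path>/data/block-two</path>
    <dynamic-metadata><name>category</name><value>events</value></dynamic-metadata>
  </system-block>
</system-index-block>
"""


def rows_from_index_xml(xml_text: str) -> List[Dict[str, str]]:
    """Return one flat dict per block: name, path, and any metadata fields."""
    root = ET.fromstring(xml_text)
    rows: List[Dict[str, str]] = []
    for block in root.iter("system-block"):
        row = {
            "name": block.findtext("name", default=""),
            "path": block.findtext("path", default=""),
        }
        for field in block.findall("dynamic-metadata"):
            key = field.findtext("name")
            if key:
                row[key] = field.findtext("value", default="")
        rows.append(row)
    return rows


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize rows to CSV text with alphabetically sorted columns."""
    columns = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(rows_from_index_xml(SAMPLE_XML)))
```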