On a hunt for some RSS trickery. The objective is to capture RSS data in such a way as to also create archives of the data. It’s a hard Google slog, as the functionality I’m after is in demand by SEO types, and consequently there is a raft of “FOR ONLY $50 I WILL SHOW YOU HOW TO BUILD A NO-LABOR TRAFFIC MAGNET WEB SITE ON THE INTARWEB” baloney.
Here’s what I have come up with, for personal reference.
MagpieRSS seems to be the default tool for any and all server-side RSS post-processing activities.
Lilina is a server-side news aggregator. It’s pretty configurable but relies on an aging-out scheme to dispose of older news items. I haven’t found a way to actually write out persistent archives, but the default collection and presentation are based on time, so this might work.
SimplePie, like MagpieRSS, seems to be a popular RSS munger. However, there is a dearth of projects leveraging it. There is a comprehensive how-to in the now-closed support forum on snipping off a cold HTML archive.
The very closest I have gotten, however, is the maybe-too-simple RSSMinisite, which does exactly what I want, but which has some design flaws. It outputs each RSS item to a flat text file in a data archive, which is great, but the filenames are drawn from the ‘title’ element of the item directly, with no error checking. This leads to file overwrites. Restructuring the file-write to incorporate date-time stamps plus a partial title element would solve that problem, and adding a date-based subdirectory creation routine would solve any concerns about overpopulated directories, at least in the short term.
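To make the fix concrete, here’s a sketch of the filename scheme I have in mind. The tools above are PHP, but the logic is the same in any language; this is Python, and the function name `archive_path`, the 40-character slug cap, and the year/month subdirectory layout are all my own choices, not anything RSSMinisite actually does.

```python
import re
from datetime import datetime
from pathlib import Path

def archive_path(root: str, title: str, published: datetime) -> Path:
    """Build a collision-resistant archive path for one RSS item.

    The date-time stamp keeps the name unique even when two items
    share a title, and the YYYY/MM subdirectory keeps any single
    directory from growing without bound.
    """
    # Reduce the title to a short, filesystem-safe slug: collapse
    # every run of non-alphanumeric characters to a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()[:40]
    stamp = published.strftime("%Y%m%d-%H%M%S")
    subdir = Path(root) / published.strftime("%Y") / published.strftime("%m")
    return subdir / f"{stamp}-{slug}.txt"

# For example, an item titled "Hello, World!" published on
# 2006-05-04 12:30:00 lands at:
#   archive/2006/05/20060504-123000-hello-world.txt
print(archive_path("archive", "Hello, World!", datetime(2006, 5, 4, 12, 30)))
```

Even if two items share a title, they’d need identical publication timestamps to collide, and a real version would `mkdir -p` the subdirectory before writing.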
The true short-term goal I’m after, though, is just an RSS update list that isn’t capped at a fixed item count. For that, I think Lilina meets my needs sufficiently.
(UPDATE: Fixed Lilina link.)