Single or multiple feed aggregation?


I recently published a beta version of my feeds plugin for CakePHP. This plugin was previously the FeedAggregator component, but it made more sense to break it up into a datasource and a model, and finally package it as a plugin. The datasource is pretty straightforward: it accepts an array of URLs (RSS feeds), fetches each one through an HTTP request, parses the XML into an array, and then returns the result. The model is simply there for your convenience.
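To make the "parse the XML into an array" step concrete, here is a rough sketch using PHP's SimpleXML. It is not the plugin's actual code: the function name `parseRssString()` and the returned keys are made up for illustration, it assumes a plain RSS 2.0 document, and it works on an inline sample string so it runs without an HTTP request.

```php
<?php
// Hypothetical helper, not the plugin's actual code: turn one RSS 2.0
// document (already fetched as a string) into a plain PHP array.
function parseRssString($xml) {
	// Collect libxml errors internally instead of emitting warnings on bad XML.
	$previous = libxml_use_internal_errors(true);
	$doc = simplexml_load_string($xml);
	libxml_clear_errors();
	libxml_use_internal_errors($previous);

	// Not well-formed XML, or no <channel> element (so not RSS 2.0).
	if ($doc === false || !isset($doc->channel)) {
		return false;
	}

	$items = array();
	foreach ($doc->channel->item as $item) {
		$items[] = array(
			'title' => (string) $item->title,
			'link' => (string) $item->link,
			'date' => (string) $item->pubDate
		);
	}

	return array(
		'title' => (string) $doc->channel->title,
		'items' => $items
	);
}

// Inline sample feed (assumed data) so the sketch needs no network access.
$sample = <<<XML
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <pubDate>Mon, 01 Feb 2010 10:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>
XML;

$feed = parseRssString($sample);
```

Returning `false` for unparseable input, rather than letting a warning bubble up, is just one possible convention; the real datasource may report errors differently.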

The dilemma I am running into is whether the datasource should parse only one feed at a time or multiple feeds (the current behaviour). It can go either way: the datasource parses multiple feeds and the model simply returns them, or the datasource parses one feed and the model manages multiple connections and the merging. So the big question for you guys: should the datasource parse one feed at a time or multiple feeds?
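For comparison, the second option (a single-feed parser with a separate layer doing the merging) might look roughly like the sketch below. Everything here is an assumption for illustration: `parseFeedItems()`, `mergeFeeds()`, the `items`/`errors` keys and the sample XML are hypothetical, the feeds are passed in as already-fetched strings, and a feed that fails to parse is recorded rather than aborting the whole request.

```php
<?php
// Hypothetical sketch of the "single-feed parser + merging layer" option.
// Function and key names are illustrative, not the plugin's API.

// Parse ONE RSS 2.0 string into a list of items; returns false on bad XML.
function parseFeedItems($xml) {
	$previous = libxml_use_internal_errors(true);
	$doc = simplexml_load_string($xml);
	libxml_clear_errors();
	libxml_use_internal_errors($previous);

	if ($doc === false || !isset($doc->channel)) {
		return false;
	}

	$items = array();
	foreach ($doc->channel->item as $item) {
		$items[] = array(
			'title' => (string) $item->title,
			'time' => strtotime((string) $item->pubDate)
		);
	}
	return $items;
}

// Merge MANY feeds: $sources maps a label to raw XML (already fetched).
function mergeFeeds(array $sources, $limit = 10) {
	$merged = array();
	$errors = array();

	foreach ($sources as $label => $xml) {
		$items = parseFeedItems($xml);
		if ($items === false) {
			$errors[] = $label; // one bad feed does not abort the others
			continue;
		}
		foreach ($items as $item) {
			$item['source'] = $label;
			$merged[] = $item;
		}
	}

	// Sort newest first across all feeds.
	usort($merged, function ($a, $b) {
		if ($a['time'] == $b['time']) {
			return 0;
		}
		return ($a['time'] > $b['time']) ? -1 : 1;
	});

	return array(
		'items' => array_slice($merged, 0, $limit),
		'errors' => $errors
	);
}

// Assumed sample data: two tiny feeds plus one deliberately broken one.
$feedA = '<rss version="2.0"><channel><title>A</title><item><title>Old post</title>'
	. '<pubDate>Mon, 01 Feb 2010 10:00:00 +0000</pubDate></item></channel></rss>';
$feedB = '<rss version="2.0"><channel><title>B</title><item><title>New post</title>'
	. '<pubDate>Tue, 02 Feb 2010 10:00:00 +0000</pubDate></item></channel></rss>';

$result = mergeFeeds(array(
	'Feed A' => $feedA,
	'Feed B' => $feedB,
	'Broken' => '<rss><channel>'
));
```

The trade-off is visible in the shape of the code: the parser stays small and easy to test, while sorting, limiting and error collection all move up into the merging layer.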

Currently you use the model to pass an array of URLs (through the conditions option), the limit, which fields (elements in the XML) you want returned, and some cache/feed settings. Here is a quick example:

// Multiple feed parsing
$feeds = $this->Aggregator->find('all', array(
	'conditions' => array(
		'Starcraft 2 Armory' => 'http://feeds.feedburner.com/starcraft',
		'Miles Johnson' => 'http://feeds.feedburner.com/milesj'
	),
	'feed' => array(
		'cache' => 'feedCacheKey',
		'expires' => '+24 hours',
		'explicit' => true
	)
));

And I am assuming single feed parsing would look something like this:

// Single feed parsing (hypothetical)
$feed = $this->Aggregator->find('first', array(
	'conditions' => array('http://feeds.feedburner.com/milesj'),
	'feed' => array(
		'cache' => 'feedCacheKey',
		'expires' => '+24 hours',
		'explicit' => true
	)
));

I am kind of split on how I should go about this and would really love your opinion. I am currently leaning towards multiple feed parsing (the current implementation), but if someone has a good argument against it, I will change it.

5 Comments

  • I can't see a use for this in any of the projects I'm currently planning, so without looking at it from a technical perspective, I would tend to agree with basically everything Sam D mentioned.

    Oh, I see...it's like a plugin replacement for something like MagpieRSS but with extra features. I still agree with Sam D though.
    Brendon Kozlowski ⋅
  • @Fabio Sussetto - I hear where you're coming from and that makes a lot of sense. I may need to write a test case for this.

    It just seems unnecessary to have to create a model for each RSS feed you want to parse.
  • It seems that it should do both. The only case I can think of where one might want to restrict it to a single feed is if errors in one feed were fatal to the whole request. In my experience with feeds there is a lot of variance and just plain bad XML. As long as you handle errors gracefully and give some granularity in the error reporting, I can't see why one would not offer parsing many at a time.
  • For me, the best approach is the following one:
    the Datasource in your plugin should emulate the core CakePHP datasources. Think of multiple feeds as multiple tables in a database. You don't have a datasource for each table, but one which offers methods to interact with the source of data.
    Following this analogy, I would suggest dropping the Model from your plugin and refactoring its functionality into a Behavior.
    This way, each model in your app will model only its own kind of data. For instance, you will have a CnnNews model and a WeatherPrevision model, both acting as RSS.
    This way your plugin is really portable across multiple applications, independent of the data each model represents.
    This is just a quick thought, hope you'll find it useful.
    Fabio Sussetto
  • Also, if you have any suggestions for new features, let me have 'em!