Add RSS feeds to your Web site using MagpieRSS

By Gary Sims

Syndication of material from other sites is a good way to get fresh content on your site. As visitors arrive at your site, they can see a teaser for the syndicated content and a link to the publisher. If they are interested in an item, they can follow the link to the original location. You can add syndicated content to your site using the Really Simple Syndication (RSS) format and a bit of PHP code in the form of an application called MagpieRSS. Here's how.

MagpieRSS is an RSS parser written in PHP. It supports RSS 0.9 and 1.0, with some RSS 2.0 support. MagpieRSS is a simple object-oriented backend which includes automatic caching of parsed RSS to reduce load on external Web sites. To use MagpieRSS on your Web site you will need PHP 4.2.0 (or greater) with XML (expat) support or PHP5 with libxml2 support.

Download MagpieRSS. Extract the four main files (rss_fetch.inc, rss_parser.inc, rss_cache.inc, and rss_utils.inc) and the directory extlib and copy them to a directory named magpierss in the document root (where your web pages are stored) on your Web server.

Next, decide where the syndicated content will go on your site. If you want the content to appear on your front page, then most likely you need to edit index.php. If you only have an index.html (not .php), rename it to index.php, because you are now adding PHP code to it. Edit the file and add the following lines at the top:

	<?php
	require_once('magpierss/rss_fetch.inc');
	?>

To fetch and parse the RSS feed, add a call to fetch_rss() in your Web page:


	<?php
	$rss = fetch_rss("http://www.newsforge.com/index.rss");
	?>

Replace the newsforge.com URL with the URL of the feed you are syndicating.

This function returns an object ($rss) that holds the syndicated content as an array of items, along with some publisher information: the name of the publisher (stored in $rss->channel['title']), an optional description of the publisher (e.g. "The Online Newspaper for Linux and Open Source") in $rss->channel['description'], and a general link to the publisher in $rss->channel['link']. If the feed cannot be fetched or parsed, fetch_rss() returns false instead.

The syndicated items can be accessed via $rss->items. A simple loop can be used to traverse the items one at a time:

	foreach ($rss->items as $item)
	{ // Your code here
	}

Each item contains a title, link and description. $item['title'] contains the title of the article or story, $item['link'] is the link to the original and $item['description'] is the description of the story which is often the introduction paragraph or some sort of summary.

To display a simple list of syndicated items use the following code:

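<?php
require_once('magpierss/rss_fetch.inc');
$rss = fetch_rss('http://www.newsforge.com/index.rss');

// fetch_rss() returns false if the feed could not be fetched or parsed.
if ($rss) {
	// Channel heading: the publisher's name, linked to the publisher's site.
	echo '<a href="'.$rss->channel['link'].'"><B>'.$rss->channel['title'].'</B></a>';

	foreach ($rss->items as $item) {
		$href = $item['link'];
		$title = $item['title'];
		// Not every item carries a description.
		$desc = isset($item['description']) ? $item['description'] : '';
		echo "<P><a href=\"$href\">$title</a><BR>";

		if ($desc)
			echo $desc;
	}
}
?>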

Of course, the output is simple HTML, but from here you should be able to expand the code to suit the style and design of your site.

Lengths

In the 0.91 version of the RSS specification the title is restricted to 100 characters and the description to 500. However, there are no length limits in RSS 0.92 and greater. This can cause a problem for your Web site. Keeping a consistent look and feel is important, and if syndicated items are displayed too freely your site may start to look disjointed. Imagine you have code similar to the above listing news items in a side bar. What will happen if the description of a particular item is several thousand characters long? Or what if there were 50 items being syndicated?

Length validation is an important part of putting syndicated items on your site. To limit the number of items displayed, the easiest thing to do is slice the resulting array into a smaller chunk. After calling fetch_rss(), but before any processing, add the following line:

$items = array_slice($rss->items, 0, 10);

This gives you a new array, $items, that contains only the first 10 items. Run the foreach loop over $items instead of $rss->items and the list can never grow beyond 10 entries.
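
Putting the slice and the loop together might look like the following sketch. The $feed_items array is made-up sample data standing in for $rss->items so that the snippet runs without MagpieRSS, and limit_items() is a hypothetical helper name, not part of the library.

```php
<?php
// Hypothetical helper: keep at most $max items from a feed's item list.
function limit_items($items, $max = 10)
{
    // array_slice() returns a new array; the original list is left untouched.
    return array_slice($items, 0, $max);
}

// Made-up sample data standing in for $rss->items.
$feed_items = array();
for ($i = 1; $i <= 50; $i++) {
    $feed_items[] = array(
        'title' => "Story $i",
        'link'  => "http://www.example.com/story/$i",
    );
}

$items = limit_items($feed_items, 10);

// Loop over the sliced copy, not over the full list.
foreach ($items as $item) {
    echo '<P><a href="' . htmlspecialchars($item['link']) . '">'
        . htmlspecialchars($item['title']) . "</a>\n";
}
```

In a real page you would pass $rss->items to the helper in place of $feed_items.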

To validate the length of the description use the strlen() function:

if (strlen($desc) >= 80)
{
	$desc = substr($desc,0,79)."...";
}

Here the description is cut to its first 79 characters and ... is appended to show that more text is available. If you have shortened the description, you might want to offer a link to the full syndicated text rather than a link to the original. That way users remain on your site longer. The full version of the syndicated item would then contain the link to the original.

To do this you will need a script called something like readmore.php, which uses MagpieRSS to display the full text as the main item on the Web page along with your normal navigation, side bars and advertising. The parameters to readmore.php would be the URL of the RSS feed and the number of the item you wish to display.

One thing to watch is that some descriptions contain HTML as well as plain text. Blindly chopping the string could cut it halfway through a tag. Smart chopping is needed, but that is beyond the scope of this article. A crude solution is to use the PHP function strip_tags(), which removes all the HTML tags from the string. The full-text version of the item displayed by readmore.php can leave the tags intact.
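
The crude truncation and the read-more link described above can be sketched as follows. Both function names, and the feed and item query parameters, are hypothetical choices made for this example; readmore.php itself is not shown. Unlike the earlier strlen() snippet, this helper counts the three dots toward the limit.

```php
<?php
// Hypothetical helper: strip HTML, then cut the text so that the result,
// including the trailing "...", is at most $max bytes long.
// strlen() and substr() count bytes, so multi-byte text may need the
// mb_strlen() and mb_substr() functions instead.
function shorten_description($desc, $max = 80)
{
    // Remove tags first so the cut cannot land inside one.
    $text = trim(strip_tags($desc));

    if (strlen($text) <= $max) {
        return $text;
    }

    // Leave room for the three dots.
    return substr($text, 0, $max - 3) . '...';
}

// Hypothetical helper: build the link to the full-text page.
// "feed" and "item" are assumed parameter names for readmore.php.
function readmore_link($feed_url, $index)
{
    return 'readmore.php?feed=' . urlencode($feed_url)
        . '&item=' . (int) $index;
}

// Made-up description standing in for $item['description'].
$desc = '<p>MagpieRSS is an <b>RSS parser</b> written in PHP. '
    . 'It caches parsed feeds to reduce load on external Web sites.</p>';

$short = shorten_description($desc, 40);
$link  = readmore_link('http://www.newsforge.com/index.rss', 3);

// Escape the link for use inside an HTML attribute.
echo $short . ' <a href="' . htmlspecialchars($link) . '">Read more</a>';
```

If you build readmore.php along these lines, it would be sensible for it to accept only feed URLs that you actually syndicate, rather than fetching whatever URL a visitor passes in.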

Caching

To speed up your Web site and save unnecessary traffic for the publisher's Web server, it is important to set up caching. MagpieRSS has built-in automatic caching. With caching, MagpieRSS only fetches and parses an RSS feed when the cached copy is too old.

To enable caching, add the following lines to your script before the call to fetch_rss():

        define('MAGPIE_CACHE_DIR', '/tmp/mysite_magpie_cache');
        define('MAGPIE_CACHE_ON', 1);

By default MagpieRSS will cache items for one hour. This can be overridden using:


        define('MAGPIE_CACHE_AGE', 1800);

Here 1800 is the number of seconds to cache objects (30 minutes, or 60*30).
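
As a rough sketch of how these settings fit into a page, the script below puts the three defines first and only then loads MagpieRSS and fetches the feed. It assumes the constants need to be in place before fetch_rss() is first called. The is_readable() guard is an addition for this example only, so that the snippet still runs on a machine where MagpieRSS is not installed.

```php
<?php
// Cache settings, defined before fetch_rss() is called (an assumption of
// this sketch about when MagpieRSS reads them).
define('MAGPIE_CACHE_DIR', '/tmp/mysite_magpie_cache');
define('MAGPIE_CACHE_ON', 1);
define('MAGPIE_CACHE_AGE', 60 * 30); // 30 minutes, in seconds

// Guard added for this sketch only: skip the fetch if the library
// has not been copied into the magpierss directory yet.
if (is_readable('magpierss/rss_fetch.inc')) {
    require_once('magpierss/rss_fetch.inc');

    // Page loads within the cache age should reuse the cached copy
    // instead of contacting the publisher again.
    $rss = fetch_rss('http://www.newsforge.com/index.rss');
}
```

Pick a cache directory that the Web server user is allowed to write to.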

Conclusion

MagpieRSS makes it easy to add syndicated content to your Web site. With only a small time investment you can improve the content of your site. Remember, "content is king."

Gary Sims has a degree in Business Information Systems from a British university. He worked for 10 years as a software engineer and is now a freelance Linux consultant and writer.