A study on RSS - Part 1 XML DOM
Today someone asked me, “What is RSS?”
Well, I know what it is, but at that moment the words failed me, especially the “non-programmer” words I needed to explain to “the common folk” what it really means and what it is used for. So I decided to write this series of posts. I don’t know how many there will be or which sequence I’ll follow, but I guarantee I will try to cover all the basics.
Wikipedia defines RSS as an XML specification for web feeds, used for web syndication of site content.
“So what?” you say, WTF you exclaim!.. well let’s take it in small steps..
XML is a format widely used on the web (technical note: it is used in general for encoding any structured data), a kind of global language defined for communication between sites and systems. Being a standard means everyone understands its syntax, something like a grammar, I would say.
So that doesn’t get your gears going? I guess you are not a developer, so I’ll explain what it’s good for. In case you are a developer, stick around and I’ll start you on the road to building an RSS feed.
RSS is the “language” spoken by feeds, which are essentially a summary of a site’s contents (RSS: Rich Site Summary). A feed sends summarized data (title, description and link) for each piece of content. For example, on ComuniWEB (a Brazilian news site), where I’m currently employed, the RSS feed dishes out the latest news published on the site. With that, I can use an “RSS reader” to view the latest headlines, and if one of them catches my eye, I just click on it and it takes me directly to the full story.
You can check out an example in the sidebar under the title “ComuniWeb - Últimas”. This section reads the last 5 headlines from ComuniWEB.
OK, so now you hopefully understand a little more about RSS feeds. Go play around for a bit: find an RSS reader like Google Reader and try out a few feeds. You can even add them to personal pages like myYahoo.
OK, so you are a developer, and you have a blog, a news site or some content you’d like to share? Want to know how to do it? I’ll start you on your way. In today’s post I’m going to start with the basics: creating an XML file from PHP data. In the next one I should get into the RSS file structure, and then, who knows? Parsing RSS.
First Step: XML Support in PHP
PHP has a few options for reading and writing XML, from bare-bones direct string creation to solid objects like DOM. If I have to pick one, DOM is the winner: it’s stable and has a simple but powerful logic, once you get the hang of it.
Let’s create a file with the following structure:
<?xml version="1.0" encoding="utf-8"?>
<news>
  <item id="162696">
    <title>Brazil loses the World Cup</title>
    <link>index.php?idpag=20&amp;idmat=162696</link>
  </item>
</news>
First we need to create a new XML Doc via DOM:
$xmlDoc = new DOMDocument('1.0', 'utf-8');
$xmlDoc->formatOutput = true;
Now the $xmlDoc variable contains a DOMDocument instance, and by setting the “formatOutput” property to true we tell it to produce tidy, indented output. Next we will create the root element, called “news” in the example:
$news = $xmlDoc->createElement('news');
$news = $xmlDoc->appendChild($news);
In the first line we create an element called news from the main XML Doc; it has no content yet. In the second line we attach that new node to the document itself, which makes it the root element, and store the return value back in $news so as to have an up-to-date reference to the node.
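If you want to convince yourself of what those two lines do, here is a small sketch. The variable names $detached, $returned and $isRoot are mine, added only for illustration; the DOM calls are the standard ones.

```php
<?php
$xmlDoc = new DOMDocument('1.0', 'utf-8');

// createElement() builds a node owned by the document but not attached yet.
$news = $xmlDoc->createElement('news');
$detached = ($news->parentNode === null);

// appendChild() attaches the node and returns the appended node.
$returned = $xmlDoc->appendChild($news);

// The document now exposes that node as its root element.
$isRoot = $xmlDoc->documentElement->isSameNode($returned);
```

So the node only becomes part of the output once it is appended, and appendChild() hands back a reference to the attached node.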
Now we need to create an item node with the attribute “id”, check it out:
$item = $xmlDoc->createElement('item');
$item->setAttribute('id', '162696');
$item = $news->appendChild($item);
Once more we create a node, always from the root Doc. Next we set the attribute, passing its name and then its value. In the following line we connect it to the parent node, in this case $news. Notice the difference? This time we call appendChild() on $news instead of $xmlDoc, which means the item node is now a child of the news node created before.
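A quick way to check the result of this step is to read the attribute back and serialize just the news node. This is only a sketch; $id, $parentName and $fragment are names I made up for the check.

```php
<?php
$xmlDoc = new DOMDocument('1.0', 'utf-8');
$news = $xmlDoc->appendChild($xmlDoc->createElement('news'));

$item = $xmlDoc->createElement('item');
$item->setAttribute('id', '162696');   // attribute name first, then its value
$item = $news->appendChild($item);

// Read things back to confirm the structure.
$id = $item->getAttribute('id');
$parentName = $item->parentNode->nodeName;

// Passing a node to saveXML() serializes only that node, without the XML declaration.
$fragment = $xmlDoc->saveXML($news);
```

Without formatOutput the fragment comes out on a single line, which is handy for checks like this one.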
Now we simply add the Title and the Link nodes to the item node finalizing the addition process, but notice the slight difference here:
$title = $xmlDoc->createElement('title', utf8_encode('Brazil loses the World Cup'));
$title = $item->appendChild($title);
$link = $xmlDoc->createElement('link', htmlentities('index.php?idpag=20&idmat=162696'));
$link = $item->appendChild($link);
Notice that this time around we set a value on the node as we create it. Since my real data is in Portuguese, two points are worth noting: 1 - Convert Latin characters to UTF-8, here using utf8_encode(); 2 - Convert HTML entities: XML has problems with a bare & because it marks the start of an entity, so it must be written as &amp; instead, here using htmlentities().
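A note for anyone on a recent PHP: utf8_encode() is deprecated as of PHP 8.2, and htmlentities() can emit HTML-only named entities (such as &eacute;) that XML does not define. The sketch below shows one possible alternative, not the approach used in this post: it assumes the mbstring extension is available and that the source text is ISO-8859-1, and it lets DOM do the escaping through createTextNode(). The sample string "Últimas" is just an illustration.

```php
<?php
// Hypothetical Latin-1 input: "\xDA" is "Ú" in ISO-8859-1 (as in "Últimas").
$latin1 = "\xDAltimas";
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

$xmlDoc = new DOMDocument('1.0', 'utf-8');
$item = $xmlDoc->appendChild($xmlDoc->createElement('item'));

// createTextNode() takes raw text; special characters are escaped on output.
$title = $xmlDoc->createElement('title');
$title->appendChild($xmlDoc->createTextNode($utf8));
$item->appendChild($title);

$link = $xmlDoc->createElement('link');
$link->appendChild($xmlDoc->createTextNode('index.php?idpag=20&idmat=162696'));
$item->appendChild($link);

$titleText = $title->textContent;      // the UTF-8 text as stored in the node
$linkXml = $xmlDoc->saveXML($link);    // the serialized node, with & escaped
```

The design choice here is to keep raw text in your own variables and leave escaping to the serializer, so nothing gets escaped twice.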
Ok, almost there. Now we need to finalize and spit it out to the screen:
header("Content-Type: application/xml; charset=utf-8");
echo $xmlDoc->saveXML();
First we send a header so the browser interprets the content correctly, then we echo the result of the saveXML() method, which returns the document as a string. To write it to a file instead, use save().
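For the file case, here is a minimal sketch of both ways of getting the document out. The temporary path built with tempnam() is only a placeholder for wherever you would really store the file.

```php
<?php
$xmlDoc = new DOMDocument('1.0', 'utf-8');
$xmlDoc->formatOutput = true;
$xmlDoc->appendChild($xmlDoc->createElement('news'));

// saveXML() returns the serialized document as a string.
$asString = $xmlDoc->saveXML();

// save() writes the document to a file and returns the number of bytes written (or false).
$path = tempnam(sys_get_temp_dir(), 'news');   // hypothetical target path
$bytes = $xmlDoc->save($path);
$fromDisk = file_get_contents($path);
unlink($path);                                 // clean up the temporary file
```

In a real feed you would probably write the file from a scheduled job and let the web server deliver it, instead of rebuilding the XML on every request.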
Check the complete script below and a working example here
<?php
// Create the document and ask for indented output
$xmlDoc = new DOMDocument('1.0', 'utf-8');
$xmlDoc->formatOutput = true;

// Root element
$news = $xmlDoc->createElement('news');
$news = $xmlDoc->appendChild($news);

// Item node with its id attribute, attached to the root
$item = $xmlDoc->createElement('item');
$item->setAttribute('id', '162696');
$item = $news->appendChild($item);

// Title and link nodes, attached to the item
$title = $xmlDoc->createElement('title', utf8_encode('Brazil loses the World Cup'));
$title = $item->appendChild($title);
$link = $xmlDoc->createElement('link', htmlentities('index.php?idpag=20&idmat=162696'));
$link = $item->appendChild($link);

// Send the header and print the XML
header("Content-Type: application/xml; charset=utf-8");
echo $xmlDoc->saveXML();
?>
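Since the goal is to create XML from PHP data, the same steps can be wrapped in a loop. The sketch below is only one way to do it: the buildNewsXml() function, the array keys and the second headline are all made up for illustration, standing in for whatever your database returns, and the input is assumed to be UTF-8 already.

```php
<?php
// Hypothetical helper: builds the <news> document from an array of rows.
function buildNewsXml(array $rows): string
{
    $xmlDoc = new DOMDocument('1.0', 'utf-8');
    $xmlDoc->formatOutput = true;
    $news = $xmlDoc->appendChild($xmlDoc->createElement('news'));

    foreach ($rows as $row) {
        $item = $xmlDoc->createElement('item');
        $item->setAttribute('id', (string) $row['id']);
        $news->appendChild($item);

        // One child node per field; createTextNode() handles escaping.
        foreach (['title', 'link'] as $field) {
            $node = $xmlDoc->createElement($field);
            $node->appendChild($xmlDoc->createTextNode($row[$field]));
            $item->appendChild($node);
        }
    }

    return $xmlDoc->saveXML();
}

// Made-up sample data standing in for a database result.
$xml = buildNewsXml([
    ['id' => 162696, 'title' => 'Brazil loses the World Cup',
     'link' => 'index.php?idpag=20&idmat=162696'],
    ['id' => 162697, 'title' => 'Second headline (placeholder)',
     'link' => 'index.php?idpag=20&idmat=162697'],
]);
```

From there, $xml can be echoed after the header exactly as in the script above.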
So I’ll wrap it up here for now. I hope this simple RSS introduction has already cleared up some of your questions and doubts about XML and DOM. Next time around I’ll get into the RSS specifications and their evolution from 0.91 to 2.0, now known as Really Simple Syndication.