Reading XML with PHP, Extracting the Data

In the first part of this tutorial, we designed an XML file that contained categories and resources for reference links. In this part, we will use PHP to extract the information from the XML document and generate an HTML file from the information.

Extracting the Data from the XML File

Making Sure The File Is There

As a good practice, we first want to make sure the file exists before we start pulling information from it. We will use the function file_exists() to check that the file is there (some evil co-worker may have deleted it while we weren’t looking).

if (file_exists("webdoc.xml")) {
    process_xml();
} else {
    echo "<b>FILE: webdoc.xml does not exist</b>";
}

If the file exists, file_exists() returns true and the function process_xml() is run. If the file does not exist, we print a statement in bold saying so.
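A file can exist and still fail to parse, in which case simplexml_load_file() returns false. The following is a minimal sketch of one way to cover both cases. The helper name load_docs() is hypothetical and not part of the original script, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not part of the original script): returns a
// SimpleXMLElement, or null if the file is missing or is not well-formed XML.
function load_docs(string $path): ?SimpleXMLElement {
    if (!file_exists($path)) {
        return null;
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $docs = simplexml_load_file($path);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // simplexml_load_file() returns false when the XML cannot be parsed.
    return $docs === false ? null : $docs;
}

$docs = load_docs("webdoc.xml");
if ($docs === null) {
    echo "<b>FILE: webdoc.xml is missing or is not valid XML</b>";
}
```

With a helper like this, the loaded object could be handed to the processing code instead of being loaded there a second time.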

process_xml() Function

The process_xml() function is the heart of the script. Here all the work is done.

function process_xml() {
    // Load the whole document into a SimpleXML object.
    $docs = simplexml_load_file("webdoc.xml");

    foreach ($docs->category as $category) {
        // The category name is an attribute of the category tag.
        echo "<h2>" . $category["name"] . "</h2>";
        echo "<ul>";
        foreach ($category->doc as $doc) {
            echo "<li><a href='" . $doc->link . "'>" . $doc->name . "</a></li>";
        }
        echo "</ul>";
    }
}

We use the simplexml_load_file() function built into PHP to extract the information from the file. The variable $docs becomes a SimpleXML object that we can traverse just like any other object.

If you remember the layout of the XML document, the structure was documents->category->doc. The variable $docs takes the position of documents. In our outermost foreach loop, we process each category. There we extract the name of the category, place it inside h2 tags, and open an unordered list. Recall that the name of the category was an attribute of the category tag. SimpleXML exposes an element's attributes through array-style access, so we read the name as we would any associative array element: $category["name"].
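To see attribute access in isolation, here is a small sketch that parses an inline string instead of webdoc.xml. The sample XML is an assumption: it follows the documents->category->doc layout described above, but the category name and link are made up for illustration.

```php
<?php
// Assumed sample following the documents -> category -> doc layout;
// the category name and link are made up for illustration.
$xml = <<<XML
<documents>
    <category name="PHP">
        <doc>
            <name>PHP Manual</name>
            <link>https://www.php.net/manual/</link>
        </doc>
    </category>
</documents>
XML;

$docs = simplexml_load_string($xml);

foreach ($docs->category as $category) {
    // Array-style access returns a SimpleXMLElement, not a string;
    // cast it when a plain string is needed (comparisons, array keys).
    $name = (string) $category["name"];
    echo $name, "\n";
}
```

The cast is not needed when the value is passed straight to echo or joined with the . operator, because PHP converts it to a string there automatically.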

In the inner foreach loop, we pull the information from each resource in the category, creating a list item and a link. The link and name of the resource are pulled out using the object operator (->). Thus, $doc->link is the string inside the link tags, and $doc->name is the string inside the name tags.
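If a name or link in the XML ever contains characters such as & or a quote, echoing it straight into HTML can break the markup. The sketch below shows one possible way to build the same list item with escaping added. The helper name render_doc() and the sample element are hypothetical, introduced only for this illustration.

```php
<?php
// Hypothetical helper: builds one list item from a <doc> element.
function render_doc(SimpleXMLElement $doc): string {
    // Cast the child elements to strings, then escape them for HTML output.
    $link = htmlspecialchars((string) $doc->link, ENT_QUOTES);
    $name = htmlspecialchars((string) $doc->name, ENT_QUOTES);

    return "<li><a href='" . $link . "'>" . $name . "</a></li>";
}

// Made-up sample <doc> element for illustration.
$doc = simplexml_load_string(
    "<doc><name>Q &amp; A</name><link>https://example.com/?a=1&amp;b=2</link></doc>"
);

echo render_doc($doc), "\n";
// prints: <li><a href='https://example.com/?a=1&amp;b=2'>Q &amp; A</a></li>
```

ENT_QUOTES is used here because the href attribute is wrapped in single quotes, so a single quote inside a link must be escaped as well.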

We end each pass of the outer foreach loop with a closing unordered list tag. The end result is a set of headings with a list of resources under each, which we can format with CSS. If we need to add a new resource to our list, there is no need to touch the HTML and formatting; we just add it to our XML document.

