XML_PullParser
A token-based interface to the PHP expat XML library
version 1.3.2
Myron Turner
Examples of Coding Namespace Support for XML_PullParser

XML_PullParser_setAttrLoop
The first example shows how to use the XML_PullParser_setAttrLoop family of methods with XML_PullParser_getAttrVal. As a reminder, the format of the array returned by XML_PullParser_nextAttr is as follows:

 Array
  (
   [0] => ELEMENT_NAME
   [1] => ATTRIBUTES_ARRAY
       (
           [KEY_1] => VALUE_1
           [KEY_2] => VALUE_2
       )
   [2] => ELEMENT CDATA OR  ""
  )

For each element found having an attribute, XML_PullParser_setAttrLoop creates one three-element indexed array with the structure shown above.

The AttrLoop family of methods is used to create arrays of elements that have attributes. One of the side-effects of XML_PullParser_NS is that every element now has at least one internally coded attribute, named _ns_, which holds the namespace for that element. The practical effect of this is to open up added possibilities for XML_PullParser_setAttrLoop_elcd, because it will now always include all the elements found in the target token in its indexed array.

The target token is, by default, either the $current_element or the current token (see note 1), or else an optional tokenized array passed in as a parameter. If the target token is the one returned by XML_PullParser_getToken, as in movies-8.php listed below, all attributes and character data of potential interest are stored in the attribute loop array. Because XML_PullParser_NS can be used whether or not a document contains namespaces, this class can be used for any document where XML_PullParser_setAttrLoop_elcd would provide a clean interface to the document data.
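
As a minimal sketch of what consuming such entries might look like, the following plain PHP walks a hand-written array in the three-element shape described above and separates the internal _ns_ attribute from the document's own attributes. It does not call XML_PullParser; the sample values, the key spellings, and the helper describe_entry are assumptions made for illustration only.

```php
<?php
// Hypothetical entries in the [element name, attributes, character data] shape.
// The values and key spellings are invented for this sketch.
$entries = array(
    array('DATE', array('_ns_' => 'http://room535.org/movies/dates/', 'DAY' => '25', 'MONTH' => 'Apr'), '1939'),
    array('LEADING_LADY', array('_ns_' => 'http://room535.org/movies/star/'), 'Vivien Leigh'),
);

// Hypothetical helper: summarise one entry on a single line.
function describe_entry(array $entry)
{
    list($name, $attrs, $cdata) = $entry;
    $ns = isset($attrs['_ns_']) ? $attrs['_ns_'] : '(none)';
    unset($attrs['_ns_']);   // what remains are the document's own attributes
    $pairs = array();
    foreach ($attrs as $key => $value) {
        $pairs[] = "$key=$value";
    }
    return trim("$name ns=$ns text=$cdata " . implode(' ', $pairs));
}

foreach ($entries as $entry) {
    echo describe_entry($entry), "\n";
}
```

An element that carries only the _ns_ attribute, such as the second entry here, still produces a line, which is the point made above about every element now appearing in the loop.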

movies-8.php

while($token = $parser->XML_PullParser_getToken()) {

    // since XML_PullParser_getElement has not been called
    // XML_PullParser_setAttrLoop_elcd will use $token
    $attr_vals = $parser->XML_PullParser_setAttrLoop_elcd();
    while($at = $parser->XML_PullParser_nextAttr()) {
        if($at[2]) {
            echo "$at[0]: $at[2]\n";
        }
        foreach($at[1] as $attr_name => $attr_value) {
            $name = "";
            if(preg_match('/DAY/i',$attr_name)) {
                $name = "day";
            }
            if(preg_match('/month/i',$attr_name)) {
                $name = "month";
            }

            if($name) {
                echo "$name:  " . $parser->XML_PullParser_getAttrVal($name, $at[1]) . "\n";
            }
            if($at[0] == 'LEADING_MAN') {
                echo "\n";
            }
        }

    }
}

/*
Result

DATE: 1939
day:  25
month:  Apr
LEADING_LADY: Vivien Leigh
LEADING_MAN: Clark Gable

DATE: 1941
day:
month:
LEADING_LADY: Maureen O'Hara
LEADING_MAN: Walter Pidgeon

DATE: 1993
day:  15
month:  June
LEADING_LADY: Laura Dern

*/


XML_PullParser_getAttrValues
XML_PullParser_getAttrValues takes one parameter, a one-element associative array in which the key is the name of an XML child element carrying any number of attributes, and the value is either the name of its parent element (a string) or the tokenized array that is its parent:
array($child=>$parent)
Its main use is where there is more than one element of the same name in a token. It returns a numerically indexed array of the attributes found in each element:


	Array
	(
	    [0] => Array
	        (
	           [KEY_1] => VALUE_1
	           [KEY_2] => VALUE_2
		)

	    [1] => Array
	        (
	           [KEY_1] => VALUE_1A
	           [KEY_2] => VALUE_2A

	        )

	    [2] => Array
	        (
	           [KEY_1] => VALUE_1B
	           [KEY_2] => VALUE_2B
	        )

	)
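
To make the shape concrete, here is a small sketch in plain PHP that pulls one attribute out of every row of such a numerically indexed array. It does not call XML_PullParser; the sample rows, the lower-case key spellings, and the helper collect_attr are assumptions for illustration.

```php
<?php
// Hypothetical rows, one per same-named element, in the shape shown above.
$rows = array(
    array('day' => '25', 'month' => 'Apr'),
    array('day' => '2',  'month' => 'May'),
    array('month' => 'June'),            // an element lacking the "day" attribute
);

// Hypothetical helper: collect one attribute from every row,
// using null where a row does not carry that attribute.
function collect_attr(array $rows, $key)
{
    $out = array();
    foreach ($rows as $i => $attrs) {
        $out[$i] = array_key_exists($key, $attrs) ? $attrs[$key] : null;
    }
    return $out;
}

print_r(collect_attr($rows, 'month'));
print_r(collect_attr($rows, 'day'));
```

Keeping the row index in the result makes it possible to tell which of the same-named elements a value came from.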

Here is a snippet of code which uses XML_PullParser_getAttrValues.

       $title = $parser->XML_PullParser_getText('title');
       echo "Title: $title\n";
       $attr_vals = $parser->XML_PullParser_getAttrValues(array('date'=>$token));
       echo "Month:  " . $parser->XML_PullParser_getAttrVal('month', $attr_vals[0]) . "\n";
       echo "Day:  " . $parser->XML_PullParser_getAttrVal('day', $attr_vals[0]) . "\n";

/*
Result

Title: Gone With The wind
Month:  Apr
Day:  25

*/


Switching Between Namespace Definitions
XML_PullParser_setCurrentNS is a class method and can be called at any time during the processing of the XML document. This makes it possible, for instance, to change the namespace definition so as to extract only those elements belonging to a particular namespace and then to switch back to a previous definition. (See the earlier section on namespaces and the $which parameter.)
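
One way to keep several definitions at hand for switching is to build them up front. The sketch below assumes the "|"-separated URI format that the sample script in this section passes to XML_PullParser_setCurrentNS; the helper ns_definition and the variable names are hypothetical, and no parser object is created here.

```php
<?php
// Hypothetical helper: join namespace URIs into one "|"-separated definition string.
function ns_definition(array $uris)
{
    return implode('|', $uris);
}

// A broad definition covering four of the namespaces used in Movies.xml.
$all = ns_definition(array(
    'http://room535.org/movies/title/',
    'http://room535.org/movies/mov/',
    'http://room535.org/movies/star/',
    'http://room535.org/movies/dates/',
));

// A narrow definition covering a single namespace.
$stars_only = ns_definition(array('http://room535.org/movies/star/'));

echo $all, "\n", $stars_only, "\n";
```

With a parser object available, one would presumably pass $stars_only to XML_PullParser_setCurrentNS to extract only the star elements, and then pass $all to return to the earlier definition.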


Sample Script
Below is an example script using XML_PullParser with namespace support, and the resulting output. The coding is exactly as it would appear if we were using XML_PullParser instead of XML_PullParser_NS, with only two differences:
  1. we call XML_PullParser_NamespaceSupport(true) before calling the constructor;
  2. we call XML_PullParser_setCurrentNS in order to set up the namespace definition.
The XML document is "Movies.xml", which is included in the Listings directory of this distribution and reprinted for convenience below.

movies-7.php

	$tags = array("Movie");
	$child_tags = array();

	XML_PullParser_NamespaceSupport(true);
	$parser = new XML_PullParser("Movies.xml", $tags,$child_tags);

	$parser->XML_PullParser_setCurrentNS("http://room535.org/movies/title/|"
	  . "http://room535.org/movies/mov/|http://room535.org/movies/star/|"
          . "http://room535.org/movies/dates/");


	while($token = $parser->XML_PullParser_getToken()) {

	       $title = $parser->XML_PullParser_getText('title');
	       $leading_man = $parser->XML_PullParser_getText('leading_man');
	       $leading_lady = $parser->XML_PullParser_getText('leading_lady');

	       $year = $parser->XML_PullParser_getText('date');

	       $attr_array = $parser->XML_PullParser_getAttributes('date');
	       $month = $parser->XML_PullParser_getAttrVal('month',$attr_array);
	       $day = $parser->XML_PullParser_getAttrVal('day',$attr_array);

	       echo "Title: $title\n";
	       echo "Date: $month $day $year\n";
	       echo "Leading Lady: $leading_lady\n";
	       if($leading_man) {
	        echo "Leading Man: $leading_man\n";
	       }
	       echo "\n\n";

	}

	echo "\n </pre>\n";


	/*
	Result

	Title: Gone With The wind
	Date: Apr 25 1939
	Leading Lady: Vivien Leigh
	Leading Man: Clark Gable


	Title: How Green Was My Valley
	Date:   1941
	Leading Lady: Maureen O'Hara
	Leading Man: Walter Pidgeon


	Title: Jurassic Park
	Date: June 15 1993
	Leading Lady: Laura Dern

	*/

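The default namespace is "http://fedora.gemini.ca/local/". It has not been included in the namespace definition. Consequently, leading_man is not returned for "Jurassic Park": the element has no prefix, so it falls in the default namespace rather than in one of the listed namespaces. The month and day of "How Green Was My Valley" are not returned for a similar reason: those attributes are unprefixed, so they match none of the listed namespaces.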
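
The way an unprefixed element picks up the default namespace can be checked independently of XML_PullParser. The sketch below uses PHP's built-in expat functions directly (xml_parser_create_ns and xml_set_element_handler) on a cut-down copy of the third Movie; it is not a description of how the class works internally, and the variable names are made up for the example.

```php
<?php
// Cut-down copy of the third <Movie> from Movies.xml.
$xml = <<<'XML'
<Movies xmlns="http://fedora.gemini.ca/local/"
        xmlns:star="http://room535.org/movies/star/">
  <Movie>
    <star:leading_lady>Laura Dern</star:leading_lady>
    <leading_man>Sam Neil</leading_man>
  </Movie>
</Movies>
XML;

$namespace_of = array();   // local element name => namespace URI

// Namespace-aware parser; "|" separates the URI from the local name.
$p = xml_parser_create_ns('UTF-8', '|');
xml_parser_set_option($p, XML_OPTION_CASE_FOLDING, 0);   // keep names as written

xml_set_element_handler(
    $p,
    function ($parser, $name, $attrs) use (&$namespace_of) {
        // Split "URI|local" into its two parts; a name with no URI has no "|".
        $pos   = strrpos($name, '|');
        $uri   = ($pos === false) ? '' : substr($name, 0, $pos);
        $local = ($pos === false) ? $name : substr($name, $pos + 1);
        $namespace_of[$local] = $uri;
    },
    function ($parser, $name) {
        // nothing to do on end tags
    }
);
xml_parse($p, $xml, true);

foreach ($namespace_of as $local => $uri) {
    echo "$local => $uri\n";
}
```

Here leading_man is reported under the default namespace URI while leading_lady is reported under the star URI, which is why a definition that omits the default namespace leaves leading_man out.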


Movies.xml
Movies.xml is the file which is used for many of the examples in the manual. It will be found in the listings directory of this distribution.

Movies.xml

 <Movies
 xmlns = "http://fedora.gemini.ca/local/"
 xmlns:mov = "http://room535.org/movies/mov/"
 xmlns:star = "http://room535.org/movies/star/"
 xmlns:title = "http://room535.org/movies/title/"
 xmlns:date = "http://room535.org/movies/dates/">
  <Movie>
     <title:Title>Gone With The wind </title:Title>
     <date:date date:day="25" date:month="Apr">1939 </date:date>
     <star:leading_lady>Vivien Leigh </star:leading_lady>
     <star:leading_man>Clark Gable </star:leading_man>
  </Movie>

   <mov:Movie>
     <title:Title>How Green Was My Valley </title:Title>
     <date:date day = "2" month="May">1941 </date:date>
     <star:leading_lady>Maureen O'Hara </star:leading_lady>
     <star:leading_man>Walter Pidgeon </star:leading_man>
  </mov:Movie>

  <Movie>
  <title:Title>Jurassic Park </title:Title>
     <date:date date:day="15" date:month="June">1993 </date:date>
     <star:leading_lady>Laura Dern </star:leading_lady>
     <leading_man>Sam Neil </leading_man>
  </Movie>
 </Movies>

Notes
1. $current_element is created by XML_PullParser_getElement and the current token by XML_PullParser_getToken.