XML_PullParser
A token-based interface to the PHP expat XML library
version 1.3.2
Myron Turner
Namespace Support

Starting with version 1.3.0, namespace support is built into XML_PullParser. It is invoked by calling the package-level function XML_PullParser_NamespaceSupport with a true value before creating a new instance of XML_PullParser:
XML_PullParser_NamespaceSupport(true);
$parser = new XML_PullParser($file, $tags, $child_tags);
For backward compatibility with versions prior to 1.3.0, two stub class files are provided:
XML_PullParser_NS.inc
XML_PullParser_NS_doc.inc
These files call XML_PullParser_NamespaceSupport, so it is not necessary to call it in the script. They are used as follows:
require_once "XML_PullParser_NS.inc";
$parser = new XML_PullParser_NS($file, $tags, $child_tags);
require_once "XML_PullParser_NS_doc.inc";
$parser = new XML_PullParser_NS_doc($doc, $tags, $child_tags);


Namespace Methods and Functions
Methods
  1. mixed XML_PullParser_setCurrentNS (string $ns)
    Used to create the current namespace definition
  2. mixed XML_PullParser_unsetCurrentNS ()
    Sets the current namespace definition to NULL
  3. boolean _is_current_NS (array $ns_array)
    Used to test whether an element or an attribute falls within the current namespace definition. This is primarily an internal method but can be used as described below.
  4. string XML_PullParser_getAttr_NS (string $name, array $attr_array)
    Gets the value of an attribute if it falls within the current namespace definition
  5. string XML_PullParser_getNS_URI (mixed $str, [string $name=Null])
    Extracts the namespace URI from the internally constructed attribute name.
  6. string XML_PullParser_getNS_AttrName (string $str)
    Extracts the unqualified attribute name from the internally constructed attribute name.
Package Level Function(s)
  1. void XML_PullParser_Disable_NS_Prefixes (mixed $bool)
    This function controls whether namespace prefixes are removed from element and attribute names or left in place. If this function is called with a value of TRUE, prefixes will be removed from all element and attribute names; otherwise, they will remain in place.

    When prefixes remain in place, they are considered to be part of the names. Therefore, dns:server, uri:server, and server are all treated as separate and distinct names. This is the default behavior and is consistent with releases prior to 1.3.1, when this function was added.

    This function has meaning only when namespace support has not been invoked, since prefixes are replaced by their URIs when namespace support is in effect.
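The effect of keeping or removing prefixes can be pictured with a small standalone sketch. The helper strip_ns_prefix() is hypothetical and is not part of XML_PullParser; it only illustrates how three names that are distinct with their prefixes in place collapse into one once the prefixes are removed.

```php
<?php
// Hypothetical helper, not part of XML_PullParser: drops everything up to
// and including the first colon of a qualified name.
function strip_ns_prefix(string $name): string
{
    $pos = strpos($name, ':');
    return $pos === false ? $name : substr($name, $pos + 1);
}

$names = array('dns:server', 'uri:server', 'server');

// Prefixes left in place: three separate names.
$kept = array_unique($names);

// Prefixes removed: all three collapse to the single name "server".
$stripped = array_unique(array_map('strip_ns_prefix', $names));

echo count($kept) . " distinct names with prefixes\n";
echo count($stripped) . " distinct name without prefixes\n";
```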


1. XML_PullParser_setCurrentNS
The most important of the new methods is XML_PullParser_setCurrentNS(), which creates the current namespace definition. In its simplest form, it takes a single namespace:
$parser->XML_PullParser_setCurrentNS('http://example.com/doc/def/');
It can also take multiple namespaces. These are still passed as a single string, with the namespace URIs separated from one another by a vertical bar:
$ns = "http://example.com/doc/def/|http://my_site.com/movies/|"
    . "http://my_site.com/movies/title/";
$parser->XML_PullParser_setCurrentNS($ns);
If successful, this method returns the previously defined namespace if there was one, or TRUE if there was no previous namespace definition. If not successful, it returns FALSE.
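The bar-separated form and the return values described above can be sketched in a few lines. The class NamespaceDefinition and its method names are hypothetical stand-ins, not XML_PullParser's own code. Which inputs make the real method fail is not stated here, so the sketch assumes that a string containing no URI counts as a failure.

```php
<?php
// Hypothetical stand-in for the parser's namespace bookkeeping.
class NamespaceDefinition
{
    /** @var string|null the current definition, or null if none is set */
    private $current = null;

    // Returns the previous definition, true if there was none,
    // or false if $ns is unusable (assumption: no URI in the string).
    public function set($ns)
    {
        if (!is_string($ns) || self::split($ns) === array()) {
            return false;
        }
        $previous = $this->current;
        $this->current = $ns;
        return $previous === null ? true : $previous;
    }

    // Splits "uri1|uri2|uri3" into a list of URIs, ignoring empty pieces.
    public static function split($ns)
    {
        $parts = array_map('trim', explode('|', $ns));
        return array_values(array_filter($parts, 'strlen'));
    }
}

$def = new NamespaceDefinition();
$first = $def->set('http://example.com/doc/def/');
$second = $def->set('http://example.com/doc/def/|http://my_site.com/movies/');

var_dump($first);         // no previous definition
var_dump($second);        // the previous definition string
var_dump($def->set(''));  // nothing usable in the string
```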

Only those namespaces which have been defined by XML_PullParser_setCurrentNS() will be recognized. Let's look at an example:

    <Movies
        xmlns="http://fedora.gemini.ca/local/"
        xmlns:mov="http://room535.org/movies/mov/"
        xmlns:star="http://room535.org/movies/star/"
        xmlns:title="http://room535.org/movies/title/"
        xmlns:date="http://room535.org/movies/dates/"
    >
      <Movie>
         <title:Title>Gone With The wind</title:Title>
         <date:date date:day="25" date:month="Apr">1939</date:date>
         <star:leading_lady>Vivien Leigh</star:leading_lady>
         <leading_man>Clark Gable</leading_man>
      </Movie>
    </Movies>

Let's assume that the namespace definition is the following:
$parser->XML_PullParser_setCurrentNS("http://room535.org/movies/title/|"
    . "http://room535.org/movies/mov/|http://room535.org/movies/star/|"
    . "http://room535.org/movies/dates/");
XML_PullParser would locate all of the elements and attributes, except for
<leading_man>Clark Gable</leading_man>
which has no namespace prefix assigned to it. Its namespace is the default namespace:
xmlns = "http://fedora.gemini.ca/local/"
But the default namespace has not been included in the current namespace definition.

If we were to include the default namespace in the definition, then XML_PullParser would locate leading_man, even though it does not have a namespace prefix. This is because the default namespace applies to all elements which do not have prefixes attached to them.
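Because XML_PullParser sits on top of PHP's expat interface, the way a default namespace reaches unprefixed elements can be observed with the underlying xml extension alone. The sketch below does not use XML_PullParser, and XML_PullParser's own handling may differ in detail. It calls PHP's xml_parser_create_ns() on a trimmed-down version of the Movies document and records the namespace-qualified name reported for each element. The '|' separator is an arbitrary choice made for this example.

```php
<?php
$xml = <<<XML
<Movies xmlns="http://fedora.gemini.ca/local/"
        xmlns:star="http://room535.org/movies/star/">
  <Movie>
    <star:leading_lady>Vivien Leigh</star:leading_lady>
    <leading_man>Clark Gable</leading_man>
  </Movie>
</Movies>
XML;

$seen = array();

// Namespace-aware parser: element names arrive as "URI|local-name".
$p = xml_parser_create_ns('UTF-8', '|');
// Keep names as written instead of upper-casing them.
xml_parser_set_option($p, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler(
    $p,
    function ($parser, $name, $attrs) use (&$seen) { $seen[] = $name; },
    function ($parser, $name) {}
);
xml_parse($p, $xml, true);

// The unprefixed <leading_man> is reported with the default namespace URI.
foreach ($seen as $name) {
    echo $name, "\n";
}
```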


2. XML_PullParser_unsetCurrentNS
Calling XML_PullParser_unsetCurrentNS unsets the current namespace definition. If there is no namespace definition, then XML_PullParser ignores namespaces and behaves exactly as it would if there were no namespaces. If successful, this method returns the previously defined namespace if there was one, or TRUE if there was no previous namespace definition.


3. _is_current_NS
This method is intended for internal use; it returns true if an element or attribute falls within the current namespace definition. It can also be used by the programmer to determine whether an attribute resides within the current namespace definition. To do so, extract the namespace URI from the attribute's name with XML_PullParser_getNS_URI() and use it as the key in an associative array of the form:
URI=>attribute-value
This array is then passed into _is_current_NS() as a parameter:

     $name = $parser->XML_PullParser_getNS_URI($name);
     if($parser->_is_current_NS(array($name=>$value)) ) {
         // the attribute is in the current namespace definition
     }


4. XML_PullParser_getAttr_NS
This method gets the value of an attribute if it falls within the current namespace definition. If the attribute is not within the namespace definition, then this method returns NULL.
Note: NULL doesn't mean that the attribute has no value, only that it does not have a namespace which has been defined by a call to XML_PullParser_setCurrentNS.

Note: This method was designed primarily for internal use but may be applicable in some scripting situations. In most situations, XML_PullParser_getAttrVal should be used to get attribute values.

This method takes two parameters:

  1. string $name
    a string, which is the name of the attribute without its namespace qualification
  2. array $attr_array
    an associative array consisting of the attribute's name and value, formed as follows: attribute-name=>attribute-value.

attribute-name is the name supplied by XML_PullParser, which is an internally constructed key. The keys can be derived from the arrays returned by one of the following:
XML_PullParser_getAttributes
XML_PullParser_nextAttr
XML_PullParser_getAttrValues
An example of its use is as follows:

    $attr_array = $parser->XML_PullParser_getAttributes('date');
    foreach($attr_array as $name=>$value) {
        $name = $parser->XML_PullParser_getNS_URI($name);
        if($parser->_is_current_NS(array($name=>$value)) ) {
            echo "$name=>$value is in current namespace\n";
        }
    }

Examples of how to use XML_PullParser_nextAttr and XML_PullParser_getAttrValues will be found in the next section: Coding Namespace Support for XML_PullParser.


5. XML_PullParser_getNS_URI
This method searches for the namespace URI of either an attribute or an element. If a namespace is found, it returns the namespace as a string. If a namespace is not found, it returns NULL.

All attributes are held in associative arrays. When namespace support is not requested, the attribute names serve as keys which point to the attribute values. For instance:
[date]=>1945
If an attribute is assigned to a namespace, a key is created from the attribute name and the namespace URI. This method extracts the namespace URI from the internally constructed key. In the case of elements, the namespace is treated internally as a specially constructed attribute, and this method queries that attribute for the namespace assigned to the element. See the namespace example in the appendix for more detail on how namespaces are treated.

XML_PullParser_getNS_URI takes two parameters:
  1. mixed $str
    the internally constructed attribute name (string) or an attribute array (see note 1)
  2. string $name (optional)
    the name of the attribute without its namespace qualification

1. If the parameter $str is a string, this method assumes that it is the internal name of an attribute.
2. If the parameter is an array and $name is not specified, it assumes that the element's own namespace is being sought.
3. If the parameter is an array and $name is specified, it looks for the attribute of that $name.


6. XML_PullParser_getNS_AttrName
This method extracts the unqualified attribute name from the name which is created internally for all namespace-qualified attributes. Its single parameter is a string holding this internally constructed name. It returns the unqualified attribute name, i.e. without the namespace or the namespace prefix prepended. The namespace prefix is the namespace identifier that is prefixed to attributes and elements assigned to a namespace, as in:
identifier:element_name
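The exact layout of XML_PullParser's internally constructed names is not spelled out in this section. As a rough illustration only, the sketch below assumes a key made of the namespace URI, a separator character, and the local name, which is the layout PHP's xml_parser_create_ns() uses for qualified names. The functions key_ns_uri() and key_local_name() and the '|' separator are hypothetical and stand in for the roles of XML_PullParser_getNS_URI and XML_PullParser_getNS_AttrName.

```php
<?php
// Assumed key layout: "<namespace URI>|<local name>". This is an
// illustration, not XML_PullParser's documented internal format.
const KEY_SEPARATOR = '|';

// Returns the URI part of a key, or null if the key is unqualified.
function key_ns_uri(string $key)
{
    $pos = strrpos($key, KEY_SEPARATOR);
    return $pos === false ? null : substr($key, 0, $pos);
}

// Returns the local (unqualified) name part of a key.
function key_local_name(string $key): string
{
    $pos = strrpos($key, KEY_SEPARATOR);
    return $pos === false ? $key : substr($key, $pos + 1);
}

$key = 'http://room535.org/movies/dates/|month';
echo key_ns_uri($key), "\n";      // http://room535.org/movies/dates/
echo key_local_name($key), "\n";  // month
var_dump(key_ns_uri('date'));     // NULL: no namespace in the key
```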


Attributes and Namespaces
Because XML_PullParser uses internally constructed keys for attributes assigned to namespaces, attribute names should not be addressed independently of the methods supplied to deal with attributes when namespace support is in effect.

That is, when namespace support is not in effect, it's possible to extract individual attribute names and values from the various arrays which supply attributes, using each and foreach. And where the attribute name is known, it's possible to get its value from $array[$name]. But when working with namespaces, XML_PullParser_getAttrVal should always be used. It has been updated to reflect the namespace code and is called exactly as before:
$parser->XML_PullParser_getAttrVal($attr_name, $attr_array);
$attr_name is the unqualified attribute name, i.e. it does not include either the namespace URI or the namespace prefix. To take an instance from the sample Movies document, if we wanted to get at the month attribute, we would use the following code:

    $date = $parser->XML_PullParser_getElement('date');
    $attr_array = $parser->XML_PullParser_getAttributes($date);   
    echo "Month:  " . $parser->XML_PullParser_getAttrVal('month', $attr_array) . "\n";

 /* 
 Result
 Month: Apr
 */

See the appendix for an example token created with namespace support. It helps to clarify the issues involved in accessing attribute data.


Namespace Agreement
When a request is made for data, XML_PullParser tests for namespace agreement. In the case of attributes, if no agreement is found, then XML_PullParser_getAttrVal returns NULL. In the case of character data, if no agreement is found, then the text methods will skip over the data. Their return values will reflect any absence of data in the ways appropriate to each method.

XML_PullParser uses the method _is_current_NS to determine namespace agreement. So, it is useful to look at how that method works.
  1. If no namespace definition has been set with XML_PullParser_setCurrentNS, or if the current namespace definition has been unset with XML_PullParser_unsetCurrentNS, then _is_current_NS returns True. In effect, all elements and attributes are deemed to be in agreement with the current namespace definition, which is Null. After passing this test, the element or attribute in question is subject to the normal rules and constraints which govern the handling of elements and attributes.
  2. When a namespace definition has been set, this method returns True if the namespace of an element or attribute is found in the current namespace definition. Otherwise, it returns False.
  3. If there is a default namespace, the parser will apply it to all elements which have no explicit namespace prefix, and rule number 2 above will apply to them (see note 2).
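The three rules can be condensed into a short standalone sketch. The function in_current_ns() is hypothetical and only mirrors the rules as stated above; it is not the code of _is_current_NS. It takes the current definition as a bar-separated string (or null when none is set) and the namespace URI already resolved for an element or attribute. Rule 3 is assumed to have been applied before the call, so an unprefixed element arrives carrying the default namespace URI.

```php
<?php
// Hypothetical model of the agreement test described above.
// $definition: bar-separated namespace URIs, or null if none is set.
// $uri: namespace URI of the element or attribute, or null if it has none.
function in_current_ns($definition, $uri): bool
{
    // Rule 1: with no definition, everything is in agreement.
    if ($definition === null) {
        return true;
    }
    // Rule 2: otherwise the URI must be one of the defined namespaces.
    // Assumption: an item with no namespace at all is not in agreement.
    return $uri !== null
        && in_array($uri, explode('|', $definition), true);
}

$definition = 'http://room535.org/movies/title/|http://room535.org/movies/star/';
$default_ns = 'http://fedora.gemini.ca/local/';

var_dump(in_current_ns(null, $default_ns));
var_dump(in_current_ns($definition, 'http://room535.org/movies/star/'));
// Rule 3: an unprefixed element such as <leading_man> carries the default
// namespace, which this definition does not include.
var_dump(in_current_ns($definition, $default_ns));
```

Adding the default namespace URI to the definition string would make the last call return true, which matches the leading_man discussion earlier in this page.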

Below are some examples illustrating namespace agreement. They all refer to the file Movies.xml, from the listings directory of this distribution; it is reprinted for convenience on the next page of this manual.

Namespace Agreement 1
    $parser->XML_PullParser_setCurrentNS("http://room535.org/movies/title/|"
        . "http://room535.org/movies/mov/|http://room535.org/movies/star/|"
        . "http://room535.org/movies/dates/");

    while($token = $parser->XML_PullParser_getToken()) {
        $title = $parser->XML_PullParser_getText('title');
        echo "Title: $title\n";
        $attr_vals = $parser->XML_PullParser_getAttrValues(array('date'=>$token));
        echo "Month:  " . $parser->XML_PullParser_getAttrVal('month', $attr_vals[0]) . "\n";
        echo "Day:  " . $parser->XML_PullParser_getAttrVal('day', $attr_vals[0]) . "\n";
    }

 /*
  Result
	Title: Gone With The wind
	Month:  Apr
	Day:  25
	Title: How Green Was My Valley
	Month:
	Day:
	Title: Jurassic Park
	Month:  June
	Day:  15

 */


Namespaces and the $which Parameter
Namespaces are applied to an XML_PullParser token only when data is requested, that is, only when functions such as XML_PullParser_getText and XML_PullParser_getAttributes are called. The token returned by XML_PullParser_getToken holds the parent element and all of its dependents, just as it would if namespace support were not in effect. Take, for instance, the following snippet:

 
  <ENTRY>
    <dns:server dns:ip="192.168.10.1">example_1.com</dns:server> 
    <dns_2:server dns_2:ip="192.168.10.2">example_2.com</dns_2:server> 
    <dns:server dns:ip="192.168.10.3">example_3.com</dns:server> 
  </ENTRY>
 

XML_PullParser_getToken would return the complete <ENTRY> element, with all three server elements, regardless of the namespace definition created in XML_PullParser_setCurrentNS. Therefore, if the namespace definition included the namespace represented by dns but not the one represented by dns_2, the following call to XML_PullParser_getText would yield Null:

   $parser->XML_PullParser_getElement('server'); 
   $dns_server =  $parser->XML_PullParser_getText('server', 2);


Notes
1. The attribute arrays are derived from the same methods enumerated under XML_PullParser_getAttr_NS above.
2. Default namespaces do not apply to attributes, which must have an explicit prefix to be included in a namespace.