XML_PullParser
A token-based interface to the PHP expat XML library
version 1.3.2
Myron Turner
Attribute Accessors


There are four attribute accessors in XML_PullParser:

  1. array XML_PullParser_getAttributes (mixed $name, [mixed $which = 1], [array $el = ""])
  2. string XML_PullParser_getAttrVal (string $name, array $attr_array)
  3. array XML_PullParser_getAttrValues (array $ar)
  4. array XML_PullParser_nextAttr ()


1. XML_PullParser_getAttributes
2. XML_PullParser_getAttrVal
These two companion methods have been treated elsewhere in the manual. See Introduction to Coding 2: Adding Attributes and Coding Strategies 2: the 'which' parameter. They have also been used in many example listings throughout the manual (see Note 1).


3. XML_PullParser_getAttrValues
The method takes as a parameter an associative array consisting of a single array element of the following form:

array("element_name"=>$parent) OR array("element_name"=>"parent")

The key, which is a string, is the name of one or more child elements that hold the attributes being sought. The value is the parent to these elements, and is either a tokenized array or a string that names the parent. An example will make all this a lot clearer.

Example 1
<ENTRY>
<ipaddress>172.20.19.6  </ipaddress>
<domain> example.com  </domain>
<server ip="192.168.10.1" registrant="example.com"> example_1.com  </server>
<server ip="192.168.10.2"> example_2.com  </server>
<server ip="192.168.10.3"> example_3.com  </server>
<alias> www.example.com  </alias>
</ENTRY>

Let's assume the following method call:

$attr_array = $parser->XML_PullParser_getAttrValues(array("server"=>"Entry"));

The resulting $attr_array would be as follows:

Array
(
    [0] => Array
        (
            [IP] => 192.168.10.1
            [REGISTRANT] => example.com
        )

    [1] => Array
        (
            [IP] => 192.168.10.2
        )

    [2] => Array
        (
            [IP] => 192.168.10.3
        )

)

Each array element of $attr_array holds the attribute data for one of the server elements. Accessing these elements is a simple matter of looping through the array, as in the following code listing:

Listing 19
 1.   $parser = new XML_PullParser_doc($doc,array("Entry"),array());
 2.   while($token = $parser->XML_PullParser_getToken()) {
 3.    //    $attributes = $parser->XML_PullParser_getAttrValues(array("server"=>"Entry"));
 4.        $attributes = $parser->XML_PullParser_getAttrValues(array("server"=>$token));
 5.        foreach($attributes as $attr) {
 6.            foreach($attr as $attr_name => $attr_value) {
 7.                echo "$attr_name => $attr_value\n";
 8.            }
 9.          echo "\n";
10.        }
11.    }

/*
  Result
        IP => 192.168.10.1
        REGISTRANT => example.com

        IP => 192.168.10.2

        IP => 192.168.10.3
*/

Instead of the string-only form of the parameter (shown commented out in line 3), line 4 uses the alternate form: array("server"=>$token). The result shows that all the attributes have been found and that where a server element has more than one attribute, its attributes (here IP and REGISTRANT) are kept together, making it possible to identify the element that has two attributes.
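Because each server element keeps its own attribute set, picking out the element that carries a particular attribute is a matter of testing the keys. The following is a minimal sketch, not part of XML_PullParser: the array is a hand-written copy of the $attr_array shown above, and attr_sets_with() is a hypothetical helper introduced only for illustration.

```php
<?php
// Literal copy of the $attr_array shown for Example 1; no parser is involved here.
$attr_array = array(
    array("IP" => "192.168.10.1", "REGISTRANT" => "example.com"),
    array("IP" => "192.168.10.2"),
    array("IP" => "192.168.10.3"),
);

// Hypothetical helper (not an XML_PullParser method): return the attribute
// sets that contain the attribute $name, keeping their original positions.
function attr_sets_with($attr_array, $name) {
    $found = array();
    foreach ($attr_array as $index => $attrs) {
        if (array_key_exists($name, $attrs)) {
            $found[$index] = $attrs;
        }
    }
    return $found;
}

$registered = attr_sets_with($attr_array, "REGISTRANT");
foreach ($registered as $index => $attrs) {
    echo "server #$index: IP => {$attrs['IP']}, REGISTRANT => {$attrs['REGISTRANT']}\n";
}
// prints: server #0: IP => 192.168.10.1, REGISTRANT => example.com
```

The index that survives the filter (0 here) is the position of the matching server element among its siblings.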


4. XML_PullParser_nextAttr
This method returns the attribute(s) from the next element on the attribute loop stack, which is created using one of the following two methods:

array XML_PullParser_setAttrLoop ([array $el = ""], [boolean $assignText = false])

array XML_PullParser_setAttrLoop_elcd ([array $el = ""])

Both of these take as a parameter an optional tokenized array. If this parameter is not passed in, then they will use the $current_element or, if that's not available, the current token. The second parameter to XML_PullParser_setAttrLoop is for internal use only. Calls to XML_PullParser_setAttrLoop_elcd are passed on to XML_PullParser_setAttrLoop after pre-processing, and the boolean $assignText signals this fact.

XML_PullParser_setAttrLoop captures the name of the element and its attributes. XML_PullParser_setAttrLoop_elcd captures, in addition to name and attributes, any character data assigned to the element, hence the suffix _elcd. Both of these methods create the same data structure, except that in the case of XML_PullParser_setAttrLoop, the field holding the element's character data is set to the empty string. A data unit based on the first server element in Example 1 above would be the following array:

 Array
  (
   [0] => SERVER
   [1] => Array
       (
           [IP] => 192.168.10.1
           [REGISTRANT] => example.com
       )
   [2] => example_1.com  OR  ""
  )
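One way to picture how such data units come about is to build them directly with PHP's xml (expat) extension. This is a rough sketch under stated assumptions and does not reproduce XML_PullParser's internals: collect_attr_units() is a hypothetical function, trimming of the character data is an assumption, and elements without attributes are simply skipped. The uppercase names are a consequence of the extension's case folding, which is enabled by default.

```php
<?php
// Sketch only: builds array(NAME, attributes, text) units with PHP's xml extension.
// collect_attr_units() is a hypothetical helper, not an XML_PullParser method.
function collect_attr_units($xml, $with_text) {
    $units = array();   // finished data units, in document order
    $open  = array();   // stack: index into $units for each open element, or -1

    $parser = xml_parser_create();
    // Case folding is on by default, so names arrive as SERVER, IP, REGISTRANT.
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$units, &$open) {
            if (count($attrs) > 0) {
                $units[] = array($name, $attrs, "");
                $open[]  = count($units) - 1;
            } else {
                $open[]  = -1;   // element without attributes: no unit recorded
            }
        },
        function ($p, $name) use (&$units, &$open, $with_text) {
            $i = array_pop($open);
            if ($i >= 0) {
                // Assumption: surrounding whitespace is trimmed from the text.
                $units[$i][2] = $with_text ? trim($units[$i][2]) : "";
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$units, &$open) {
            $i = end($open);
            if ($i !== false && $i >= 0) {
                $units[$i][2] .= $data;   // text may arrive in several chunks
            }
        }
    );
    if (!xml_parse($parser, $xml, true)) {
        throw new Exception(xml_error_string(xml_get_error_code($parser)));
    }
    return $units;
}

$xml = <<<XML
<ENTRY>
<ipaddress>172.20.19.6  </ipaddress>
<server ip="192.168.10.1" registrant="example.com"> example_1.com  </server>
<server ip="192.168.10.2"> example_2.com  </server>
</ENTRY>
XML;

print_r(collect_attr_units($xml, true));   // text field filled, as with _elcd
print_r(collect_attr_units($xml, false));  // text field left as ""
```

The second argument plays the role that the choice between the two methods plays in the manual: with false, the third field of every unit stays an empty string.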

The complete data structure is a numerically indexed array of these arrays, so it can be traversed in a loop that peels off one data unit with each iteration. In effect, that's what XML_PullParser_nextAttr does: it returns the next data unit and updates an internal index. When it reaches the end of the array, it returns a False value, so a loop that tests the return value comes to an end at that point. The internal index can be reset to zero by calling:

void XML_PullParser_resetAttrLoopPtr ()
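The interplay between the internal index, the False return value and the reset call can be modelled in a few lines. The class below is a hypothetical stand-in written for illustration; its name, properties and methods are assumptions and are not taken from the XML_PullParser source.

```php
<?php
// Hypothetical model of the attribute loop; names are illustrative only.
class AttrLoopModel {
    private $units;      // numerically indexed array of data units
    private $index = 0;  // internal pointer advanced by next()

    public function __construct($units) {
        $this->units = array_values($units);
    }

    // Return the next data unit, or false once the array is exhausted.
    public function next() {
        if ($this->index >= count($this->units)) {
            return false;
        }
        return $this->units[$this->index++];
    }

    // Plays the role of XML_PullParser_resetAttrLoopPtr: start again from zero.
    public function reset() {
        $this->index = 0;
    }
}

$loop = new AttrLoopModel(array(
    array("SERVER", array("IP" => "192.168.10.1", "REGISTRANT" => "example.com"), ""),
    array("SERVER", array("IP" => "192.168.10.2"), ""),
    array("SERVER", array("IP" => "192.168.10.3"), ""),
));

$first = $loop->next();              // first unit, taken outside any loop
while ($unit = $loop->next()) {      // continues from the second unit
    echo $unit[0], ": IP => ", $unit[1]["IP"], "\n";
}
$loop->reset();                      // the next call returns the first unit again
```

Because the object remembers its position, the caller can take one unit, do other work, and later resume without keeping a counter of its own.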

Perhaps the most distinct advantage of using XML_PullParser_nextAttr is that it does keep its own internal index. Therefore, it is always guaranteed to return the next data unit from the array. This could be useful when it is not being used in a loop and in situations where it is inconvenient to keep track of the current index in one's own code. Following is a sample listing that uses Example 1 above.

Listing 20
 1.   $parser = new XML_PullParser_doc($doc,$tags,$child_tags);
 2.
 3.   while($token = $parser->XML_PullParser_getToken())
 4.    {
 5.       $parser->XML_PullParser_getElement('server');
 6.       $parser->XML_PullParser_setAttrLoop();
 7.
 8.       while($attr = $parser->XML_PullParser_nextAttr()) {
 9.           foreach($attr[1] as $attr_name => $attr_value) {
10.                echo "$attr[0]: $attr_name => $attr_value\n";
11.            }
12.        echo "\n";
13.        }
14.    }

/*
    Result
        SERVER: IP => 192.168.10.1
        SERVER: REGISTRANT => example.com

        SERVER: IP => 192.168.10.2

        SERVER: IP => 192.168.10.3
*/

Line 5 calls XML_PullParser_getElement requesting the server elements. When XML_PullParser_setAttrLoop is called it finds the $current_element, in which XML_PullParser_getElement has stored the servers, and uses that for its search. Had it not found the $current_element, it would have used the current token. The result would have been the same, because none of the other elements have attributes. But let's assume the ipaddress element had this form:

          <ipaddress type="primary">172.20.19.6  </ipaddress>

In this case, if the current token were used, there would be an additional data unit in the array and the Result would reflect this:

Array
        (
            [0] => IPADDRESS
            [1] => Array
                (
                    [TYPE] => primary
                )

            [2] =>
        )

/*
  Result
        IPADDRESS: TYPE => primary

        SERVER: IP => 192.168.10.1
        SERVER: REGISTRANT => example.com

        SERVER: IP => 192.168.10.2

        SERVER: IP => 192.168.10.3
*/

Finally, had we called XML_PullParser_setAttrLoop_elcd in line 6, our Result would have looked like this, where the data in square brackets is the text which was found in each of the elements:

        IPADDRESS [172.20.19.6]: TYPE => primary

        SERVER [example_1.com]: IP => 192.168.10.1
        SERVER [example_1.com]: REGISTRANT => example.com

        SERVER [example_2.com]: IP => 192.168.10.2

        SERVER [example_3.com]: IP => 192.168.10.3

These methods should prove useful for excavating attributes and locating element data that is identified by attributes with particular names and values.
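As an illustration of locating element data by attribute, the sketch below searches data units of the shape produced by XML_PullParser_setAttrLoop_elcd. The units are written out by hand from Example 1 rather than fetched from a parser, and text_for_attr() is a hypothetical helper, not a method of the class.

```php
<?php
// Data units in the array(NAME, attributes, text) shape described above,
// written out by hand from Example 1 (text shown trimmed, as in the Result).
$units = array(
    array("SERVER", array("IP" => "192.168.10.1", "REGISTRANT" => "example.com"), "example_1.com"),
    array("SERVER", array("IP" => "192.168.10.2"), "example_2.com"),
    array("SERVER", array("IP" => "192.168.10.3"), "example_3.com"),
);

// Hypothetical helper: text of the first element whose attribute $name equals $value.
function text_for_attr($units, $name, $value) {
    foreach ($units as $unit) {
        list($element, $attrs, $text) = $unit;
        if (isset($attrs[$name]) && $attrs[$name] === $value) {
            return $text;
        }
    }
    return false;   // no element carries that attribute/value pair
}

echo text_for_attr($units, "IP", "192.168.10.2"), "\n";   // prints: example_2.com
```

In real use the same test could be applied to each unit as it comes back from XML_PullParser_nextAttr, instead of to a prepared array.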

Notes
1. See code listings: 3, 4, 9, 10, 11, 12, and 13