eXcavator
An XML Query Facility for XML_PullParser
version 1.0.6
Myron Turner
Output Functions

The examples in this manual are based on the XML document in the appendix.


Output Methods
There are four methods for retrieving and outputting the results of a query.
  1. string eXcavator_getResultAsString($html=False, $comment=False)
  2. string eXcavator_getResultAsXMLDoc()
  3. array eXcavator_getResultAsData()
  4. string eXcavator_getFormattedText($which, $element, $pattern)
    See manual page for Formatted Output

While the notion of a "result" may seem obvious, it is important to define it precisely. One of the helper methods described in the "Introduction" is eXcavator_getResultCount, which returns the number of results found by the query. The results of a query are stored internally in an array, with one element for each result. A result is a single instance of the data returned when a condition is placed in double square brackets.

The query 'owner[[CDATA]]' would therefore yield two results, one for each of the two vehicles in the sample XML. Each result gets a separate entry in the internal array, so the array yields a count of two when tested with PHP's count() function, and any method that takes a $which value refers to these entries as zero and one.

The query 'owner:last_name[[CDATA]],first_name[[CDATA]]' would yield four results, two for each vehicle. Their indexing reflects (1) the order in which the vehicles appear in the document and (2) the order of the result brackets:

[0] last_name vehicle 1, [1] first_name vehicle 1, [2] last_name vehicle 2, [3] first_name vehicle 2
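
The indexing rule can be sketched as a small calculation. The helper below is hypothetical (resultIndex is not part of eXcavator) and assumes that every matched vehicle yields exactly one result for each pair of result brackets; if an element were missing from one vehicle, the real indexes could differ.

```php
<?php
// Hypothetical helper, not part of eXcavator: computes the $which index of a
// result, assuming each match yields one result per pair of result brackets.
function resultIndex(int $matchNumber, int $bracketNumber, int $bracketsPerQuery): int
{
    // $matchNumber and $bracketNumber are both zero-based
    return $matchNumber * $bracketsPerQuery + $bracketNumber;
}

// 'owner:last_name[[CDATA]],first_name[[CDATA]]' has two pairs of result brackets
$labels = array();
foreach (array('vehicle 1', 'vehicle 2') as $m => $vehicle) {
    foreach (array('last_name', 'first_name') as $b => $element) {
        $labels[resultIndex($m, $b, 2)] = "$element $vehicle";
    }
}
print_r($labels);
```

Under that assumption, the first_name of the second vehicle is found at resultIndex(1, 1, 2), which is the value to pass as $which.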


eXcavator_getResultAsString($html=False, $comment=False)
This method returns the result as a string formatted as XML, but the XML is not guaranteed to be well-formed, because the output does not necessarily have a root element that a validator would recognize. The results shown on the sample queries page were printed with eXcavator_getResultAsString.

If $html is set to TRUE, then the output will display correctly in a browser. If $comment is set to TRUE, then each result found will be preceded by a comment that contains the number of the result.
A result, as noted above, is the data returned for a condition which is in double brackets, as in this output for sample query 12:


12. owner:last_name[[Jones]], first_name[[CDATA]], street[[CDATA]]
 <!--           -- 1 --         -->
 <LAST_NAME>Jones</LAST_NAME>

 <!--           -- 2 --         -->
 <FIRST_NAME MIDDLE_INIT = "J">Douglas</FIRST_NAME>

 <!--           -- 3 --         -->
 <STREET>200 Winnipegosis Ave</STREET>

The commented output for all the sample queries is available separately.


eXcavator_getResultAsXMLDoc()
This method returns a string that will be recognized as well-formed XML. It prefaces the result with the standard XML declaration and inserts the output into a root element. A result consisting of the two owners' STREET elements, for instance, would appear as follows, which is acceptable, well-formed XML:

 <?xml version = "1.0"?>
 <__root__>

 <STREET>323 Oak Bay</STREET>

 <STREET>200 Winnipegosis Ave</STREET>

 </__root__>
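
The effect of the root element can be checked with any XML parser. The sketch below uses PHP's SimpleXML extension rather than eXcavator or XML_PullParser, and the two strings are typed in by hand from the example output; it only illustrates that the bare results fail to parse as a document while the wrapped version succeeds.

```php
<?php
// Sketch using PHP's SimpleXML extension (not eXcavator) to show why the
// root element matters. The strings are copied by hand from the example.
libxml_use_internal_errors(true);   // collect parse errors instead of printing warnings

$fragment = "<STREET>323 Oak Bay</STREET>\n<STREET>200 Winnipegosis Ave</STREET>";
$wrapped  = "<?xml version = \"1.0\"?>\n<__root__>\n" . $fragment . "\n</__root__>";

$asFragment = simplexml_load_string($fragment);  // fails: two top-level elements
$asDocument = simplexml_load_string($wrapped);   // succeeds: single __root__ element

var_dump($asFragment === false);
echo count($asDocument->STREET), " STREET elements\n";
```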

Among other uses, this output can be fed back into an instance of XML_PullParser and parsed for specialized formatting, for storing in a file, and so on. It's worth looking at an example:

Listing 2
    $eXc = new eXcavator($doc, eXcavator_STRING);
    $eXc->eXcavator_Query('owner[[CDATA]]');
    $result = $eXc->eXcavator_getResultAsXMLDoc();

    // free XML_PullParser's resources, since we are going to create a new XML_PullParser instance
    $eXc->eXcavator_free();

    $child_tags = array("Last_name", "First_name", "Street", "City", "Zip");
    $tags = array("owner");

    $parser = new XML_PullParser_doc($result, $tags, $child_tags);
    while ($token = $parser->XML_PullParser_getToken()) {

        foreach ($child_tags as $tag) {
            outputData($tag, 1);
            if ($tag == "City") {
                // an owner may have a second City element
                outputData($tag, 2);
            }
        }
        echo "\n";
    }

    // print the text of occurrence number $which of $element in the current token
    function outputData($element, $which) {
        global $parser;
        $parser->XML_PullParser_getElement($element);
        $text = $parser->XML_PullParser_getText($element, $which);
        echo "$element: $text\n";
    }

    /*
 Result

Last_name: Taylor
First_name: Michael
Street: 323 Oak Bay
City: Winnipeg
City:
Zip: R3B 1B6

Last_name: Jones
First_name: Douglas
Street: 200 Winnipegosis Ave
City: St Adolphe
City: Winnipeg
Zip: R3L 1Z5
*/


We could achieve the same result using XML_PullParser on its own, but with a slightly more involved inner loop:

Listing 3

$child_tags = array("Last_name", "First_name", "Street", "City", "Zip", "owner");
$tags = array("vehicle");

$parser = new XML_PullParser_doc($doc,$tags,$child_tags);

while ($token = $parser->XML_PullParser_getToken()) {

    $owner = $parser->XML_PullParser_getElement("owner");

    foreach ($child_tags as $el) {

        // if we don't exclude owner, all text in owner will be output in one
        // access to XML_PullParser_getText($child) in the outputData function
        if ($el == 'owner') {
            continue;
        }
        $child = $parser->XML_PullParser_getChild($el, 1);
        outputData($el, $child);
        if ($el == "City") {
            $child = $parser->XML_PullParser_getChild($el, 2);
            outputData($el, $child);
        }
    }
    echo "\n";
}

function outputData($el, $child) {
    global $parser;

    $text = $parser->XML_PullParser_getText($child);
    echo "$el: $text\n";
}

The two versions do not differ greatly in the amount of code. But consider what happens if the query is 'vehicle[color=>green]:owner[[CDATA]]'. To get our result without eXcavator, we would first have to check the color element for "green":

$parser->XML_PullParser_getElement("color");
$color = $parser->XML_PullParser_getText();
if ($color == "green") {
  $parser->XML_PullParser_getElement("owner");
  //  the rest of the code goes here
}

With each added qualification, another check would be required. But the version which uses eXcavator would remain the same, however much additional complexity we added to the query, as long as the conditions in double brackets remain constant:
'vehicle[@year=2004]:color[green]:owner[[CDATA]]'
For database-type structures, eXcavator can offer an efficient and elegant approach to using XML_PullParser.


eXcavator_getResultAsData()
This method returns each result of the query as an associative array which can be passed to a function for further processing. The associative arrays are organized into a numerically indexed array, which stores each result in the order in which it was located. This order reflects both document order and the sequence of the result brackets. The document order has priority, so that if the query asked only for addresses, our example XML would yield four addresses: [0] the first dealer's address, [1] the first owner's address, [2] the second dealer's address, [3] the second owner's address.

The structure of the associative array is fairly straightforward. The data for each XML element is stored under a key of the same name. The only deviation from this scheme occurs when there is more than one XML element of the same name. In these cases each element after the first is numbered, beginning with 1, and its number is set off from the element name by two underscores. Below is the first result for the query: owner:name[[CDATA]]. It has two City elements. The first key is CITY and the second is CITY__1. If there were another City element, it would be CITY__2, etc.

As the example shows, each key in turn stores a two-element associative array, with a cdata key and an attr key. The values of both cdata and attr are strings. The last name would therefore be accessed as $result[0]['LAST_NAME']['cdata'], and the attribute string that holds the middle initial as $result[0]['FIRST_NAME']['attr']. Where there is more than one attribute, the attributes are separated by a semicolon:

[attr] => YEAR = "2004"; MAKE = "Acura"; MODEL = "3.2TL"

The appendix illustrates an entire vehicle array.
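
Because attr is a single string, a caller that wants individual attribute values has to split it. The following is a minimal sketch, not part of eXcavator: parseAttrString is a hypothetical helper, and it assumes the NAME = "value" layout shown above, with double-quoted values that contain no double quotes and attribute names made up only of letters, digits and underscores.

```php
<?php
// Hypothetical helper, not part of eXcavator: splits an attr string such as
// 'YEAR = "2004"; MAKE = "Acura"' into an associative array.
// Assumes values are double-quoted and contain no double quotes themselves.
function parseAttrString(string $attr): array
{
    $attributes = array();
    if (preg_match_all('/(\w+)\s*=\s*"([^"]*)"/', $attr, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $attributes[$m[1]] = $m[2];   // name => value
        }
    }
    return $attributes;
}

$vehicle = parseAttrString('YEAR = "2004"; MAKE = "Acura"; MODEL = "3.2TL"');
echo $vehicle['MAKE'], "\n";   // prints "Acura"
```

With such a helper, the middle initial itself could be read as parseAttrString($result[0]['FIRST_NAME']['attr'])['MIDDLE_INIT'].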


[0] => Array
        (
            [NAME] => Array
                (
                    [cdata] =>
                    [attr] =>
                )

            [LAST_NAME] => Array
                (
                    [cdata] => Taylor
                    [attr] =>
                )

            [FIRST_NAME] => Array
                (
                    [cdata] => Michael
                    [attr] =>  MIDDLE_INIT = "M"
                )

            [ADDRESS] => Array
                (
                    [cdata] =>
                    [attr] =>
                )

            [STREET] => Array
                (
                    [cdata] => 323 Oak Bay
                    [attr] =>
                )

            [APARTMENT] => Array
                (
                    [cdata] =>
                    [attr] =>
                )

            [CITY] => Array
                (
                    [cdata] => Winnipeg
                    [attr] =>
                )

            [CITY__1] => Array
                (
                    [cdata] =>
                )

            [ZIP] => Array
                (
                    [cdata] => R3B 1B6
                    [attr] =>
                )

        )
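
The numbered-key scheme means that repeated elements have to be gathered by probing NAME, NAME__1, NAME__2 and so on. The sketch below shows one way to do that. collectCdata is a hypothetical helper, not an eXcavator method, and the $result array is an abbreviated, hand-typed copy of the result shown above, with empty values assumed to be empty strings.

```php
<?php
// Hypothetical helper, not part of eXcavator: gathers the cdata of every
// element stored under NAME, NAME__1, NAME__2, ... in one result array.
function collectCdata(array $result, string $name): array
{
    $values = array();
    $key = $name;
    for ($n = 1; isset($result[$key]); $n++) {
        $values[] = $result[$key]['cdata'] ?? '';
        $key = $name . '__' . $n;   // next candidate: NAME__1, NAME__2, ...
    }
    return $values;
}

// Abbreviated copy of the result shown above (empty values assumed to be '')
$result = array(
    'CITY'    => array('cdata' => 'Winnipeg', 'attr' => ''),
    'CITY__1' => array('cdata' => ''),
    'ZIP'     => array('cdata' => 'R3B 1B6', 'attr' => ''),
);
print_r(collectCdata($result, 'CITY'));
```

The loop stops at the first missing key, so it relies on the numbering being consecutive, as the description above implies.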


eXcavator_getFormattedText($which, $element, $pattern)
See manual page for Formatted Output