PHP XML Parser

by Vincy. Last modified on July 14th, 2022.

The PHP XML parser performs event-based parsing of XML documents. This extension uses the Expat XML parser library to implement the event-based parser.

Like the other core XML parsers in PHP, it also requires the libxml extension. The XML parser supports the ISO-8859-1, US-ASCII and UTF-8 character encodings.
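
As a minimal sketch of how an encoding might be selected, the snippet below passes one of the supported encoding names to xml_parser_create() and then changes the target encoding with xml_parser_set_option(). The choice of ISO-8859-1 here is only an illustration, not something the rest of this article depends on.

```php
<?php
// Create a parser whose input is expected to be ISO-8859-1 encoded.
$parser = xml_parser_create('ISO-8859-1');

// The target encoding (used for data passed to handlers) starts out
// the same as the encoding given at creation time.
$initialTarget = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);
echo $initialTarget, "\n";

// Ask the parser to hand data to the handlers as UTF-8 instead.
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
$newTarget = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);
echo $newTarget, "\n";
```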

Did you find something annoying here? Yes, the extension itself is named ‘XML Parser’. The name is too generic and is easy to confuse with PHP's other XML parsers.

Just to be sure of what you are reading, you may have a look at the introduction to PHP parsers to get an idea about the parsers available.

This parser is faster than the tree-based XML parsers (SimpleXML and DOM) because it does not load the entire XML document into memory before parsing. Using this extension, we cannot validate XML documents against a DTD or schema. It only checks that the input is well formed, and malformed input causes a parse error.
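
The two points above can be sketched together: the document is fed to the parser piece by piece, and a malformed document makes xml_parse() fail. The helper name parseInChunks() and the chunk size are assumptions made for this illustration. In a real program the chunks would more likely come from reading a file in blocks.

```php
<?php
// Hypothetical helper: feeds $xml to the parser in small chunks.
// Returns null on success, or a description of the first parse error.
function parseInChunks(string $xml, int $chunkSize = 16): ?string
{
    $parser = xml_parser_create();
    $chunks = str_split($xml, $chunkSize);
    $last = count($chunks) - 1;

    foreach ($chunks as $i => $chunk) {
        // The third argument tells the parser whether this is the final chunk.
        if (xml_parse($parser, $chunk, $i === $last) !== 1) {
            $code = xml_get_error_code($parser);
            return sprintf(
                '%s at line %d',
                xml_error_string($code),
                xml_get_current_line_number($parser)
            );
        }
    }
    return null;
}

// Well-formed input parses without error.
var_dump(parseInChunks('<toys><toy><name>Ben 10 Watch</name></toy></toys>'));

// Malformed input (<name> is never closed) yields an error description.
var_dump(parseInChunks('<toys><toy><name>Ben 10 Watch</toy></toys>'));
```

The exact error message depends on the underlying library, so the sketch reports whatever xml_error_string() returns rather than matching a specific text.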

PHP XML Parser Functions

The PHP XML parser extension includes several functions to access XML elements. Some of these functions are:

  • xml_parser_create(), xml_parser_create_ns() – To create an XML parser handle. The xml_parser_create_ns() function creates a parser handle with namespace support.
  • xml_parse(), xml_parse_into_struct() – To parse an XML document. xml_parse_into_struct() is used to convert XML nodes into an array.
  • xml_set_element_handler() – To set the start and end element handlers for the XML parser. There are more setters for other kinds of handlers.
  • xml_get_current_line_number(), xml_get_current_column_number() – To get the current line number and column number respectively.
  • xml_parser_free() – To release the parser handle when it is no longer required.
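
To illustrate the array conversion mentioned in the list, here is a small sketch of xml_parse_into_struct(). The sample XML string is made up for this example. Note that element names are reported in uppercase because the parser applies case folding by default.

```php
<?php
$xml = '<toy><name>Ben 10 Watch</name><type>Battery Toys</type></toy>';

$parser = xml_parser_create();

$values = []; // flat list of nodes in document order
$index = [];  // maps each tag name to its positions in $values
xml_parse_into_struct($parser, $xml, $values, $index);

// Look up the first <name> element through the index array.
$namePosition = $index['NAME'][0];
echo $values[$namePosition]['value'], "\n";
```

Each entry in $values also carries keys such as 'tag', 'type' and 'level', which can be used to rebuild the nesting if needed.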

Handler Functions

In the XML Parser extension, handler functions are defined to be invoked when a particular event occurs. For example, the start and end element handlers of this parser are invoked at the start and end of each element respectively.

A start element handler must accept three parameters: the parser handle, the element name, and an array of the element's attributes. An end element handler must accept the parser handle and the element name.

After defining them, we need to register these element handlers with the XML parser by using xml_set_element_handler().
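
As a rough sketch of these handler signatures, the snippet below registers two closures and records every event they receive. The $events array and the sample XML with an id attribute are assumptions added for illustration only.

```php
<?php
$events = [];

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // Start handler: parser handle, element name, attribute array.
    function ($parser, $name, $attributes) use (&$events) {
        $events[] = "open $name " . json_encode($attributes);
    },
    // End handler: parser handle and element name only.
    function ($parser, $name) use (&$events) {
        $events[] = "close $name";
    }
);

xml_parse($parser, '<toy id="7"><name/></toy>', true);

foreach ($events as $event) {
    echo $event, "\n";
}
```

Closures are used here so the handlers can share state without global variables. Named functions passed as strings work the same way.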

Parsing XML with Start and End Element Handlers

In this example program, we define the onStart() and onEnd() functions as element event handlers. While xml_parse() parses the XML, these handlers display the XML node structure with line numbers using echo.

<?php
$xmlDocument = '<?xml version="1.0"?>
<toys>
<toy>
<name>Ben 10 Watch</name>
<type>Battery Toys</type>
</toy>
<toy>
<name>Angry Birds Gun</name>
<type>Mechanical Toys</type>
</toy>
</toys>';
// Line number of the most recently printed tag, kept as an integer.
$line_number = 0;

function onStart($parser, $name, $attributes)
{
    global $line_number;
    $current = xml_get_current_line_number($parser);
    if ($line_number !== $current) {
        // First tag on a new source line: start a new output line with its number.
        $line_number = $current;
        $output = "<br/>" . $line_number . ":  \t<" . $name . ">";
    } else {
        $output = "<" . $name . ">";
    }
    echo $output;
}

function onEnd($parser, $name)
{
    global $line_number;
    $current = xml_get_current_line_number($parser);
    if ($line_number !== $current) {
        $line_number = $current;
        $output = "<br/>" . $line_number . ":  \t</" . $name . ">";
    } else {
        $output = "</" . $name . ">";
    }
    echo $output;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'onStart', 'onEnd');
// The third argument marks this string as the final (and only) chunk.
if (! xml_parse($parser, $xmlDocument, true)) {
    echo "<br/>Parse Error";
} else {
    echo "<br/>Parsing is done.";
}
?>

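The output of this program is shown below. Element names appear in uppercase because the parser applies case folding by default, and the text inside the elements is not shown because no character data handler is set.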

2: <TOYS>
3: <TOY>
4: <NAME></NAME>
5: <TYPE></TYPE>
6: </TOY>
7: <TOY>
8: <NAME></NAME>
9: <TYPE></TYPE>
10: </TOY>
11: </TOYS>
Parsing is done.
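
Element handlers alone never receive the text between the tags. As a minimal sketch, assuming we only want the content of each name element, a character data handler can be added with xml_set_character_data_handler(). The $names and $buffer variables are introduced here just for illustration.

```php
<?php
$xml = '<toys><toy><name>Ben 10 Watch</name></toy>'
     . '<toy><name>Angry Birds Gun</name></toy></toys>';

$names = [];    // collected text of each <name> element
$buffer = null; // non-null only while the parser is inside a <name> element

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($parser, $tag, $attributes) use (&$buffer) {
        if ($tag === 'NAME') {
            $buffer = ''; // start collecting text
        }
    },
    function ($parser, $tag) use (&$buffer, &$names) {
        if ($tag === 'NAME') {
            $names[] = $buffer;
            $buffer = null; // stop collecting text
        }
    }
);

// Text may arrive in several pieces, so append instead of assigning.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$buffer) {
        if ($buffer !== null) {
            $buffer .= $data;
        }
    }
);

xml_parse($parser, $xml, true);
print_r($names);
```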
