PHP XML Parser

The PHP XML Parser extension performs event-based parsing of XML documents. It implements an Expat-style event-based parser and, like PHP's other core XML extensions, relies on libxml by default. The parser supports the ISO-8859-1, US-ASCII and UTF-8 character encodings.

Did you find something annoying here? Yes, the name of this PHP extension is itself ‘XML Parser’, which is generic enough to be confused with the other parsers. To be sure of what you are reading about, have a look at the introduction to PHP parsers for an overview of the parsers available.

This parser is generally faster than the tree-based XML parsers (SimpleXML and DOM) because it does not load the entire XML document into memory before parsing. The extension cannot validate an XML document against a DTD or schema; it only checks that the document is well formed. If we pass malformed XML, parsing stops with an error.
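As a minimal sketch of how such an error surfaces, the snippet below feeds a deliberately malformed document to the parser and reports where parsing failed. The sample document and the variable names ($brokenXml, $errorMessage) are made up for illustration; the functions used are part of the extension.

```php
<?php
// Deliberately malformed: <name> is closed by </toy>.
$brokenXml = '<toys><toy><name>Ben 10 Watch</toy></toys>';

$parser = xml_parser_create();
// xml_parse() returns a falsy value when the document is not well formed.
$ok = xml_parse($parser, $brokenXml, true);

$errorMessage = null;
if (!$ok) {
    // Translate the numeric error code into text and add the position.
    $errorMessage = sprintf(
        'XML error "%s" at line %d, column %d',
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser),
        xml_get_current_column_number($parser)
    );
    echo $errorMessage . "\n";
}
```

The exact error text depends on the underlying library, so it is better to log it than to compare against it.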

PHP XML Parser Functions

The PHP XML Parser extension includes several functions for working with XML elements. Some of them are:

  • xml_parser_create(), xml_parser_create_ns() – To create an XML parser handle. xml_parser_create_ns() creates a parser handle with namespace support.
  • xml_parse(), xml_parse_into_struct() – To parse an XML document. xml_parse_into_struct() converts the XML nodes into arrays.
  • xml_set_element_handler() – To set the start and end element handlers for the XML parser. There are more setters for other kinds of handlers, such as xml_set_character_data_handler().
  • xml_get_current_line_number(), xml_get_current_column_number() – To get the current line number and column number respectively.
  • xml_parser_free() – To release the parser handle when it is no longer required. From PHP 8.0 onwards the parser is an object and this call has no effect.
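For the array-based route, here is a minimal sketch of xml_parse_into_struct(). The one-line sample document and the names $values, $index and $leaves are illustrative assumptions; case folding is switched off only so that tag names keep their original spelling.

```php
<?php
$xml = '<toys><toy><name>Ben 10 Watch</name><type>Battery Toys</type></toy></toys>';

$parser = xml_parser_create();
// Keep element names as written instead of folding them to uppercase.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

$values = []; // one entry per node, in document order
$index = [];  // tag name => positions of that tag in $values
xml_parse_into_struct($parser, $xml, $values, $index);

// Elements that contain only text are reported with type "complete".
$leaves = [];
foreach ($values as $node) {
    if ($node['type'] === 'complete') {
        $leaves[$node['tag']] = $node['value'];
        echo $node['tag'] . ': ' . $node['value'] . "\n";
    }
}
```

This trades the streaming behaviour of the handlers for convenience, since the whole node list is held in memory.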

Handler Functions

In the XML Parser extension, handler functions are defined to be invoked on particular events. For example, the start and end element handlers are invoked when the parser reaches the start and the end of an element respectively.

A start element handler must accept three parameters: the parser handle, the element name and an array of the element's attributes. An end element handler must accept the parser handle and the element name.

After defining them, we need to register these element handlers with the XML parser by using xml_set_element_handler().
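The handler signatures described above can be sketched as follows. The handler names (recordStart, recordEnd), the $seen array and the sample element with its attributes are hypothetical; only the parameter order comes from the extension.

```php
<?php
$xml = '<toy id="7" brand="Acme"><name>Robot</name></toy>';

// Every event is appended here so the call order can be inspected afterwards.
$seen = [];

// Start handler: parser handle, element name, attribute array.
function recordStart($parser, $name, $attributes) {
    global $seen;
    $seen[] = ['start', $name, $attributes];
}

// End handler: parser handle, element name.
function recordEnd($parser, $name) {
    global $seen;
    $seen[] = ['end', $name];
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'recordStart', 'recordEnd');
xml_parse($parser, $xml, true);

foreach ($seen as $event) {
    echo $event[0] . ' ' . $event[1] . "\n";
}
```

With the default options, case folding is on, so element and attribute names reach the handlers in uppercase.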

Parsing XML with Start and End Element Handlers

In this example program, we define the onStart() and onEnd() functions as element event handlers. While the XML is parsed with xml_parse(), these handlers display the XML node structure with line numbers using echo.

<?php 
$xmlDocument = '<?xml version="1.0"?>
<toys>
<toy>
<name>Ben 10 Watch</name>
<type>Battery Toys</type>
</toy>
<toy>
<name>Angry Birds Gun</name>
<type>Mechanical Toys</type>
</toy>
</toys>';
// Line number of the tag printed most recently. It is kept as an integer
// so that the comparisons in the handlers stay numeric.
$line_number = 0;

// Start element handler: parser handle, element name, attribute array.
function onStart($parser, $name, $attributes) {
    global $line_number;
    $current_line = xml_get_current_line_number($parser);
    if ($line_number !== $current_line) {
        // First tag on a new source line: begin a new numbered output line.
        $line_number = $current_line;
        $output = "<br/>" . $line_number . ":  \t<" . $name . ">";
    } else {
        $output = "<" . $name . ">";
    }
    echo $output;
}

// End element handler: parser handle, element name.
function onEnd($parser, $name) {
    global $line_number;
    $current_line = xml_get_current_line_number($parser);
    if ($line_number !== $current_line) {
        $line_number = $current_line;
        $output = "<br/>" . $line_number . ":  \t</" . $name . ">";
    } else {
        $output = "</" . $name . ">";
    }
    echo $output;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'onStart', 'onEnd');
if (!xml_parse($parser, $xmlDocument, true)) {
    echo "<br/>Parse Error";
} else {
    echo "<br/>Parsing is done.";
}
?>

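The output of this program is shown below. Element names appear in uppercase because the parser folds case by default, and the text inside the elements is not printed because no character data handler is registered.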

2: <TOYS>
3: <TOY>
4: <NAME></NAME>
5: <TYPE></TYPE>
6: </TOY>
7: <TOY>
8: <NAME></NAME>
9: <TYPE></TYPE>
10: </TOY>
11: </TOYS>
Parsing is done.
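The elements in this output are empty because the example registers no handler for text. As a rough sketch of how the text could be collected, the snippet below adds a character data handler with xml_set_character_data_handler(). The handler names (onOpen, onClose, onText), the $currentTag and $texts variables and the shortened one-line document are assumptions made for this illustration.

```php
<?php
$xml = '<toys><toy><name>Ben 10 Watch</name><type>Battery Toys</type></toy></toys>';

$currentTag = ''; // name of the element currently open, or '' between elements
$texts = [];      // element name => collected text

function onOpen($parser, $name, $attributes) {
    global $currentTag;
    $currentTag = $name;
}

function onClose($parser, $name) {
    global $currentTag;
    $currentTag = '';
}

// Text may be delivered in several chunks, so append instead of assigning.
function onText($parser, $data) {
    global $currentTag, $texts;
    if ($currentTag !== '') {
        $texts[$currentTag] = ($texts[$currentTag] ?? '') . $data;
    }
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'onOpen', 'onClose');
xml_set_character_data_handler($parser, 'onText');
xml_parse($parser, $xml, true);

foreach ($texts as $tag => $text) {
    echo $tag . ': ' . $text . "\n";
}
```

Because this sketch keys the text by element name, a document with several toy entries would need a nested structure instead of a flat array.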

This PHP code tutorial was published on November 18, 2013.
