PHP htmlentities()

PHP provides many functions that convert an input string from one form to another. For example, urlencode() and urldecode() convert the special characters in a URL to and from the %(hex) format. Another function, nl2br(), which we saw while discussing PHP line breaks, inserts HTML line breaks before the actual line breaks in a string.
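As a quick illustration of the conversions mentioned above, here is a minimal sketch. The sample strings and variable names are made up for this example.

```php
<?php
// urlencode() percent-encodes characters that are not safe in a URL
// component; spaces become "+".
$query = urlencode("php & html");
echo $query, "\n"; // php+%26+html

// urldecode() reverses the conversion.
echo urldecode($query), "\n"; // php & html

// nl2br() inserts an HTML line break before each newline and keeps
// the newline itself.
$with_breaks = nl2br("first line\nsecond line");
echo $with_breaks, "\n";
```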

Similarly, the PHP function htmlentities() converts the special characters in an input string into HTML character entities. This conversion prevents special characters in the input from being interpreted as markup when the string is embedded in an HTML page.

For example, if user input containing a tag or a quote is printed as it is inside an HTML page, the browser parses it as markup, which can break the page or cut an attribute value short at the quote. In such situations htmlentities() is used to prevent the special characters in user input from being parsed. Note that it is meant for HTML output and is not a substitute for escaping data sent to a database query or a URL; queries should use prepared statements, and URLs should use urlencode().


HTML Character Entities

Before describing the PHP htmlentities() function in detail, let us look at HTML character entities. An entity starts with an ampersand (&), followed by either a name or a number that identifies the character, and ends with a semicolon (;). When the entity number is used, the entity starts with &, followed by # and the number.

A character that has an entity name can be written in the form &name;. For example, the character < can be written as &lt;. With the entity number, the same < character can be written as &#60;.
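The equivalence of the two forms can be checked with a short sketch that uses html_entity_decode(), the decoding function described later in this article. The variable names are chosen only for illustration.

```php
<?php
// The named and the numeric form decode to the same character.
$from_name   = html_entity_decode('&lt;');
$from_number = html_entity_decode('&#60;');

var_dump($from_name === '<');          // bool(true)
var_dump($from_name === $from_number); // bool(true)
```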

htmlentities() Syntax, Parameters and Flag Constants

Now, let us take a look at the basic syntax of the PHP htmlentities() function and some related PHP functions.

Syntax

This function accepts four arguments, of which only the first is required, as shown in the following syntax.

string htmlentities ( string $input_string [, int $flag [, string $character_encoding [, bool $double_encode ]]] )

The following list describes the arguments of htmlentities().

  • input_string – the string whose special characters are to be converted into HTML entities.
  • flag – a bitmask built from the available flag constants. The default value was ENT_COMPAT | ENT_HTML401 when this article was written; since PHP 8.1 it is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.
  • character_encoding – the character encoding of the input string, such as UTF-8 or ISO-8859-1, specified in the same way as for the PHP multibyte string function mb_substr().
  • double_encode – TRUE (the default) encodes everything, including HTML entities that already exist in the input string; FALSE leaves existing entities as they are.
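The effect of the double_encode argument is easiest to see on input that already contains an entity. The following is a minimal sketch; the sample string and variable names are assumptions made for this example.

```php
<?php
$input = 'Fish &amp; chips <b>';

// double_encode = true (the default): the "&" of the existing entity
// is encoded again.
$twice = htmlentities($input, ENT_QUOTES, 'UTF-8', true);
echo $twice, "\n"; // Fish &amp;amp; chips &lt;b&gt;

// double_encode = false: the existing &amp; is left alone, while the
// tag is still encoded.
$once = htmlentities($input, ENT_QUOTES, 'UTF-8', false);
echo $once, "\n"; // Fish &amp; chips &lt;b&gt;
```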

Flag Constants

The PHP constants that can be combined with the | operator to form the flag parameter of htmlentities() are listed below.

  • ENT_COMPAT – converts the double quotes that occur in the input string into the corresponding HTML entity, but leaves single quotes as they are. Together with ENT_HTML401, this constant formed the default value of the flag parameter before PHP 8.1.
  • ENT_QUOTES – unlike ENT_COMPAT, it converts both the single and the double quotes of an input string.
  • ENT_NOQUOTES – converts neither single nor double quotes.
  • ENT_IGNORE – as its name suggests, it silently discards invalid byte sequences in the input string instead of returning an empty string.
  • ENT_SUBSTITUTE – replaces each invalid byte sequence with the Unicode replacement character (U+FFFD) instead of returning an empty string.
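The flags above can be compared side by side. This is a minimal sketch with made-up sample strings; the flags are passed explicitly so that the result does not depend on the default of the PHP version in use, and the "\u{FFFD}" escape in the comments assumes PHP 7 or later.

```php
<?php
$input = "\"double\" and 'single'";

// Quote handling.
$compat   = htmlentities($input, ENT_COMPAT, 'UTF-8');
$quotes   = htmlentities($input, ENT_QUOTES, 'UTF-8');
$noquotes = htmlentities($input, ENT_NOQUOTES, 'UTF-8');

echo $compat, "\n";   // &quot;double&quot; and 'single'
echo $quotes, "\n";   // &quot;double&quot; and &#039;single&#039;
echo $noquotes, "\n"; // "double" and 'single'

// Invalid input: "\xFF" is never valid in UTF-8.
$broken = "abc\xFFdef";

$strict      = htmlentities($broken, ENT_COMPAT, 'UTF-8');                  // empty string
$ignored     = htmlentities($broken, ENT_COMPAT | ENT_IGNORE, 'UTF-8');     // "abcdef"
$substituted = htmlentities($broken, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8'); // "abc\u{FFFD}def"

var_dump($strict === '');
var_dump($ignored);
var_dump(bin2hex($substituted));
```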

Apart from the above flag constants, there are further constants that select the document type, that is, the type and version of the markup language used for the conversion. These are,

  • ENT_HTML401
  • ENT_XML1
  • ENT_XHTML
  • ENT_HTML5
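One visible difference between these document types is the entity used for a single quote. The sketch below assumes ENT_QUOTES is set, since single quotes are not converted otherwise; the variable names are chosen for illustration.

```php
<?php
// HTML 4.01 has no named entity for the single quote, so a numeric
// one is used.
$html401 = htmlentities("'", ENT_QUOTES | ENT_HTML401, 'UTF-8');

// HTML5 (like XML and XHTML) defines &apos;.
$html5 = htmlentities("'", ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $html401, "\n"; // &#039;
echo $html5, "\n";   // &apos;
```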

Example: Converting Special Character into HTML entities using PHP

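<?php
// Input containing single quotes and <i> tags.
$input_string = "PHP 'character string conversion' functions <i>htmlentities()</i>";
// Convert the special characters into HTML entities using the default flags.
$output = htmlentities($input_string);
echo "<b>Original Character String</b><br/>";
echo $input_string . "<br/><br/>"; // the browser parses <i></i> here
echo "<b>After Conversion</b><br/>";
echo $output; // the tags are displayed literally
?>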

In the above program, the input string includes single quotes and the less-than and greater-than symbols, which the browser can parse. So, if we print the input to the browser before passing it to htmlentities(), the browser parses the HTML tags <i></i> and displays the string htmlentities() in italics.

In contrast, if we look at the result after passing the input to htmlentities(), we can see the <i></i> tags in the browser display. The single quotes, however, are kept as they are, since the flag defaults to ENT_COMPAT, which does not convert single quotes into HTML character entities. (Since PHP 8.1 the default flag includes ENT_QUOTES, so the single quotes are converted to &#039; as well.)

Related Functions for PHP htmlentities()

The encoded string can be converted back to its original form with the PHP function html_entity_decode(). So, when we pass the value of the PHP variable $output from the above program to html_entity_decode(), we get the original form of the input string. For example, we can add the following lines to check the output returned by html_entity_decode():

...
$decoded_output = html_entity_decode($output);
echo "<b>After Decode html entity</b><br/>";
echo $decoded_output;
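The round trip can also be verified without a browser. The following self-contained sketch reuses the input string from the earlier example and passes ENT_QUOTES to both functions explicitly, an assumption made here so that the single quotes are handled the same way on every PHP version.

```php
<?php
$input_string = "PHP 'character string conversion' functions <i>htmlentities()</i>";

// Encode, then decode with the same flags and encoding.
$encoded = htmlentities($input_string, ENT_QUOTES, 'UTF-8');
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";
var_dump($decoded === $input_string); // bool(true)
```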

Like htmlentities(), PHP provides another function named htmlspecialchars(), which serves the same purpose of converting special characters into HTML entities.

The difference is that htmlspecialchars() converts only a limited set of special characters, that is, less than (<), greater than (>), single quotes ('), double quotes (") and ampersand (&), into their corresponding HTML entities, with the quotes subject to the flag parameter, whereas htmlentities() converts every character that has a named HTML entity.

For example, let us compare these two functions with the following PHP program, which converts an input string that includes the copyright symbol (©).

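<?php
$input_string = "HTML © symbol";
echo "htmlspecialchars() returns<br/><br/>";
// htmlspecialchars() has no conversion for the copyright symbol.
echo htmlspecialchars($input_string) . "<br/><br/>";
echo "htmlentities() returns<br/><br/>";
// htmlentities() converts the copyright symbol to its named entity.
echo htmlentities($input_string);
?>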

When we execute the above program and look at the source view of the browser output, we can see that htmlentities() converts the © symbol to &copy;, whereas htmlspecialchars() leaves the character unchanged as ©.
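The same difference shows up with other non-ASCII characters. The sketch below is a command-line variant with a made-up sample string; the "\u{...}" escapes assume PHP 7 or later and are used so that the result does not depend on the encoding of the source file.

```php
<?php
// "café < 5€" written with Unicode escapes.
$input = "caf\u{00E9} < 5\u{20AC}";

// Only the "<" is converted; the accented letter and the euro sign
// pass through unchanged.
$special = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Every character with a named entity is converted.
$entities = htmlentities($input, ENT_QUOTES, 'UTF-8');

echo $special, "\n";
echo $entities, "\n"; // caf&eacute; &lt; 5&euro;
```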

This PHP code tutorial was published on July 5, 2013.
