PHP htmlentities()

by Vincy. Last modified on July 3rd, 2022.

PHP provides many functions that convert a given input string from one form to another. For example, urlencode() and urldecode() convert the special characters that occur in a URL to and from the %XX (hexadecimal) format.

Another function, nl2br(), which we have seen while discussing PHP line breaks, inserts an HTML line break (<br />) before each newline in a string.
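As a quick illustration of these two conversions, here is a minimal sketch. The sample strings and variable names are made up for this example.

```php
<?php
// urlencode() percent-encodes reserved characters; a space becomes "+".
$encoded = urlencode("name=John Doe&lang=php");
// urldecode() reverses the conversion.
$decoded = urldecode($encoded);
// nl2br() inserts <br /> before the newline and keeps the newline itself.
$html = nl2br("first line\nsecond line");

echo $encoded . "\n"; // name%3DJohn+Doe%26lang%3Dphp
echo $decoded . "\n"; // name=John Doe&lang=php
echo $html . "\n";
?>
```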

Similarly, the PHP function htmlentities() converts the special characters that occur in an input string into HTML character entities. This conversion prevents the browser from interpreting those characters as markup when the string is printed in an HTML page, so user input cannot break the page structure or inject tags and scripts.

For example, if an input string containing a single quote is embedded in an HTML attribute that is delimited by single quotes, the quote ends the attribute early, and the rest of the value is truncated or parsed as markup.

In such a situation, htmlentities() is used to prevent the browser from parsing the special characters that occur in user input. Note that entity encoding is meant for HTML output; values that go into a database query should be passed through prepared statements instead.
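The following sketch shows one way to apply this to a value that is printed inside an HTML attribute. The variable names, the sample value and the form field are hypothetical. The ENT_QUOTES flag is passed explicitly so that both quote styles are converted on every PHP version.

```php
<?php
// Hypothetical user-supplied value containing a quote and a tag.
$user_input = "O'Reilly <b>books</b>";

// ENT_QUOTES converts single and double quotes, so the value cannot
// terminate a single-quoted or a double-quoted attribute.
$safe = htmlentities($user_input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

echo "<input type='text' name='publisher' value='" . $safe . "'>";
?>
```

Here $safe holds O&#039;Reilly &lt;b&gt;books&lt;/b&gt;, which the browser shows as the original text without parsing it.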

HTML Character Entities

Before starting with the detailed description of the PHP htmlentities() function, let us look at HTML character entities. An entity starts with an ampersand (&), followed by either a name or a number, and ends with a semicolon (;).

When an entity number is used, the entity starts with &, followed by # and the number.

Every character can be written with its number, and many characters also have a unique name. We can specify an HTML entity by its name in the form &name;. For example, the character < can be specified as &lt;.

On the other hand, with the entity number, the same < character can be specified as &#60;.
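To confirm that both forms refer to the same character, a small check like the following can be used. It relies on html_entity_decode(), which is described later in this article; the variable names are only for illustration.

```php
<?php
// The named form and the numeric form decode to the same character.
$from_name   = html_entity_decode('&lt;');
$from_number = html_entity_decode('&#60;');

var_dump($from_name === '<');          // bool(true)
var_dump($from_name === $from_number); // bool(true)
?>
```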

htmlentities() Syntax, Parameters and Flag Constants

Now, let us have a glance at the basic syntax of the PHP htmlentities() function, its parameters and its flag constants.

Syntax

This function accepts four arguments, of which only the first is required, as shown in the following syntax.

string htmlentities ( string $input_string, int $flag, string $character_encoding, bool $double_encode )

The following list describes the arguments of htmlentities().

  • input_string – The input string whose special characters are going to be converted to HTML entities.
  • flag – A bitmask of the flag constants listed below. The default value is ENT_COMPAT | ENT_HTML401 before PHP 8.1, and ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401 from PHP 8.1 onwards.
  • character_encoding – The character encoding of the input string, for example UTF-8 or ISO-8859-1, as we have specified with the PHP multi-byte string function mb_substr(). If it is omitted, current PHP versions use the default_charset setting, which is UTF-8 unless it has been changed.
  • double_encode – When TRUE, everything is converted, including HTML entities that already exist in the input string, so &amp; becomes &amp;amp;. When FALSE, existing entities are left as they are. The default is TRUE.
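The effect of the double_encode argument can be seen with a short sketch like the one below. The sample string and the variable names are assumptions made for this example.

```php
<?php
// The input already contains one entity (&amp;) and one raw ampersand.
$input_string = "Tom &amp; Jerry & friends";
$flags = ENT_QUOTES | ENT_HTML401;

// double_encode = true (the default): the existing &amp; is encoded again.
$twice = htmlentities($input_string, $flags, 'UTF-8', true);
// double_encode = false: the existing entity is left alone.
$once = htmlentities($input_string, $flags, 'UTF-8', false);

echo $twice . "\n"; // Tom &amp;amp; Jerry &amp; friends
echo $once . "\n";  // Tom &amp; Jerry &amp; friends
?>
```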

Flag Constants

The PHP constants that can be combined as the value of the flag parameter of htmlentities() are listed below.

  • ENT_COMPAT – Converts the double quotes that occur in the input string into the corresponding HTML entity, but leaves single quotes as they are. Before PHP 8.1, this constant combined with ENT_HTML401 is the default value of the flag parameter.
  • ENT_QUOTES – Unlike ENT_COMPAT, it converts both the single and the double quotes of an input string.
  • ENT_NOQUOTES – Converts neither single nor double quotes.
  • ENT_IGNORE – As per its name, it silently drops the invalid byte sequences of the input string instead of returning an empty string.
  • ENT_SUBSTITUTE – Replaces each invalid byte sequence with the Unicode replacement character (U+FFFD) instead of returning an empty string.
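The sketch below compares these flags on the same input. The sample strings and variable names are made up for this example, and "\xC3" is used as a deliberately incomplete UTF-8 sequence.

```php
<?php
$input_string = "She said \"it's fine\"";

$compat   = htmlentities($input_string, ENT_COMPAT | ENT_HTML401, 'UTF-8');
$quotes   = htmlentities($input_string, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$noquotes = htmlentities($input_string, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');

echo $compat . "\n";   // She said &quot;it's fine&quot;
echo $quotes . "\n";   // She said &quot;it&#039;s fine&quot;
echo $noquotes . "\n"; // She said "it's fine"

// A lone "\xC3" byte is an incomplete UTF-8 sequence.
$broken = "caf\xC3";

// Without ENT_IGNORE or ENT_SUBSTITUTE, invalid input gives an empty string.
$strict = htmlentities($broken, ENT_COMPAT | ENT_HTML401, 'UTF-8');
// ENT_IGNORE drops the invalid byte.
$ignored = htmlentities($broken, ENT_COMPAT | ENT_HTML401 | ENT_IGNORE, 'UTF-8');
// ENT_SUBSTITUTE puts U+FFFD in place of the invalid byte.
$substituted = htmlentities($broken, ENT_COMPAT | ENT_HTML401 | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === "");
var_dump($ignored === "caf");
var_dump($substituted === "caf\u{FFFD}");
?>
```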

Apart from the above list of flag constants, the remaining constants select the markup language and version whose entity rules are applied. These are,

  • ENT_HTML401
  • ENT_XML1
  • ENT_XHTML
  • ENT_HTML5
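One visible difference between these document types is the entity used for a single quote. The following sketch, with a made-up sample string and variable names, compares three of them with ENT_QUOTES set.

```php
<?php
$input_string = "it's";

// HTML 4.01 has no named entity for the single quote, so a number is used.
$html401 = htmlentities($input_string, ENT_QUOTES | ENT_HTML401, 'UTF-8');
// XML 1 and HTML 5 both define the named entity &apos;.
$xml1  = htmlentities($input_string, ENT_QUOTES | ENT_XML1, 'UTF-8');
$html5 = htmlentities($input_string, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $html401 . "\n"; // it&#039;s
echo $xml1 . "\n";    // it&apos;s
echo $html5 . "\n";   // it&apos;s
?>
```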

Example: Converting Special Character into HTML entities using PHP

<?php
$input_string = "PHP 'character string conversion' functions <i>htmlentities()</i>";
$output = htmlentities($input_string);
echo "<b>Original Character String</b><br/>";
echo $input_string . "<br/><br/>";
echo "<b>After Conversion</b><br/>";
echo $output;
?>

In the above program, the input string includes single quotes and the less than (<) and greater than (>) symbols, which the browser treats as markup delimiters.

So, if we print the input to the browser before passing it to htmlentities(), the browser parses the <i></i> tags and displays the string htmlentities() in an italic font.

After passing the input through htmlentities(), the <i></i> tags appear as literal text in the browser display. The single quotes look unchanged in the browser in either case. In the page source, they are kept as they are before PHP 8.1, where the flag defaults to ENT_COMPAT, and are converted to &#039; from PHP 8.1 onwards, where the default includes ENT_QUOTES.

Related Functions for PHP htmlentities()

The encoded string can be converted back to its original form by using the PHP function html_entity_decode(). So, when we pass the variable $output from the above program to html_entity_decode(), we get the original form of the input string.

For example, we can add the following lines to the above program to check the output returned by html_entity_decode().

<?php
$decoded_output = html_entity_decode($output);
echo "<b>After Decode html entity</b><br/>";
echo $decoded_output;
?>
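For a self-contained check of the round trip, a sketch like the following can be run on its own. It passes the same flags to both functions, which is an assumption of this example, so that the single quotes are encoded and decoded consistently on every PHP version.

```php
<?php
$input_string = "PHP 'character string conversion' functions <i>htmlentities()</i>";
$flags = ENT_QUOTES | ENT_HTML401;

// Encode, then decode with the same flags and encoding.
$encoded = htmlentities($input_string, $flags, 'UTF-8');
$decoded = html_entity_decode($encoded, $flags, 'UTF-8');

var_dump($decoded === $input_string); // bool(true)
?>
```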

Like htmlentities(), PHP provides another function named htmlspecialchars(), which is also used to convert special characters into HTML entities.

The difference between the two is that htmlspecialchars() converts only a limited set of special characters, that is, less than (<), greater than (>), ampersand (&), double quotes (") and, depending on the flag, single quotes ('), whereas htmlentities() converts every character that has a named HTML entity.

For example, let us compare these two functions with the following PHP program, which converts an input string that includes the copyright symbol (©).

<?php
$input_string = "HTML © symbol";
echo "htmlspecialchars() returns<br/><br/>";
// htmlspecialchars() has no rule for ©, so it is printed unchanged.
echo htmlspecialchars($input_string) . "<br/><br/>";
echo "htmlentities() returns<br/><br/>";
// htmlentities() replaces © with its named entity.
echo htmlentities($input_string);
?>

While executing the above program, we can see in the source view of the browser output that htmlentities() converts the © symbol to &copy;, whereas htmlspecialchars() leaves this character unchanged as ©.
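The two character sets can also be compared directly with PHP's get_html_translation_table() function. This is a sketch with explicitly chosen flags and illustrative variable names. "\u{00A9}" is the © symbol, written as an escape so the result does not depend on the encoding of the source file.

```php
<?php
$flags = ENT_QUOTES | ENT_HTML401;

// Characters handled by htmlspecialchars() and by htmlentities().
$special  = get_html_translation_table(HTML_SPECIALCHARS, $flags, 'UTF-8');
$entities = get_html_translation_table(HTML_ENTITIES, $flags, 'UTF-8');

print_r($special); // five entries: " & ' < >
echo count($entities) . " characters are covered by htmlentities()\n";

var_dump(isset($special["\u{00A9}"])); // bool(false)
echo $entities["\u{00A9}"] . "\n";     // &copy;
?>
```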
