Useful PHP String Functions

PHP has a rather large number of string functions. (They are pretty boring on their own; the fun comes when you’re already deep in a coding trance.) On my local machine alone, the O’Reilly PHP Pocket Reference bundled with Adobe Dreamweaver CS5 lists around 80 string functions. That is already too many to memorize. I probably won’t use all of them in a PSD to WordPress conversion project, but it is nice to know what tools I have at my disposal.

I’ve covered a small fraction of PHP’s commonly used string manipulation functions in this article. Regular expression functions and alternatives to the functions shown here are discussed in another article. I tried to keep my examples as accurate as possible by testing them as I wrote. The code here was tested on a 64-bit Windows 7 machine running XAMPP 1.7.7 with PHP 5.3.8.

Determining String Length

Suppose I want to determine the length of a word. Function strlen() does the job: it takes the input string and returns its length as an integer. Its prototype is int strlen(string $string). Note that the length is counted in bytes, so a multibyte character counts more than once. The mnemonic for remembering string functions is pretty simple: “str” at the beginning of a function name means “string”.

<?php
     $string = "Supercalifragilisticexpialidocious";
     echo strlen($string); // prints 34
?>
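
Because strlen() counts bytes, the result can differ from the number of visible characters. The sketch below assumes a UTF-8 encoded string (written with explicit byte escapes so the file encoding does not matter) and uses mb_strlen() only if the mbstring extension happens to be loaded.

```php
<?php
# "é" encoded as UTF-8 occupies two bytes (0xC3 0xA9).
$word = "caf\xC3\xA9";

# strlen() counts bytes, so this prints 5.
$bytes = strlen($word);
echo $bytes, "\n";

# mb_strlen() counts characters, but only exists when mbstring is loaded.
if (function_exists('mb_strlen')) {
    echo mb_strlen($word, 'UTF-8'), "\n"; # prints 4
}
```

For plain ASCII input such as the example above, the two counts are the same.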

Comparing Two Strings for Equality

PHP offers two functions for determining whether two given strings are equal: strcmp() and strcasecmp(). Two related functions, strspn() and strcspn(), measure similarity rather than equality and are covered at the end of this section. These comparisons are binary-safe, meaning that the inputs are treated as a stream of raw data without any format. (These functions are thus able to work on all 256 possible values of any 8-bit character.) The only difference between strcmp() and strcasecmp() is that the first is case sensitive and the latter is not. Both functions accept two string values as arguments, and both return an integer. Depending on the PHP version, the nonzero results may be exactly -1 and +1 or some other negative or positive number, so test only the sign. Function strcmp() returns one of three kinds of values based on the comparison outcome:

  • a negative integer if $str1 is less than $str2
  • 0 if $str1 is equal to $str2
  • a positive integer if $str1 is greater than $str2

<?php
     $string1 = "Doms";
     $string2 = "doms";

     #Prints "Strings do not match!"
     if (strcmp($string1, $string2) == 0) {
          echo "Strings match! <br/>";
     } else {
          echo "Strings do not match!<br/>";
     }
?>

Function strcasecmp() follows the same convention, except that it ignores case. It returns 0 if $str1 is equal to $str2, a negative value if $str1 is less than $str2, and a positive value otherwise. The exact nonzero value is calculated from the string arguments given, so only its sign is meaningful. For example, setting $str1 = "ASDF" and $str2 = "1234" yields a positive value, because letters sort after digits.

<?php
     $string1 = "Doms";
     $string2 = "doms";

     #Prints "Strings match!"
     if (strcasecmp($string1, $string2) == 0) {
          echo "Strings match! <br/>";
     } else {
          echo "Strings do not match!";
     }
?>
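
Since only the sign of the result matters, these functions work well as sort callbacks. The following is a minimal sketch with a made-up list of names; it assumes nothing beyond the built-in usort(), strcmp(), and strcasecmp().

```php
<?php
# Hypothetical list of names in mixed case.
$names = array("doms", "Zed", "alice", "Bob");

# usort() only looks at whether the callback result is negative, zero, or positive.
usort($names, 'strcasecmp');
echo implode(", ", $names), "\n"; # prints "alice, Bob, doms, Zed"

# Compare the result against zero rather than against -1 or 1.
$result = strcmp("Doms", "doms");
if ($result < 0) {
    echo '"Doms" sorts before "doms"', "\n";
}
```

Uppercase letters have lower byte values than lowercase ones, which is why the case-sensitive strcmp() places "Doms" before "doms".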

Two other functions, strspn() and strcspn(), are used to compare strings for similarity rather than equality. Function strspn() returns the length of the initial segment of a string that consists entirely of characters found in a second string (the mask), while function strcspn() returns the length of the initial segment that contains none of the characters in the mask.
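
A short sketch of both functions follows. The input strings and the variable names are made up for illustration.

```php
<?php
# Length of the leading run made up only of digits.
$leading_digits = strspn("42 apples", "0123456789");
echo $leading_digits, "\n"; # prints 2

# Length of the leading run that contains no space.
$first_word_length = strcspn("hello world", " ");
echo $first_word_length, "\n"; # prints 5

# A string consists only of digits when that run covers the whole string.
$input = "2024";
$digits_only = ($input !== "" && strspn($input, "0123456789") === strlen($input));
var_dump($digits_only); # bool(true)
```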

Manipulating String Case

There are four functions that PHP offers to manipulate string case: strtolower(), strtoupper(), ucfirst(), and ucwords(). The “uc” in ucfirst() and ucwords() means “uppercase”. Function strtolower() simply converts all characters of a string to lowercase, whereas strtoupper() converts all to uppercase. Their prototypes are string strtolower(string $str) and string strtoupper(string $str).

<?php

# Define string input.
$sentence = "We rock at PSD to HTML, PSD to WP, and HTML to WP conversion!";

# Transform string to lowercase.
# Prints "we rock at psd to html,
# psd to wp, and html to wp conversion!"
echo strtolower($sentence);

#Transform string to uppercase
# Prints "WE ROCK AT PSD TO HTML,
# PSD TO WP, AND HTML TO WP CONVERSION!"
echo strtoupper($sentence);
?>

Function ucfirst() simply converts the first letter of a string to uppercase, while ucwords() converts the first letter of each word in a string to uppercase. Their prototypes are string ucfirst(string $str) and string ucwords(string $str).

<?php

# Define string input.
$sentence = "the html guys are great programmers.";

# Transform first letter to uppercase
# Prints "The html guys are great programmers."
echo ucfirst($sentence);

# Transform first letter of each word to uppercase
# Prints "The Html Guys Are Great Programmers."
echo ucwords($sentence);
?>
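
Both ucfirst() and ucwords() only raise the first letter and leave the remaining letters alone, so messy input is usually lowercased first. A minimal sketch, with a made-up input string:

```php
<?php
# Hypothetical input with inconsistent capitalization.
$raw = "pSD to wORDPRESS conversion";

# ucwords() alone leaves the stray capitals in place.
$naive = ucwords($raw);
echo $naive, "\n"; # prints "PSD To WORDPRESS Conversion"

# Lowercasing first gives a clean title-style result.
$title = ucwords(strtolower($raw));
echo $title, "\n"; # prints "Psd To Wordpress Conversion"
```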

Converting Strings to and from HTML

Often we’ll encounter the problem of converting a string to HTML, or vice versa. Six functions come in handy for this task: nl2br(), htmlentities(), htmlspecialchars(), get_html_translation_table(), strtr(), and strip_tags().

Function nl2br() inserts an HTML line break (i.e., “<br />”) before each newline character (i.e., “\n”) in a string. Its prototype is string nl2br(string $string [, bool $is_xhtml = true]). The optional parameter $is_xhtml specifies whether the break tag is XHTML compatible (“<br />”) or not (“<br>”).

<?php

# Define string input.
$sentence = "I am a sentence.\nI am another sentence.";

# Prints the two sentences on separate lines in the HTML source code;
# a browser still renders them on one line.
echo $sentence;

# Prints "I am a sentence.<br />", a newline, then "I am another sentence."
echo nl2br($sentence, true);

# Prints "I am a sentence.<br>", a newline, then "I am another sentence."
echo nl2br($sentence, false);
?>

Function htmlentities() is used to convert characters (e.g., &) into their HTML entity equivalents (e.g., &amp;). Its prototype is string htmlentities(string $string [, int $quote_style = ENT_COMPAT [, string $charset [, bool $double_encode = true]]]).

<?php
#Prints "Joseph &amp; Dominique are developers." in the HTML source code.
echo htmlentities("Joseph & Dominique are developers.");
?>

Note that the example above relies on the function’s default character set. In PHP 5.3 that default is “ISO-8859-1”, while PHP 5.4 and later default to “UTF-8”. If your strings are UTF-8, it is safer to pass the charset explicitly, as shown below.

<?php
#Prints "Joseph &amp; Dominique are developers." in the HTML source code.
echo htmlentities("Joseph & Dominique are developers.",ENT_COMPAT,"UTF-8");
?>

In the above example, “UTF-8” is given as the value for the optional $charset parameter. The example also passes ENT_COMPAT as the $quote_style parameter, which most commonly takes one of three values:

ENT_COMPAT: Encodes double quotes and ignores single quotes.
ENT_NOQUOTES: Ignores both double and single quotes.
ENT_QUOTES: Encodes both double and single quotes.
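
The three quote styles can be compared side by side. This is a small sketch with a made-up input string that contains both kinds of quotes; the variable names are illustrative only.

```php
<?php
# Hypothetical input containing double quotes, single quotes, and an ampersand.
$text = "\"Doms\" & 'Joseph'";

# ENT_COMPAT: double quotes are encoded, single quotes are left alone.
$compat = htmlentities($text, ENT_COMPAT, "UTF-8");
echo $compat, "\n"; # &quot;Doms&quot; &amp; 'Joseph'

# ENT_NOQUOTES: neither kind of quote is encoded.
$noquotes = htmlentities($text, ENT_NOQUOTES, "UTF-8");
echo $noquotes, "\n"; # "Doms" &amp; 'Joseph'

# ENT_QUOTES: both kinds of quotes are encoded.
$quotes = htmlentities($text, ENT_QUOTES, "UTF-8");
echo $quotes, "\n"; # &quot;Doms&quot; &amp; &#039;Joseph&#039;
```

ENT_QUOTES is the usual choice when the result goes inside an HTML attribute, because the attribute may be delimited by either kind of quote.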

If you look at the function prototype again, you’ll notice that there is a fourth parameter, $double_encode, which accepts a Boolean argument. This optional parameter specifies whether HTML entities already present in the string should be encoded again or left as is. For example, if $double_encode is set to true, an input of “&amp;” will be output as “&amp;amp;”; otherwise it will be output as “&amp;”. The default value is true.

<?php
#Prints "Joseph &amp;amp; Dominique are developers." in the HTML source code,
#because $double_encode defaults to true.
echo htmlentities("Joseph &amp; Dominique are developers.",ENT_COMPAT,"UTF-8");

#Prints "Joseph &amp;amp; Dominique are developers." in the HTML source code.
echo htmlentities("Joseph &amp; Dominique are developers.",ENT_COMPAT,"UTF-8",true);

#Prints "Joseph &amp; Dominique are developers." in the HTML source code.
echo htmlentities("Joseph &amp; Dominique are developers.",ENT_COMPAT,"UTF-8",false);
?>

Function htmlspecialchars() plays a similar role to htmlentities(), except that it converts only the characters that have special meaning to HTML as a markup language. These characters are the less than symbol (<), the greater than symbol (>), the ampersand (&), double quotes ("), and single quotes ('), the last of which is converted only when ENT_QUOTES is set. Its prototype is string htmlspecialchars(string $string [, int $quote_style = ENT_COMPAT [, string $charset [, bool $double_encode = true]]]), which has exactly the same parameters as that of htmlentities().

<?php

# Prints "&lt;p&gt;I am a paragraph in HTML.&lt;/p&gt;" in the HTML source code.
echo htmlspecialchars("<p>I am a paragraph in HTML.</p>", ENT_COMPAT, "UTF-8");

?>
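
A common use of htmlspecialchars() is to make user-supplied text safe to display, often together with nl2br() so that line breaks survive. The sketch below uses a made-up comment string; the order of the two calls is the point of the example.

```php
<?php
# Hypothetical user-submitted comment containing markup characters and a newline.
$comment = "I <3 PHP\nSee you & bye";

# Escape first, then add the break tags. Doing it the other way around
# would escape the <br /> tags that nl2br() inserted.
$safe = nl2br(htmlspecialchars($comment, ENT_QUOTES, "UTF-8"));

# Prints "I &lt;3 PHP<br />", a newline, then "See you &amp; bye".
echo $safe;
```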

Function get_html_translation_table() returns one of two translation tables as an array: HTML_SPECIALCHARS or HTML_ENTITIES. These tables are simple associative arrays that map characters to their HTML entity equivalents. Its prototype is array get_html_translation_table([int $table = HTML_SPECIALCHARS [, int $quote_style = ENT_COMPAT]]).

<?php

#With the ENT_COMPAT quote style, prints an array that maps & to &amp;, " to &quot;,
#< to &lt;, and > to &gt; (the key order may vary by PHP version).
print_r(get_html_translation_table());

?>

In the example above, the function get_html_translation_table() was called without any arguments, since all of the function’s parameters are optional. Alternatively, we can obtain the other translation table by setting the first optional parameter to HTML_ENTITIES, which returns a larger array.

<?php

#Prints a rather huge array of HTML character entities.
print_r(get_html_translation_table(HTML_ENTITIES));

?>

Function strtr() is used to translate characters or substrings in a string according to a user-defined mapping. The function is named to be easily memorized: “tr” in the function name means “translate”. Unlike the previously mentioned functions, this function has two prototypes:

string strtr(string $str, string $from, string $to)
string strtr(string $str, array $replace_pairs)

Following the first prototype of strtr(), note that there are three required arguments to be passed. The first, $str, is the input string that needs to be translated. The second, $from, is the set of characters that will be searched for in the input string. The third one, $to, is the set of characters that will serve as replacements for the characters found in $from.

<?php

#Prints "I ip i baad pit."
echo strtr("I am a good man.","goman","bapit");

?>

The function above replaces all instances of the letter “g” with “b”, “o” with “a”, “m” with “p”, and so on. In other words, it takes the nth character in $from, searches $str for that character, and if found replaces it with the nth character in $to. Each character of the input is translated only once, so a replacement is never translated again.
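
Two consequences of this character-by-character rule are sketched below with made-up inputs: characters can be swapped safely because each one is translated only once, and when $from and $to differ in length the extra characters in the longer one are ignored.

```php
<?php
# Swapping works because every input character is translated at most once.
$swapped = strtr("ab", "ab", "ba");
echo $swapped, "\n"; # prints "ba"

# $from has two characters but $to has one, so only "a" is translated.
$partial = strtr("abc", "ab", "x");
echo $partial, "\n"; # prints "xbc"
```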

Following the second prototype, we can simply define an associative array of “from-to” pairs that the function can use as a translation table.

<?php

#Prints "<strong>I am bold.</strong>"
$translation_table = array("<b>" => "<strong>", "</b>" => "</strong>");
echo strtr("<b>I am bold.</b>",$translation_table);

?>
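
Because get_html_translation_table() returns exactly this kind of “from-to” array, it can be passed straight to strtr(). The following is a sketch of that combination, not a replacement for htmlspecialchars(); the quote style is passed explicitly so the table contents do not depend on the PHP version’s default.

```php
<?php
# Table that maps the special characters to their entities.
$table = get_html_translation_table(HTML_SPECIALCHARS, ENT_COMPAT);

# Encode by translating characters to entities.
$encoded = strtr("a < b & c", $table);
echo $encoded, "\n"; # prints "a &lt; b &amp; c"

# Decode by flipping the table so that entities map back to characters.
$decoded = strtr($encoded, array_flip($table));
echo $decoded, "\n"; # prints "a < b & c"
```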

Function strip_tags() is used to remove markup and script tags from a string input. Essentially, it converts a string of HTML into plain text. Its prototype is string strip_tags(string $str [, string $allowable_tags]).

<?php

#Prints "I am a paragraph" in the HTML source code.
echo strip_tags("<p><em>I</em> am a paragraph</p>");

?>

Alternatively we can set the optional $allowable_tags to specify which tags should stay in the output.

<?php

#Prints "<em>I</em> am a paragraph" in the HTML source code.
echo strip_tags("<p><em>I</em> am a paragraph</p>","<em>");
?>
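
One detail worth keeping in mind is that strip_tags() removes the tags themselves, not the text between them. The sketch below uses a made-up HTML fragment to show several allowed tags at once and what happens to the content of a script element.

```php
<?php
# Hypothetical HTML fragment that includes a script element.
$html = '<p>Hello <strong>big</strong> <em>world</em><script>alert(1)</script></p>';

# All tags are removed, but the text inside the script element remains.
$plain = strip_tags($html);
echo $plain, "\n"; # prints "Hello big worldalert(1)"

# Several allowed tags are listed one after another in a single string.
$partial = strip_tags($html, "<em><strong>");
echo $partial, "\n"; # prints "Hello <strong>big</strong> <em>world</em>alert(1)"
```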

So there we have it. I’ve kept the examples as easy to follow as possible. If you are already familiar with the basics of strings, you may want to check out our article on regular expression functions and alternative functions.

 

 
