Manipulating String Values

The Word Search program featured at the beginning of this chapter uses arrays to do some of its magic, but arrays alone are not sufficient to handle the tasks it needs. The program also relies on a number of special string manipulation functions to work extensively with text values. PHP has a huge number of string functions that give you an incredible ability to fold, spindle, and mutilate string values.

Demonstrating String Manipulation with the Pig Latin Translator

As a context for describing string manipulation functions, consider the program featured in Figures 5.10 and 5.11. This program allows the user to enter a phrase into a text box and converts the phrase into a bogus form of Latin.

Figure 5.10: The pigify program lets the user type some text into a text area.
Figure 5.11: The program translates immortal prose into incredible silliness.
HINT

If you're not familiar with pig Latin, it's a silly kids' game. Essentially, you take the first letter of each word, move it to the end of the word, and add "ay." If the word begins with a vowel, simply end the word with "way."

The pigify program will use a number of string functions to manipulate the text.

<!doctype html public "-//W3C//DTD HTML 4.0 //EN">
<html>
<head>
       <title>Pig Latin Generator</title>
</head>
<body>
<h1>Pig Latin Generator</h1>
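<?php
//read the form field explicitly; modern PHP no longer creates $inputString automatically
$inputString = isset($_REQUEST["inputString"]) ? $_REQUEST["inputString"] : NULL;
if ($inputString == NULL){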
  print <<<HERE
  <form>
  <textarea name = "inputString"
            rows = 20
            cols = 40></textarea>
   <input type = "submit"
          value = "pigify">
   </form>

HERE;
} else {
  //there is a value, so we'll deal with it

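  //start with an empty result so the first concatenation has something to add to
  $newPhrase = "";

  //break phrase into array (split() requires PHP 5 or earlier; PHP 7+ needs explode())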
  $words = split(" ", $inputString);
  foreach ($words as $theWord){
    $theWord = rtrim($theWord);
    $firstLetter = substr($theWord, 0, 1);
    $restOfWord = substr($theWord, 1, strlen($theWord) - 1);
    //print "$firstLetter) $restOfWord <br> \n";
    if (strstr("aeiouAEIOU", $firstLetter)){
      //it's a vowel
      $newWord = $theWord . "way";
    } else {
      //it's a consonant
      $newWord = $restOfWord . $firstLetter . "ay";
    } // end if
    $newPhrase = $newPhrase . $newWord . " ";
  } // end foreach
  print $newPhrase;

} // end if

?>


</body>
</html>

Building the Form

This program uses a single PHP page both to create an input form and to respond to the input. It begins by checking whether the $inputString variable has a value. The variable will be empty the first time the user gets to the page. In this situation, the program builds the appropriate HTML form and awaits user input. After the user hits the Submit button, the program runs again, but this time there is a value in the $inputString variable. The rest of the program uses string manipulation functions to create a pig Latin version of the input string.

Using the Split Function to Break a String into an Array

One of the first tasks is to break the entire string that comes from the user into individual words. PHP provides a couple of interesting functions for this purpose. The split() function takes a string and breaks it into an array based on a delimiter. It takes two arguments: the first is the delimiter and the second is the string to break up. Note that split() treats its delimiter as a regular expression, was deprecated in PHP 5.3, and was removed in PHP 7.0; on newer versions, explode() accepts the same two arguments in the same order when the delimiter is a plain string. I want each word to be a different element in the array, so I use a single space (" ") as the delimiter. The following line takes the $inputString variable and breaks it into an array called $words. Each word will be a new element of the array.

$words = split(" ", $inputString);
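As a rough sketch of the same step on a current PHP installation, the example below uses explode() for a plain delimiter and preg_split() for input that may contain repeated or mixed white space. The sample phrases and the $messy and $cleanWords names are assumptions made up for illustration; they do not appear in the pigify listing.

```php
<?php
// Hypothetical sample input; any phrase works.
$inputString = "the quick brown fox";

// explode() takes the delimiter first and the string second.
$words = explode(" ", $inputString);

// preg_split() copes with runs of spaces, tabs, and newlines.
// PREG_SPLIT_NO_EMPTY drops the empty strings that repeated
// delimiters would otherwise leave in the array.
$messy = "the  quick\tbrown\nfox";
$cleanWords = preg_split('/\s+/', $messy, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
print_r($cleanWords);
```

Both calls produce the same four-element array here. With explode(), two spaces in a row would yield an empty element, which is one reason to consider preg_split() for text typed by a user.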

Once the $words array is constructed, I step through it with a foreach loop. Inside the loop, each word is stored temporarily in $theWord.

Trimming a String with rtrim()

Sometimes when you split a string into an array, the elements still carry stray white space. In the pig Latin game, the split() call removes the spaces between words, but a word can still end with a tab or with the line break the user typed in the text area, which can cause problems later. PHP provides a function called rtrim(), which removes spaces, tabs, newlines, and other white space from the end of a string. I used rtrim() to clean off any trailing white space left over from the split() operation and stored the result back in $theWord.

$theWord = rtrim($theWord);

TRICK

In addition to rtrim(), PHP has ltrim(), which trims excess white space from the beginning of a string, and trim(), which cleans up both ends of a string. Each of these functions also accepts an optional second argument that lists exactly which characters to remove.
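To make the difference between the three trimming functions concrete, here is a small sketch. The $padded string and the other variable names are invented for this example.

```php
<?php
// Two leading spaces, then a trailing space and newline.
$padded = "  hello \n";

$right = rtrim($padded);  // trailing white space removed: "  hello"
$left  = ltrim($padded);  // leading white space removed: "hello \n"
$both  = trim($padded);   // both ends cleaned: "hello"

// The optional second argument lists the characters to strip
// instead of the default white space characters.
$noPunct = rtrim("word!?.", "!?.");  // "word"

print "[$right] [$both] [$noPunct]\n";
```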

Finding a Substring with substr()

The behavior of the algorithm depends on the first character of each word. I'll also need to know all the rest of the word without the first character. The substr() function is useful for getting part of a string. It takes up to three parameters. The first is the string you want to get a piece from. The second is the position of the character you want to begin with (starting with zero as usual). The third, which is optional, is how many characters you want to extract; if you leave it out, substr() returns everything from the starting position to the end of the string.

I got the first letter of the word with this line:

$firstLetter = substr($theWord, 0, 1);

It gets one letter from $theWord starting at the beginning of the word (position 0). I then stored that value in the $firstLetter variable.

It's not much more complicated to get the rest of the word:

$restOfWord = substr($theWord, 1, strlen($theWord) - 1);

Once again, I need to extract values from $theWord. This time, I'll begin at character 1 (which humans would refer to as the second character). I don't know directly how many characters to get, but I can calculate it. I should grab one less character than the total number of characters in the word. The strlen() function is perfect for this operation, because it returns the number of characters in any string. I can calculate the number of letters I need with strlen($theWord) - 1. This new decapitated word is stored in the $restOfWord variable.
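The short sketch below walks through these substr() calls on a sample word. The word "latin" and the $sameRest and $lastLetter variables are assumptions added for illustration. It also shows that the length calculation can be skipped, because substr() reads to the end of the string when the third argument is omitted.

```php
<?php
$theWord = "latin";  // hypothetical sample word

// One character starting at position 0.
$firstLetter = substr($theWord, 0, 1);                  // "l"

// Everything from position 1 on, with an explicit length.
$restOfWord = substr($theWord, 1, strlen($theWord) - 1); // "atin"

// The same result without the length argument.
$sameRest = substr($theWord, 1);                        // "atin"

// A negative start counts back from the end of the string.
$lastLetter = substr($theWord, -1);                     // "n"

print "$firstLetter / $restOfWord / $lastLetter\n";
```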

Using strstr() to Search for One String Inside Another

The next task is to determine if the first character of the word is a vowel. There are a number of approaches to this problem, but perhaps the easiest is to use a searching function. I created a string with all the vowels ("aeiouAEIOU") and then I searched for the existence of the $firstLetter variable in the vowel string. The strstr() function is perfect for this task. It takes two parameters. The first parameter is the string you are searching in (given the adorable name "haystack" in the online documentation). The second parameter is the string you are looking for (called the "needle"). To search for the value of the $firstLetter variable in the string constant "aeiouAEIOU", I used the following line:

if (strstr("aeiouAEIOU", $firstLetter)){

The strstr() function returns the value FALSE if the needle was not found in the haystack. If the needle was found, it returns the part of the haystack that begins at the first occurrence of the needle (use strpos() if you need the position instead). In this case, all I'm really concerned about is whether $firstLetter is found in the list of vowels. If so, it's a vowel, which will change the way I modify the word.
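The following sketch shows what the two search functions return for the vowel string. The variable names other than the vowel string itself are made up for the example. It also illustrates why a strpos() result should be compared with !== false: a match at position 0 is a valid hit, but 0 counts as false in a plain if test.

```php
<?php
$vowels = "aeiouAEIOU";

// strstr() returns the haystack from the first match onward.
$found = strstr($vowels, "i");     // "iouAEIOU"

// No match gives FALSE.
$missing = strstr($vowels, "p");   // false

// strpos() returns the position instead; "a" sits at position 0.
$position = strpos($vowels, "a");  // 0

// Compare strictly so position 0 is not mistaken for "not found".
$isVowel = strpos($vowels, "a") !== false;  // true

var_dump($found, $missing, $position, $isVowel);
```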

Using the Concatenation Operator

Most of the time in PHP you can use string interpolation to combine string values. However, sometimes you still need to use a formal operation to combine strings. The process of combining two strings is called concatenation. (I love it when simple ideas have complicated names.) The concatenation operator in PHP is the period (.). In pig Latin, if a word begins with a vowel, it should simply end with the string "way." I used string concatenation to make this work.

$newWord = $theWord . "way";

When the word begins with a consonant, the formula for creating the new word is slightly more complicated, but is still performed with string concatenation.

$newWord = $restOfWord . $firstLetter . "ay";
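For comparison, here is a sketch of the consonant case written both ways. The sample values and the $byConcat and $byInterp names are assumptions for illustration. Notice that the interpolated version needs curly braces, because PHP would otherwise read "$firstLetteray" as a single (undefined) variable name.

```php
<?php
// Hypothetical pieces of the word "pig".
$restOfWord = "ig";
$firstLetter = "p";

// Concatenation with the period operator.
$byConcat = $restOfWord . $firstLetter . "ay";

// Interpolation; braces mark where each variable name ends.
$byInterp = "{$restOfWord}{$firstLetter}ay";

// The .= operator appends to an existing string in one step.
$newPhrase = "";
$newPhrase .= $byConcat . " ";

print "$byConcat $byInterp\n";
```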

TRICK

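Some benchmarks at the time of writing showed the concatenation method of building strings running faster than interpolation. The size of the gap depends on the PHP version and is usually too small to matter, so if speed is an issue, measure both forms in your own code before rewriting interpolated strings as concatenation.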

Finishing Up the Pig Latin Program

Once I created the new word, I added it and a trailing space to the $newPhrase variable. When the foreach loop has finished executing, $newPhrase will contain the pig Latin translation of the original phrase.
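Putting the pieces together, the translation loop could be packaged as a function along the following lines. This is only a sketch under a few assumptions: the pigify() function name is hypothetical, preg_split() and strpos() stand in for split() and strstr() so the code runs on current PHP versions, and the trailing space is trimmed from the result, which the original listing does not do.

```php
<?php
// Hypothetical helper, not part of the original listing.
function pigify($phrase) {
    $newPhrase = "";

    // Split on runs of white space and skip empty pieces.
    $words = preg_split('/\s+/', $phrase, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($words as $theWord) {
        $firstLetter = substr($theWord, 0, 1);
        $restOfWord = substr($theWord, 1);

        if (strpos("aeiouAEIOU", $firstLetter) !== false) {
            // Starts with a vowel: just add "way".
            $newWord = $theWord . "way";
        } else {
            // Starts with a consonant: move the first letter to the end.
            $newWord = $restOfWord . $firstLetter . "ay";
        }

        $newPhrase .= $newWord . " ";
    }

    // Drop the space added after the last word.
    return rtrim($newPhrase);
}

print pigify("pig latin is easy");  // prints "igpay atinlay isway easyway"
```

Keeping the logic in a function separates the translation from the form handling, which makes it easier to try out with different phrases.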

Translating Between Characters and ASCII Values

Although it isn't necessary in the pig Latin program, the word search program will require the ability to randomly generate a character. I'll do this by randomly generating an ASCII value (ASCII is the code used to store characters as binary numbers in the computer's memory) and translating that number to the appropriate character. The chr() function is useful in this situation, because it converts an ASCII value to the matching character. (Its counterpart, ord(), goes the other way and returns the ASCII value of a character.) The uppercase letters are represented in ASCII by numbers between 65 and 90. To get a random uppercase letter, I can use the following code:

$theNumber = rand(65, 90);
$theLetter = chr($theNumber);
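As a quick check of the two directions of the translation, the sketch below converts a random ASCII value to a character with chr() and back to a number with ord(). The $backAgain variable is an assumption added for the example, and rand() is PHP's built-in random integer function.

```php
<?php
// Pick a random ASCII value in the uppercase range (65 to 90 inclusive).
$theNumber = rand(65, 90);

// chr() turns an ASCII value into its character.
$theLetter = chr($theNumber);

// ord() turns a character back into its ASCII value.
$backAgain = ord($theLetter);

print "$theNumber -> $theLetter -> $backAgain\n";

// Fixed examples of both directions.
print chr(65) . " " . ord("Z") . "\n";  // prints "A 90"
```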