Previous: 11.2.2 String Manipulation: strlen() and strcpy()
Up: 11.2 Library String Functions
Next: 11.2.4 String Conversion Functions

11.2.3 String Operations: strcmp() and strcat()

In the last section we saw how a string can be copied and how to determine the length of a string. Two other common operations on strings are to compare them and to join strings, i.e. concatenate them.

Our next task is to read lines of text, until a blank line is entered, and examine each line to see if it is the same as a ``control string''. If a line equals the control string, the line is ignored; otherwise, it is appended to a buffer. When a blank line is encountered in the input, the buffer is printed. The control string is assumed to be entered as the first line. Here is the task:

JOIN: Read a first line as the control string. Read other lines until a blank line is entered, either adding each line to a buffer or discarding it. A line is discarded if it equals the control string. When a line is added to the buffer, separate it from the previous text by a space. Print the buffer at the end of input.

The algorithm will require several functions: one to compare strings, another to append (i.e. concatenate) one string to another. Here is the algorithm:

initialize the buffer to an empty string
read the first line into the control string

while not a blank line, read a line
    if the new line is not equal to the control line then
        if the buffer is not empty, append a space to the buffer
        append the new line to the buffer

print the buffer

The two new string operations we will need are provided by the standard library. We will use them to implement our algorithm. The first function compares two strings:

int strcmp(STRING s1, STRING s2);

The function, strcmp(), compares the strings, s1 and s2, and returns an integer indicating the result of the comparison. If the two strings are equal, it returns zero. If they are not equal, the result is determined by the first pair of characters that differ: the returned value is positive if s1 is lexicographically greater than s2, and negative if s1 is less than s2. Many implementations return the difference between those two characters, but the C standard guarantees only the sign of the result, so programs should test the value against zero rather than rely on its magnitude. Thus, the strcmp() function is the equivalent of a relational operator for strings.
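As a small illustration of testing the result against zero, the sketch below reduces the value returned by strcmp() to -1, 0, or 1. The helper name compare_sign() is hypothetical and is not part of the standard library.

```c
#include <string.h>

/* Hypothetical helper: reduce strcmp()'s result to -1, 0, or 1,
   since only the sign of the result is guaranteed. */
int compare_sign(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);

    return (r > 0) - (r < 0);
}
```

With this helper, compare_sign("apple", "apple") is 0, compare_sign("apple", "banana") is -1, and compare_sign("pear", "peach") is 1.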

The second function we need is to join two strings. Again, the standard library provides a function:

STRING strcat(STRING s1, STRING s2);

which concatenates (i.e. joins) the two strings by appending a copy of s2 to the end of s1. The result is stored in s1, so s1 must refer to an array with enough room for the combined string and its terminator. It returns s1, i.e. the pointer to the combined string. This is the equivalent of the addition operator for strings. The prototypes for these and other standard library string functions are in a header file, string.h.
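A minimal sketch of strcat() in use follows. The helper join_with_space() is a hypothetical name chosen for illustration; it assumes the caller supplies a destination array of at least strlen(first) + strlen(second) + 2 characters.

```c
#include <string.h>

/* Hypothetical helper: build "first second" in dest.
   Assumes dest has room for both strings, one space, and the terminator. */
char *join_with_space(char *dest, const char *first, const char *second)
{
    strcpy(dest, first);     /* dest now holds a copy of first */
    strcat(dest, " ");       /* append the separator */
    strcat(dest, second);    /* append second after the space */
    return dest;
}
```

For example, with char buf[32], the call join_with_space(buf, "hello", "world") leaves the string hello world in buf and returns buf.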

We can now use these functions to implement our program as shown in Figure 11.4.
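As a rough sketch of such a program (not the listing from Figure 11.4), the code below follows the JOIN algorithm given earlier. The names read_line(), add_line(), join_lines(), and the size SIZE are assumptions made for illustration. The sketch also uses fgets() in place of gets(), because gets() cannot limit the number of characters it stores and was removed from the C standard in C11.

```c
#include <stdio.h>
#include <string.h>

#define SIZE 100   /* assumed maximum line length, including terminator */

/* Read one line into s and strip the trailing newline.
   Returns s, or NULL at end of input. */
static char *read_line(FILE *in, char *s, int size)
{
    if (fgets(s, size, in) == NULL)
        return NULL;
    s[strcspn(s, "\n")] = '\0';
    return s;
}

/* Append line to text unless it equals control or would not fit in
   cap characters. Returns 1 if the line was appended, 0 otherwise. */
int add_line(char *text, size_t cap, const char *control, const char *line)
{
    if (strcmp(line, control) == 0)
        return 0;                         /* equals control: discard */
    if (strlen(text) + strlen(line) + 2 > cap)
        return 0;                         /* space + line + '\0' would overflow */
    if (*text != '\0')
        strcat(text, " ");                /* separate from previous text */
    strcat(text, line);
    return 1;
}

/* JOIN: the first line read from in is the control string; the lines
   that follow, up to a blank line, are accumulated in text. */
void join_lines(FILE *in, char *text, size_t cap)
{
    char control[SIZE], s[SIZE];

    text[0] = '\0';                       /* buffer starts empty */
    if (read_line(in, control, SIZE) == NULL)
        return;
    while (read_line(in, s, SIZE) != NULL && *s != '\0')
        add_line(text, cap, control, s);
}
```

A main() function could declare an array such as char text[1000], call join_lines(stdin, text, sizeof text), and then print text with printf().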

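We first read a string into the variable, control, and initialize the buffer, text, to an empty string. The while loop then reads strings until a blank line is entered. Since the expression gets(s) reads a line of text and returns the destination pointer, s, the expression *gets(s) is the first character of the string read into s. The expression is True if any non-empty string is entered. It is False when the first character of s is the null character, '\0', which occurs when an empty line (just a RETURN) is entered. Note that gets() returns a null pointer at end of file, so this loop relies on the input ending with a blank line. Also, gets() cannot limit how many characters it stores and was removed from the C standard in C11; fgets() is the usual replacement.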

For each string read into s, we compare it with control. If they are not equal, we concatenate text and s. A space is concatenated to text if it is not empty, so that the concatenated strings are separated by a space. We have included a debug statement to print the accumulated buffer and its length. When the input terminates, the accumulated string, text, is printed. Here is a sample session:

Observe that string comparisons are case sensitive, e.g. hello is not the same as Hello, so the first Hello in the input is discarded, while the second, hello, is not.

The function, strcmp(), can be used when we wish to search for a particular string or when we wish to order strings in lexicographic or dictionary order. Unfortunately, upper case and lower case values of a letter are not equal as shown above; therefore, we must change all strings to the same case (e.g. by using tolower()) for a case independent comparison.
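One way to sketch a case independent comparison is shown below. Rather than converting whole strings first, this version folds each character with tolower() as it compares, which leaves the original strings unchanged. The name compare_ignore_case() is hypothetical; the casts to unsigned char are needed because tolower() expects a value in that range.

```c
#include <ctype.h>

/* Hypothetical helper: compare s and t, ignoring letter case.
   Returns zero if they match, otherwise the difference between the
   lower case forms of the first pair of characters that differ. */
int compare_ignore_case(const char *s, const char *t)
{
    while (*s != '\0' &&
           tolower((unsigned char)*s) == tolower((unsigned char)*t)) {
        s++;
        t++;
    }
    return tolower((unsigned char)*s) - tolower((unsigned char)*t);
}
```

With this helper, compare_ignore_case("Hello", "hello") returns zero, whereas strcmp("Hello", "hello") does not.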

To understand how these library functions work, let us write our own versions of functions strcmp() and strcat(), beginning with our_strcmp(). First, let us look in a little more detail at ``what'' strcmp() does. Given two strings, the comparison proceeds character by character until two unequal characters are encountered, or both the strings are exhausted. When two unequal characters are encountered, their difference is returned. If no unequal characters have been encountered when both strings have reached their terminating null character ('\0'), the two strings are identical, and zero is returned. Here are some examples of results using strcmp(string1, string2):

We can model our algorithm on this behavior of strcmp(). We traverse both strings until we arrive at a terminating null character in either one. During traversal, we examine the corresponding characters in the strings to see if they are unequal. If so, we terminate the traversal loop. Otherwise, we continue the process. When the loop is terminated, we return the difference between the characters where we left off in the two strings.

Figure 11.5 shows the code implementing this algorithm. The while loop traverses strings s and t, terminating when s points to a null character. Within the loop, the corresponding characters of the two strings are compared. If unequal characters are encountered, the loop is terminated, and the difference between the characters is returned. If the loop terminates because *s is zero, then no unequal characters have been encountered so far, but the string t may or may not be exhausted. In either case, *s - *t, i.e. 0 - *t, is returned. In particular, if t also points to a null character (the string t is also exhausted), then the two strings are equal and zero is returned. Otherwise, the difference between the first unequal characters is returned. Note that we do not need to test for the end of the string t in the while condition. If t terminates before s, then the null character at the end of string t will not compare equal to *s, and the loop will terminate anyway.
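A minimal sketch consistent with this description is given below; it is not the listing from Figure 11.5. The parameter type const char * is an assumption made here in place of the STRING type used in the prototypes above.

```c
/* Sketch of our_strcmp(): returns zero if s and t are equal,
   otherwise the difference between the first unequal characters. */
int our_strcmp(const char *s, const char *t)
{
    while (*s) {            /* stop when s reaches its terminator */
        if (*s != *t)       /* unequal characters end the traversal */
            break;
        s++;
        t++;
    }
    return *s - *t;         /* zero only if both strings ended together */
}
```

For example, our_strcmp("abc", "abd") returns 'c' - 'd', which is -1, and our_strcmp("abc", "ab") returns 'c' - '\0', a positive value.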

To write our_strcat(), we must append the second string to the end of the first string; so we must traverse the first string until we find its terminating null character ('\0'). We can then copy the second string into the first, starting at this point, using strcpy(). The function returns the pointer to the destination string, i.e. the beginning of the first string.

Since the function must return a pointer to the original string, s, we save the original pointer in a variable, p. We then increment s until it points to the terminating null character. We then copy t into s starting at that position using strcpy(), which overwrites the old terminator, and return the saved pointer, p. This function performs the same task as strcat().
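The steps just described can be sketched as follows. The parameter types char * and const char * are assumptions made here in place of the STRING type, and, as with strcat(), the caller is assumed to provide a destination array large enough for the combined string.

```c
#include <string.h>

/* Sketch of our_strcat(): append t to the end of s and return s. */
char *our_strcat(char *s, const char *t)
{
    char *p = s;        /* save the start of the destination */

    while (*s)          /* advance s to its terminating null character */
        s++;
    strcpy(s, t);       /* copy t there, including its terminator */
    return p;
}
```

For example, if char buf[16] holds the string abc, then our_strcat(buf, "def") leaves abcdef in buf and returns buf.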




tep@wiliki.eng.hawaii.edu
Sat Sep 3 07:04:57 HST 1994