Our next task is to store and print non-numeric text data, i.e. a sequence of characters called a string. A string is a list (or sequence) of characters stored contiguously, with a marker to indicate the end of the string. Let us consider the task:
STRING0: Read and store a string of characters and print it out.
Since the characters of a string are stored contiguously, we can easily implement a string by using an array of characters if we keep track of the number of elements stored in the array. However, common operations on strings include breaking them up into parts (called substrings), joining them together to create new strings, replacing parts of them with other strings, etc. These operations change the length of a string, so there must be some way of detecting the size of the valid string currently stored in an array of characters.
In C, a string of characters is stored in successive elements of a character array and terminated by the NULL character. For example, the string "Hello" is stored in a character array, msg[], as follows:
char msg[SIZE];
msg[0] = 'H'; msg[1] = 'e'; msg[2] = 'l'; msg[3] = 'l'; msg[4] = 'o'; msg[5] = '\0';
The NULL character is written using the escape sequence '\0'. String constants written in double quotes, such as "Hello", are automatically terminated by NULL by the compiler.
Given this implementation of strings in C, the algorithm to implement our task is now easily written. We will assume that a string input is a sequence of characters terminated by a newline character. (The newline character is not part of the string). Here is the algorithm:
initialize index to zero
while not a newline character
     read and store a character in the array at the next index
     increment the index value
terminate the string of characters in the array with a NULL char.
initialize index to zero
traverse the array until a NULL character is reached
     print the array character at index
     increment the index value
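The program implementation has three parts: a loop that reads and stores characters until a newline is reached, a statement that terminates the string with a NULL character, and a loop that prints the string.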
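The algorithm above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the original listing: the function names read_string and print_string, the FILE * parameters, and the value 100 for SIZE are introduced here for illustration. Reading with getc(in) behaves like getchar() when in is stdin. The sketch also stops at end of input and never stores more than size - 1 characters, two safeguards the algorithm itself does not mention.

```c
#include <stdio.h>

#define SIZE 100   /* assumed capacity for the caller's array; an illustrative value */

/* Hypothetical helper: reads characters from `in` up to a newline (or end of
   input), stores them in msg[], and terminates the string with '\0'.
   Characters that do not fit in `size - 1` elements are read but discarded.
   Returns the number of characters stored. Assumes size >= 1. */
int read_string(FILE *in, char msg[], int size)
{
    int ch;        /* int rather than char so that EOF can be detected */
    int i = 0;     /* initialize index to zero */

    while ((ch = getc(in)) != '\n' && ch != EOF) {
        if (i < size - 1)          /* leave room for the terminator */
            msg[i++] = (char)ch;   /* store, then increment the index */
    }
    msg[i] = '\0';                 /* terminate the string */
    return i;
}

/* Hypothetical helper: prints msg[] one character at a time until the
   terminating '\0' is reached, then ends the output with a newline. */
void print_string(FILE *out, const char msg[])
{
    int i = 0;     /* initialize index to zero */

    while (msg[i] != '\0') {
        putc(msg[i], out);
        i++;
    }
    putc('\n', out);
}
```

A caller might declare char msg[SIZE]; and then call read_string(stdin, msg, SIZE); followed by print_string(stdout, msg); to echo one line of input.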
The second while loop in the program traverses the string and prints each character until a NULL character is reached. Note that we do not need to keep a count of the number of characters stored in the array, since the first NULL character encountered indicates the end of the string. When the first NULL is reached, we terminate the string output with a newline.
The assignment expression in the program:
msg[i] = '\0';
can also be written as:
msg[i] = NULL;
or:
msg[i] = 0;
In the first case, the character whose ASCII value is 0 is assigned to msg[i], whereas in the other cases a zero value is assigned to msg[i]. When NULL is defined as 0, the three assignment expressions have the same effect. The first makes it clear that a null character is assigned to msg[i], and the second uses a symbolic constant. Note, however, that in standard C NULL is a null pointer constant, and many compilers define it as ((void *)0); with such a definition the second form draws a compiler diagnostic, so '\0' is the portable choice.
To accommodate the terminating NULL character, the size of an array that houses a string must be at least one greater than the expected maximum size of the string. Since different strings may be stored in an array at different times, the first NULL character in the array delimits the valid string. The importance of the NULL character to signal the end of a valid string is obvious. If there were no NULL character inserted after the valid string, the loop traversal would continue to print values interpreted as characters, possibly beyond the array boundary, until it fortuitously found a NULL (0) character.
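As a small illustration of how the first terminator delimits the valid string when an array is reused, consider the sketch below. The helper names store_string and valid_length are hypothetical, introduced only for this example, and store_string assumes the destination array is large enough for the source string plus its terminator.

```c
#include <stddef.h>

/* Hypothetical helper: copies src, including its terminating '\0', into dst.
   Assumes dst has room for every character of src plus the terminator. */
void store_string(char dst[], const char src[])
{
    size_t i = 0;

    while ((dst[i] = src[i]) != '\0')
        i++;
}

/* Hypothetical helper: counts the characters before the first '\0'.
   Anything stored after that terminator is not part of the valid string. */
size_t valid_length(const char msg[])
{
    size_t n = 0;

    while (msg[n] != '\0')
        n++;
    return n;
}
```

If "Hello, world" is stored in a 16-element array and "Hi" is stored in the same array afterwards, valid_length reports 2: the terminator at index 2 ends the valid string even though index 3 onwards still holds characters left over from the longer string. The size rule can be seen in a declaration such as char word[] = "Hello"; where the compiler allocates six elements, five for the letters and one for the terminator.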
The second while loop may also be written:
while (msg[i] != NULL)
     putchar(msg[i++]);
and the while condition further simplified as:
while (msg[i])
     putchar(msg[i++]);
If msg[i] is any character with a non-zero ASCII value, the while expression evaluates to True. If msg[i] is the NULL character, its value is zero and thus False. The last form of the while condition is the more common usage. While we have used the increment operator in the putchar() argument, it may also be used separately for clarity:
while (msg[i]) {
     putchar(msg[i]);
     i++;
}
It is possible for a string to be empty; that is, a string may have no characters in it. An empty string is a character array with the NULL character in the zeroth index position, msg[0].
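A minimal sketch of testing for and creating an empty string follows. The names is_empty and make_empty are hypothetical helpers introduced for illustration, not part of any library.

```c
/* Hypothetical helper: returns 1 if msg[] holds the empty string, that is,
   if the terminating '\0' sits at index 0; returns 0 otherwise. */
int is_empty(const char msg[])
{
    return msg[0] == '\0';
}

/* Hypothetical helper: makes msg[] the empty string by placing the
   terminator at index 0. Assumes msg has at least one element.
   Later elements are left unchanged but are no longer part of the string. */
void make_empty(char msg[])
{
    msg[0] = '\0';
}
```

With an empty string, a printing loop of the form while (msg[i]) never executes its body, so nothing is printed.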