3.2. Library string Type
The string type supports variable-length character strings. The library takes care of managing the memory associated with storing the characters and provides various useful operations. The library string type is intended to be efficient enough for general use.
As with any library type, programs that use strings must first include the associated header. Our programs will be shorter if we also provide an appropriate using declaration:
#include <string>
using std::string;
3.2.1. Defining and Initializing strings
The string library provides several constructors (Section 2.3.3, p. 49). A constructor is a special member function that defines how objects of that type can be initialized. Table 3.1 on the facing page lists the most commonly used string constructors. The default constructor (Section 2.3.4, p. 52) is used "by default" when no initializer is specified.
3.2.2. Reading and Writing strings
As we saw in Chapter 1, we use the iostream library to read and write values of built-in types such as int, double, and so on. Similarly, we can use the iostream and string libraries to read and write strings using the standard input and output operators:
// Note: #include and using declarations must be added to compile this code
int main()
{
    string s;          // empty string
    cin >> s;          // read whitespace-separated string into s
    cout << s << endl; // write s to the output
    return 0;
}
This program begins by defining a string named s. The next line,
cin >> s; // read whitespace-separated string into s
reads the standard input, storing what is read into s. The string input operator reads and discards any leading whitespace (spaces, newlines, tabs), and then reads characters until the next whitespace character is encountered.
So, if the input to this program is "   Hello World!   " (note the leading and trailing spaces), then the output will be "Hello" with no extra spaces.
The input and output operations behave similarly to the operators on the built-in types. In particular, the operators return their left-hand operand as their result. Thus, we can chain together multiple reads or writes:
string s1, s2;
cin >> s1 >> s2;          // read first input into s1, second into s2
cout << s1 << s2 << endl; // write both strings
If we give this version of the program the same input as in the previous paragraph, our output would be
HelloWorld!
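The chained read can be tried without typing at a terminal. The sketch below is only an illustration: it assumes a std::istringstream as a stand-in for cin, and read_two_words is a hypothetical helper name, not part of the library.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: reads two whitespace-separated words from `text`
// and returns them joined, mirroring
//     cin >> s1 >> s2;  cout << s1 << s2;
std::string read_two_words(const std::string& text)
{
    std::istringstream in(text); // assumption: stands in for cin
    std::string s1, s2;
    in >> s1 >> s2;              // each >> skips leading whitespace,
                                 // then stops at the next whitespace
    return s1 + s2;
}
```

Calling read_two_words("   Hello World!   ") returns HelloWorld!, with neither the surrounding spaces nor the space between the two words.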
The programs presented from this point on will assume that the needed #include and using declarations have been made.
Reading an Unknown Number of strings
Like the input operators that read built-in types, the string input operator returns the stream from which it read. Therefore, we can use a string input operation as a condition, just as we did when reading ints in the program on page 18. The following program reads a set of strings from the standard input and writes what it has read, one string per line, to the standard output:
int main()
{
    string word;
    // read until end-of-file, writing each word to a new line
    while (cin >> word)
        cout << word << endl;
    return 0;
}
In this case, we read into a string using the input operator. That operator returns the istream from which it read, and the while condition tests the stream after the read completes. If the stream is valid (it hasn't hit end-of-file or encountered an invalid input), then the body of the while is executed and the value we read is printed to the standard output. Once we hit end-of-file, we fall out of the while.
Using getline to Read an Entire Line
There is an additional useful string IO operation: getline. This is a function that takes both an input stream and a string. The getline function reads the next line of input from the stream and stores what it read, not including the newline, in its string argument. Unlike the input operator, getline does not ignore leading newlines. Whenever getline encounters a newline, even if it is the first character in the input, it stops reading the input and returns. The effect of encountering a newline as the first character in the input is that the string argument is set to the empty string.
The getline function returns its istream argument so that, like the input operator, it can be used as a condition. For example, we could rewrite the previous program that wrote one word per line to write a line at a time instead:
int main()
{
    string line;
    // read a line at a time until end-of-file
    while (getline(cin, line))
        cout << line << endl;
    return 0;
}
Because line does not contain a newline, we must write our own if we want the strings written one to a line. As usual, we use endl to write a newline and flush the output buffer.
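To see how the two loops differ, it can help to run both over the same input. The following is a minimal sketch under stated assumptions: the helpers read_words and read_lines are hypothetical names, and they take any istream so that a std::istringstream can stand in for cin.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects words the way `while (cin >> word)` does.
std::vector<std::string> read_words(std::istream& in)
{
    std::vector<std::string> words;
    std::string word;
    while (in >> word)            // skips all whitespace, including newlines
        words.push_back(word);
    return words;
}

// Hypothetical helper: collects lines the way `while (getline(cin, line))` does.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) // stops at each newline; newline is discarded
        lines.push_back(line);
    return lines;
}
```

Given the text "\n  two words\n\nlast", read_words produces three entries (two, words, last), while read_lines produces four: an empty string for the leading newline, the line "  two words" with its leading spaces kept, another empty string, and last.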
3.2.3. Operations on strings
Table 3.2 on the next page lists the most commonly used string operations.
The string size and empty Operations
The length of a string is the number of characters in the string. It is returned by the size operation:
int main()
{
    string st("The expense of spirit\n");
    cout << "The size of " << st << "is " << st.size()
         << " characters, including the newline" << endl;
    return 0;
}
If we compile and execute this program it yields
The size of The expense of spirit
is 22 characters, including the newline
Often it is useful to know whether a string is empty. One way we could do so would be to compare size with 0:
if (st.size() == 0)
// ok: empty
In this case, we don't really need to know how many characters are in the string; we are only interested in whether the size is zero. We can more directly answer this question by using the empty member:
if (st.empty())
// ok: empty
The empty function returns the bool (Section 2.1, p. 34) value true if the string contains no characters; otherwise, it returns false.
string::size_type
It might be logical to expect that size returns an int, or, thinking back to the note on page 38, an unsigned. Instead, the size operation returns a value of type string::size_type. This type requires a bit of explanation.
The string class, like many other library types, defines several companion types. These companion types make it possible to use the library types in a machine-independent manner. The type size_type is one of these companion types. It is defined as a synonym for an unsigned type (typically unsigned int or unsigned long) that is guaranteed to be big enough to hold the size of any string. To use the size_type defined by string, we use the scope operator to say that the name size_type is defined in the string class.
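As a small illustration of holding a size in the companion type rather than in an int, consider the sketch below. The helper names longer_length and has_no_characters are hypothetical and exist only for this example.

```cpp
#include <string>

// Hypothetical helper: returns the length of the longer of two strings.
// Both sizes, and the result, are held in string::size_type, not int.
std::string::size_type longer_length(const std::string& a, const std::string& b)
{
    std::string::size_type len_a = a.size();
    std::string::size_type len_b = b.size();
    return len_a < len_b ? len_b : len_a;
}

// Hypothetical helper: asks the emptiness question directly,
// instead of comparing size() with 0.
bool has_no_characters(const std::string& s)
{
    return s.empty();
}
```

For the string "The expense of spirit\n" and an empty string, longer_length returns 22, and has_no_characters is true only for the empty one.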
Although we don't know the precise type of string::size_type, we do know that it is an unsigned type (Section 2.1.1, p. 34). We also know that for a given type, the unsigned version can hold a positive value twice as large as the corresponding signed type can hold. This fact implies that the largest string could be twice as large as the size an int can hold.
Another problem with using an int is that on some machines the size of an int is too small to hold the size of even plausibly large strings. For example, if a machine has 16-bit ints, then the largest string an int could represent would have 32,767 characters. A string that held the contents of a file could easily exceed this size. The safest way to hold the size of a string is to use the type the library defines for this purpose, which is string::size_type.
The string Relational Operators
The string class defines several operators that compare two string values. Each of these operators works by comparing the characters from each string.
The equality operator compares two strings, returning true if they are equal. Two strings are equal if they are the same length and contain the same characters. The library also defines != to test whether two strings are unequal. The relational operators <, <=, >, >= test whether one string is less than, less than or equal, greater than, or greater than or equal to another:
string big = "big", small = "small";
string s1 = big;    // s1 is a copy of big
if (big == small)   // false
    // ...
if (big <= s1)      // true, they're equal, so big is less than or equal to s1
    // ...
The relational operators compare strings using the same strategy as in a (case-sensitive) dictionary. If two strings have different lengths and every character in the shorter string is equal to the corresponding character of the longer string, then the shorter string is less than the longer one. Otherwise, the result is determined by the first position at which the characters in the two strings differ.
As an example, given the strings
string substr = "Hello";
string phrase = "Hello World";
string slang = "Hiya";
then substr is less than phrase, and slang is greater than either substr or phrase.
Assignment for strings
In general the library types strive to make it as easy to use a library type as it is to use a built-in type. To this end, most of the library types support assignment. In the case of strings, we can assign one string object to another:
// st1 is an empty string, st2 is a copy of the literal
string st1, st2 = "The expense of spirit";
st1 = st2; // replace st1 by a copy of st2
After the assignment, st1 contains a copy of the characters in st2.
Most string library implementations go to some trouble to provide efficient implementations of operations such as assignment, but it is worth noting that conceptually, assignment requires a fair bit of work. It must delete the storage containing the characters associated with st1, allocate the storage needed to contain a copy of the characters associated with st2, and then copy those characters from st2 into this new storage.
Adding Two strings
Addition on strings is defined as concatenation. That is, it is possible to concatenate two or more strings through the use of either the plus operator (+) or the compound assignment operator (+=) (Section 1.4.1, p. 13). Given the two strings
string s1("hello, ");
string s2("world\n");
we can concatenate the two strings to create a third string as follows:
string s3 = s1 + s2; // s3 is hello, world\n
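The relational and addition operators described so far can be checked with a short sketch. The helpers smaller_of and join_line are hypothetical names introduced only for illustration.

```cpp
#include <string>

// Hypothetical helper: returns whichever string comes first under the
// dictionary-style ordering used by the relational operators.
std::string smaller_of(const std::string& a, const std::string& b)
{
    return (b < a) ? b : a;
}

// Hypothetical helper: concatenates a and b with +, then appends a
// newline to the result with +=. Neither argument is modified.
std::string join_line(const std::string& a, const std::string& b)
{
    std::string result = a + b; // a new string holding a's characters, then b's
    result += "\n";             // same effect as result = result + "\n"
    return result;
}
```

With the strings from the earlier example, smaller_of("Hello", "Hello World") returns Hello because it is a prefix of the longer string, and smaller_of("Hiya", "Hello World") returns Hello World because 'e' precedes 'i' at the first position where the two differ. join_line("hello, ", "world") returns the same characters as s3 above.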
If we wanted to append s2 to s1 directly, then we would use +=:
s1 += s2; // equivalent to s1 = s1 + s2
Adding Character String Literals and strings
The strings s1 and s2 included punctuation directly. We could achieve the same result by mixing string objects and string literals as follows:
string s1("hello");
string s2("world");
string s3 = s1 + ", " + s2 + "\n";
When mixing strings and string literals, at least one operand to each + operator must be of string type:
string s1 = "hello";             // no punctuation
string s2 = "world";
string s3 = s1 + ", ";           // ok: adding a string and a literal
string s4 = "hello" + ", ";      // error: no string operand
string s5 = s1 + ", " + "world"; // ok: each + has string operand
string s6 = "hello" + ", " + s2; // error: can't add string literals
The initializations of s3 and s4 involve only a single operation. In these cases, it is easy to determine that the initialization of s3 is legal: we initialize s3 by adding a string and a string literal. The initialization of s4 attempts to add two string literals and is illegal.
The initialization of s5 may appear surprising, but it works in much the same way as when we chain together input or output expressions (Section 1.2, p. 5). In this case, the string library defines addition to return a string. Thus, when we initialize s5, the subexpression s1 + ", " returns a string, which can be concatenated with the literal "world". It is as if we had written
string tmp = s1 + ", "; // ok: + has a string operand
s5 = tmp + "world";     // ok: + has a string operand
On the other hand, the initialization of s6 is illegal. Looking at each subexpression in turn, we see that the first subexpression adds two string literals. There is no way to do so, and so the statement is in error.
Fetching a Character from a string
The string type uses the subscript ([ ]) operator to access the individual characters in the string.
The subscript operator takes a size_type value that denotes the character position we wish to fetch. The value in the subscript is often called "the subscript" or "an index."
Valid indices run from 0 through the size of the string minus one; it is an error to use an index outside this range. We could use the subscript operator to print each character in a string on a separate line:
string str("some string");
for (string::size_type ix = 0; ix != str.size(); ++ix)
    cout << str[ix] << endl;
On each trip through the loop we fetch the next character from str, printing it followed by a newline.
Subscripting Yields an Lvalue
Recall that a variable is an lvalue (Section 2.3.1, p. 45), and that the left-hand side of an assignment must be an lvalue. Like a variable, the value returned by the subscript operator is an lvalue. Hence, a subscript can be used on either side of an assignment. The following loop sets each character in str to an asterisk:
for (string::size_type ix = 0; ix != str.size(); ++ix)
    str[ix] = '*';
Computing Subscript Values
Any expression that results in an integral value can be used as the index to the subscript operator. For example, assuming someval and someotherval are integral objects, we could write
str[someotherval * someval] = someval;
Although any integral type can be used as an index, the actual type of the index is string::size_type, which is an unsigned type.
When we subscript a string, we are responsible for ensuring that the index is "in range." By in range, we mean that the index is a number that, when assigned to a size_type, is a value in the range from 0 through the size of the string minus one. By using a string::size_type or another unsigned type as the index, we ensure that the subscript cannot be less than zero. As long as our index is an unsigned type, we need only check that it is less than the size of the string.
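One way to carry out the range check just described is sketched below. The helper names mask_at and is_valid_index are hypothetical. The second helper relies on the string member at, which performs the same access as the subscript operator but throws std::out_of_range for an invalid index.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: overwrites the character at position ix with '*',
// but only when ix is in range. Returns true if a character was replaced.
bool mask_at(std::string& str, std::string::size_type ix)
{
    // ix is unsigned, so it cannot be negative; only the upper bound
    // needs to be tested before subscripting.
    if (ix < str.size()) {
        str[ix] = '*';
        return true;
    }
    return false;
}

// Hypothetical helper: reports whether ix is a valid index by letting
// the checked accessor at() do the test.
bool is_valid_index(const std::string& str, std::string::size_type ix)
{
    try {
        str.at(ix);  // throws std::out_of_range if ix >= str.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

Note that a negative int passed as ix is converted to a very large unsigned value, so the single comparison against size rejects it as well.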
3.2.4. Dealing with the Characters of a string
Often we want to process the individual characters of a string. For example, we might want to know if a particular character is a whitespace character or whether the character is alphabetic or numeric. Table 3.3 on the facing page lists the functions that can be used on the characters in a string (or on any other char value). These functions are defined in the cctype header.
These functions mostly test the given character and return an int, which acts as a truth value. Each function returns zero if the test fails; otherwise, it returns a (meaningless) nonzero value indicating that the character is of the requested kind.
For these functions, a printable character is a character with a visible representation; whitespace is one of space, tab, vertical tab, return, newline, and formfeed; and punctuation is a printable character that is not a digit, a letter, or a (printable) whitespace character such as space.
As an example, we could use these functions to print the number of punctuation characters in a given string:
string s("Hello World!!!");
string::size_type punct_cnt = 0;
// count number of punctuation characters in s
for (string::size_type index = 0; index != s.size(); ++index)
    if (ispunct(s[index]))
        ++punct_cnt;
cout << punct_cnt << " punctuation characters in " << s << endl;
The output of this program is
3 punctuation characters in Hello World!!!
Rather than returning a truth value, the tolower and toupper functions return a character: either the argument unchanged or the lower- or uppercase version of the character. We could use tolower to change s to lowercase as follows:
// convert s to lowercase
for (string::size_type index = 0; index != s.size(); ++index)
    s[index] = tolower(s[index]);
cout << s << endl;
which generates
hello world!!!
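The two loops above can be packaged as reusable functions. The sketch below uses hypothetical helper names, count_punct and to_lower_copy. It also adds a cast to unsigned char before each cctype call, a precaution not shown in the text: these functions take an int whose value must be representable as an unsigned char (or be EOF), and a plain char may be negative on some machines.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: counts the punctuation characters in s.
std::string::size_type count_punct(const std::string& s)
{
    std::string::size_type punct_cnt = 0;
    for (std::string::size_type index = 0; index != s.size(); ++index)
        if (std::ispunct(static_cast<unsigned char>(s[index])))
            ++punct_cnt;
    return punct_cnt;
}

// Hypothetical helper: returns a lowercase copy of s.
// The parameter is taken by value, so the caller's string is unchanged.
std::string to_lower_copy(std::string s)
{
    for (std::string::size_type index = 0; index != s.size(); ++index)
        s[index] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(s[index])));
    return s;
}
```

For the string "Hello World!!!", count_punct returns 3 and to_lower_copy returns hello world!!!, matching the output shown above.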