Slight Typos? Graphic Language and Other Curiosities
Difficulty: 5 / 10
Sometimes even small and hard-to-see typos can accidentally have a significant effect on code. To illustrate how hard typos can be to see, and how easy phantom typos are to see accidentally even when they're not there, consider these examples.
Problem
Guru Questions
Answer the following questions without using a compiler.
1. What is the output of the following program on a standards-conforming C++ compiler?
#include <iostream>
#include <iomanip>
int main()
{
    int x = 1;
    for( int i = 0; i < 100; ++i );
        // What will the next line do? Increment???????????/
        ++x;
    std::cout << x << std::endl;
}
2. How many distinct errors should be reported when compiling the following code on a conforming C++ compiler?
struct X {
    static bool f( int* p )
    {
        return p && 0[p] and not p[1:>>p[2];
    };
};
Solution
1. What is the output of the following program on a standards-conforming C++ compiler?
#include <iostream>
#include <iomanip>
int main()
{
    int x = 1;
    for( int i = 0; i < 100; ++i );
        // What will the next line do? Increment???????????/
        ++x;
    std::cout << x << std::endl;
}
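Assuming that there is no invisible whitespace at the end of the comment line, and that the compiler performs trigraph replacement (trigraphs were part of standard C++ through C++14 and were removed in C++17), the output is "1".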
There are two tricks here, one obvious and one less so.
First, consider the for loop line:
for( int i = 0; i < 100; ++i );
                              ^
There's a semicolon at the end, a "curiously recurring typo pattern" that (usually accidentally) makes the body of the for loop just the empty statement. Even though the following lines may be indented, and may even have braces around them, they are not part of the body of the for loop. This was a deliberate red herring -- in this case, because of the next point, it doesn't matter that the for loop never repeats any statements because there's no increment statement to be repeated at all (even though there appears to be one). This brings us to the second point:
Second, consider the comment line. Did you notice that it ends oddly, with a "/"?
// What will the next line do? Increment???????????/
                                                   ^
Nikolai Smirnov writes:
"Probably, what's happened in the program is obvious for you but I lost a couple of days debugging a big program where I made a similar error. I put a comment line ending with a lot of question marks accidentally releasing the 'Shift' key at the end. The result is unexpected trigraph sequence '??/' which was converted to '\' (phase 1) which was annihilated with the following '\n' (phase 2)." [1]
The "??/" sequence is converted to '\' which, at the end of a line, is a line-splicing directive (surprise!). In this case, it splices the following line "++x;" to the end of the comment line and thus makes the increment part of the comment. The increment is never executed.
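The splicing half of this effect can be reproduced without trigraphs by typing the backslash directly, which is what "??/" becomes after trigraph replacement. The following is a minimal sketch; the function names are made up for illustration, and it assumes that the backslash is the very last character on its line.

```cpp
// Hypothetical helper names, for illustration only.

// The comment below ends in a backslash, so the next physical line is
// spliced onto it and becomes part of the comment.
int spliced_increment()
{
    int x = 1;
    // What will the next line do? Increment \
    ++x;
    return x;   // the increment above was commented out
}

// The same function without the trailing backslash, for comparison.
int plain_increment()
{
    int x = 1;
    // What will the next line do? Increment
    ++x;
    return x;
}
```

Calling spliced_increment() yields 1 and plain_increment() yields 2. Some compilers warn about a comment that continues onto the next line, which is a useful diagnostic to leave enabled.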
Interestingly, if you look at the GNU g++ documentation for the -Wtrigraphs command-line switch, you will encounter the following statement:
"Warnings are not given for trigraphs within comments, as they do not affect the meaning of the program." [2]
That may be true most of the time, but here we have a case in point -- from real-world code, no less -- where this expectation does not hold.
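The red herring from the for loop line is also easy to check in isolation. Here is a small sketch, with illustrative function names that are not part of the original example, that counts how many times a braced block runs with and without the stray semicolon.

```cpp
// Hypothetical helper names, for illustration only.

// With the stray ';' the loop body is the empty statement, and the braced
// block that follows is an ordinary block that runs once after the loop.
int runs_with_stray_semicolon()
{
    int runs = 0;
    for( int i = 0; i < 100; ++i );
    {
        ++runs;
    }
    return runs;
}

// Without the ';' the braced block is the loop body.
int runs_without_stray_semicolon()
{
    int runs = 0;
    for( int i = 0; i < 100; ++i )
    {
        ++runs;
    }
    return runs;
}
```

The first function returns 1 and the second returns 100, even though the two differ by a single character.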
2. How many distinct errors should be reported when compiling the following code on a conforming C++ compiler?
struct X {
    static bool f( int* p )
    {
        return p && 0[p] and not p[1:>>p[2];
    };
};
The short answer is: Zero. This code is perfectly legal and standards-conforming (whether the author might have wanted it to be or not).
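Let's consider in turn each of the expressions that might be questionable, and see why they're really okay:
The expression "0[p]" is legal and means the same thing as "p[0]". In C and C++, a built-in subscript expression x[y], where one operand is a pointer and the other is an integer, means *(x+y), so "0[p]" is *(0+p) and "p[0]" is *(p+0), which come out to the same thing. For details, see clause 6.5.2.1 of the C99 standard [3].
The tokens "and" and "not" are valid alternative spellings of "&&" and "!", respectively.
The token ":>" is a digraph for "]". The text "p[1:>>p[2]" is therefore tokenized as "p[1] > p[2]", a comparison and not a shift.
The "extra" semicolon after the closing brace of the function body is allowed.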
Of course, it could well be that the colon ":" was a typo and the author really meant "p[1]>>p[2]", but even if it was a typo it's still (unfortunately, in that case) perfectly legal code.
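To make the reading of that return statement concrete, here is a sketch that places the original struct next to a conventionally spelled version. The name X_plain is invented for this comparison; the rewrite relies on "0[p]" meaning "p[0]", on "and" and "not" being alternative spellings of "&&" and "!", on ":>" being a digraph for "]", and on "!" binding more tightly than ">".

```cpp
// The original code from the question, unchanged.
struct X {
    static bool f( int* p )
    {
        return p && 0[p] and not p[1:>>p[2];
    };  // the extra ';' after the function body is allowed
};

// Hypothetical equivalent with conventional spellings and explicit grouping.
struct X_plain {
    static bool f( int* p )
    {
        return p && p[0] && ( ( !p[1] ) > p[2] );
    }
};
```

For an array holding 1, 5, -1 both versions return true, because "not p[1]" is evaluated first and false (0) is greater than -1. A reader who expected "not (p[1] > p[2])" would have predicted false.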
Acknowledgements
Thanks to Nikolai Smirnov for contributing part of the Example 1 code; I added the for loop line.
References
[1] N. Smirnov, private communication.
[2] A Google search for "trigraphs within comments" yields this and several other interesting and/or amusing hits.
[3] ISO/IEC 9899:1999 (E), International Standard, Programming Languages -- C.