GotW #57

Home Blog Talks Books & Articles Training & Consulting

On the
blog
RSS feed November 4: Other Concurrency Sessions at PDC
November 3
: PDC'09: Tutorial & Panel
October 26: Hoare on Testing
October 23
: Deprecating export Considered for ISO C++0x

This is the original GotW problem and solution substantially as posted to Usenet. See the book More Exceptional C++ (Addison-Wesley, 2002) for the most current solution to this GotW issue. The solutions in the book have been revised and expanded since their initial appearance in GotW. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard.

Recursive Declarations 
Difficulty: 6 / 10

Can you write a function that returns a pointer to itself? If so, why would you want to?

Problem

JG Question

1. What is a pointer to function? How can it be used?

Guru Questions

2. Assume that it is possible to write a function that can return a pointer to itself. Then that function could equally well return a pointer to any function with the same signature as itself. When might this be useful? Explain.

3. Is it possible to write a function f() that returns a pointer to itself? It should be usable in the following natural way:

    //  Example 3
    //
    //  FuncPtr is a typedef for a pointer to a
    //  function with the same signature as f()
    FuncPtr p = f();    // executes f()
    (*p)();             // executes f()

If it is possible, demonstrate how. If it is not possible, explain why.

Solution

Function Pointers: Recap

1. What is a pointer to function? How can it be used?

Just as a pointer to an object lets you dynamically point at an object of a given type, a pointer to a function lets you dynamically point at a function with a given signature. For example:

  //  Example 1
  //
  // Create a typedef called FPDoubleInt for a function
  // signature that takes a double and returns an int.
  //
  typedef int (*FPDoubleInt)( double );
  // Use it.
  //
  int f( double ) { /* ... */ }
  int g( double ) { /* ... */ }
  int h( double ) { /* ... */ }
  FPDoubleInt fp;
  fp = f;
  fp( 1.1 );     // calls f()
  fp = g;
  fp( 2.2 );     // calls g()
  fp = h;
  fp( 3.14 );    // calls h()

When applied well, pointers to functions can allow significant runtime flexibility. For example, the standard C function qsort() takes a pointer to a comparison function with a given signature, which allows calling code to extend qsort()'s behavior by providing a custom comparison function.

A Brief Look at State Machines

2. Assume that it is possible to write a function that can return a pointer to itself. Then that function could equally well return a pointer to any function with the same signature as itself. When might this be useful? Explain.

Many situations come to mind, but one common example is when implementing a state machine.

In brief, a state machine is composed of a set of possible states along with a set of legal transitions between those states. For example, a simple state machine might look something like this:

  // Figure: A small state machine
          "a"
  Start -------> S2
    |             |
    |             | "see"
    |   "be"      v
    `---------> Stop

Here, when we are in the Start state, if we receive the input "a" we transition to state S2, otherwise if we receive the input "be" we transition to state Stop; any other input is illegal in state Start. In state S2, if we receive the input "see" we transition to state Stop; any other input is illegal in state S2. For this state machine, there are only two valid input streams: "be" and "asee".

To implement a state machine, one method that is sometimes sufficient is to make each state a function. All of the state functions have the same signature and each one returns a pointer to the next function (state) to be called. For example, here is an oversimplified code fragment that illustrates the idea:

  //  Example 2
  //
  StatePtr Start( const string& input );
  StatePtr S2   ( const string& input );
  StatePtr Stop ( const string& input );
  StatePtr Error( const string& input ); // error state
  StatePtr Start( const string& input )
  {
    if( input == "a" )
    {
      return S2;
    }
    else if( input == "be" )
    {
      return Stop;
    }
    else
    {
      return Error;
    }
  }

See your favorite computer science textbook for more information about state machines and their uses. [A future GotW issue may address the question: "What is the best way to implement a state machine in C++?"]

Of course, the above code doesn't say what "StatePtr" is, and it's not necessarily obvious how to say what it should be. This leads us nicely into the final question:

How Can a Function Return a Pointer to Itself?

3. Is it possible to write a function f() that returns a pointer to itself? It should be usable in the following natural way:

    //  Example 3
    //
    //  FuncPtr is a typedef for a pointer to a
    //  function with the same signature as f()
    FuncPtr p = f();    // executes f()
    (*p)();             // executes f()

If it is possible, demonstrate how. If it is not possible, explain why.

Short answer: Yes, it's possible, but it's not obvious.

For example, one might first try something like this:

  //  Example 3(a): Naive attempt
  //                (doesn't work)
  //
  typedef FuncPtr (*FuncPtr)(); // error

Alas, this kind of recursive typedef isn't allowed. Some people, thinking that the type system is just getting in the way, try to do an end run around the type system "nuisance" by casting to and from void*:

  //  Example 3(b): A nonstandard and
  //                nonportable hack
  //
  typedef void* (*FuncPtr)();
  void* f() { return (void*)f; }  // cast to void*
  FuncPtr p = (FuncPtr)(f());     // cast from void*
  p();

This isn't a solution because it doesn't satisfy the criteria of the question. Worse, it is dangerous because it deliberately defeats the type system, and it is onerous because it forces all users of f() to write casts. Worst of all, Example 3(b) is not supported by the standard. Although a void* is guaranteed to be big enough to hold the value of any object pointer, it is not guaranteed to be suitable to hold a function pointer; for example, on some platforms a function pointer is larger than an object pointer. Even if code like Example 3(b) happens to work on the compiler you're using today, it's not portable and may not work on another compiler, or even on the next release of your current compiler.

Of course, one way around this is to cast to and from another function pointer type instead of a simple void*:

  //  Example 3(c): Standard and portable, but
  //                nonetheless still a hack
  //
  typedef void (*VoidFuncPtr)();
  typedef VoidFuncPtr (*FuncPtr)();
  VoidFuncPtr f() { return (VoidFuncPtr)f; }
                            // cast to VoidFuncPtr
  FuncPtr p = (FuncPtr)f(); // cast from VoidFuncPtr
  p();

This code is technically legal, but it has all but one of the major disadvantages of Example 3(b): It is a dangerous hack because it deliberately defeats the type system. It imposes an unacceptable burden on all users of f() by requiring them to write casts. And, of course, it's still not really a solution at all because doesn't meet the requirements of the problem.

Can we really do no better?

A Correct and Portable Way

Fortunately, we can indeed get exactly the intended effect required by Question #3 in a completely type-safe and portable way, without relying on nonstandard code or type-unsafe casting. The way to do it is to use a proxy class that takes, and has an implicit conversion to, the desired pointer type:

  //  Example 3(d): The correct solution
  //
  struct FuncPtr_;
  typedef FuncPtr_ (*FuncPtr)();
  struct FuncPtr_
  {
    FuncPtr_( FuncPtr pp ) : p( pp ) { }
    operator FuncPtr() { return p; }
    FuncPtr p;
  };

Now we can declare, define, and use f() naturally:

  FuncPtr_ f() { return f; } // natural return syntax
  int main()
  {
    FuncPtr p = f();  // natural usage syntax
    p();
  }

This solution has three main strengths:

1. It solves the problem as required. Better still, it's type-safe and portable.

2. Its machinery is transparent: You get natural syntax for the caller/user, and natural syntax for the function's own "return myname;" statement.

3. It probably has zero overhead: On modern compilers, the proxy class, with its storage and functions, should inline and optimize away to nothing.

Coda

Of course, normally a special-purpose FuncPtr_ proxy class like this (that contains some old object and doesn't really care much about its type) just cries out to be templatized into a general-purpose Holder proxy. Alas, you can't just templatize the FuncPtr_ class above, because then the typedef would have to look something like:

  typedef Holder<FuncPtr> (*FuncPtr)();

which is self-referential.

Copyright 2009 Herb Sutter