Why You Should Give Your Preprocessor a Break

Published in Geek Culture on Aug 16, 2021

Disclaimer: These items are originally adapted from Effective C++ by Scott Meyers. This is the first post in a series of posts where I review and summarize the 55 items discussed by Meyers. The purpose of these posts is to serve as a personal review of the concepts discussed for myself, and also to provide a condensed version with additional commentary for the reader.

What Are the Compiler and Preprocessor?

Executing a program written in a high-level language like C++ requires that the program be translated into the 1’s and 0’s that your CPU can understand, known as machine language. C++ and other high-level languages save us from having to write tedious assembly or raw binary. Computers, at their lowest levels, are electronic switches, wires, and logic gates, so your C++ source must be transformed before it can run. Compilers take your program and turn it into machine code files that your system can execute.

The preprocessor goes to work first when a compiler is run on your code. In C++, the preprocessor looks for directives, which are lines that begin with a ‘#’. Preprocessor directives are instructions that are carried out on the source text before compilation proper starts. The preprocessor is most commonly used to include libraries and header files (#include), but it is also used for defining constants and function-like macros, conditional compilation, and more.
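To make the directive types above concrete, here is a minimal sketch. The macro names, the helper functions, and the VERBOSE_BUILD flag are all hypothetical and exist only for illustration.

```cpp
#include <cassert>   // #include: paste the contents of the <cassert> header here
#include <string>

#define GREETING "hello"          // object-like macro: plain text substitution
#define SQUARE(x) ((x) * (x))     // function-like macro

// Conditional compilation: only one of these definitions reaches the compiler.
// VERBOSE_BUILD is a hypothetical flag, e.g. passed as -DVERBOSE_BUILD.
#ifdef VERBOSE_BUILD
inline std::string buildMode() { return "verbose"; }
#else
inline std::string buildMode() { return "quiet"; }
#endif

inline std::string greeting() { return GREETING; }   // compiler sees: return "hello";
inline int squareOfThree() { return SQUARE(3); }     // compiler sees: return ((3) * (3));

// Small self-check that can be called from main().
inline void checkDirectives() {
    assert(greeting() == "hello");
    assert(squareOfThree() == 9);
}
```

By the time the compiler proper runs, GREETING and SQUARE no longer exist as names. Only their replacement text remains.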

Preprocessor Issues

Meyers opens his list of preprocessor issues with this example:

#define ASPECT_RATIO 1.653

The preprocessor may replace every use of ASPECT_RATIO with 1.653 and remove the name itself before the source code ever reaches the compiler. According to Meyers, “this can be confusing if you get an error during compilation involving the use of the constant, because the error message may refer to 1.653, not ASPECT_RATIO.” The issue stems from the symbolic name never being entered into the symbol table, because the compiler never sees it. The preferred solution is to replace the macro with a constant, such as

const double AspectRatio = 1.653;

Using a language constant instead of a symbolic constant ensures that AspectRatio is seen by compilers and placed in the symbol table. Another possible benefit, for a floating-point constant like this one, is smaller code. The preprocessor blindly substitutes 1.653 for every use of ASPECT_RATIO, which could leave multiple copies of 1.653 in your object code. The language constant should never result in more than one copy.
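One way to see the difference is that the constant is a real, typed object with a name and an address, while the macro is only text. The following is a minimal sketch. The helper functions are hypothetical and not from Meyers.

```cpp
#include <cassert>

#define ASPECT_RATIO 1.653           // macro: gone before the compiler proper runs

const double AspectRatio = 1.653;    // typed constant the compiler knows by name

// The constant is a real object, so asking for its address is meaningful.
// (&ASPECT_RATIO would expand to &1.653, which does not compile.)
inline const double* addressOfAspectRatio() { return &AspectRatio; }

// Hypothetical helper for illustration: width for a given height.
inline double widthFor(double height) { return height * AspectRatio; }

// Small self-check that can be called from main().
inline void checkAspectRatio() {
    assert(*addressOfAspectRatio() == 1.653);
    assert(widthFor(1.0) == AspectRatio);
}
```

If widthFor were misused, a diagnostic could mention AspectRatio by name, which is exactly what the macro version cannot offer.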

Special Cases

There are two special cases that Meyers lists when replacing #defines with constants. The first involves constant pointers. Because constant definitions are typically put in header files, where many different source files will include them, it is important to declare the pointer itself const in addition to the data being pointed to. Here is an example of defining a constant char*-based string in a header file.

const char * const authorName = "Masashi Kishimoto";

The point of the code snippet is to highlight that when declaring both a pointer and the item it points to const, the keyword must be used twice. Meyers notes that it’s preferable to use string objects over char*-based ones.

const std::string authorName("Masashi Kishimoto");

The second special case involves class-specific constants. Making the const a member of the class limits its scope to the class only. Making it static also ensures that there is at most one copy of the const.

class GamePlayer{
private:
    static const int NumTurns = 5;
    int scores[NumTurns];
};

Class-specific constants that are both static and of integral type are an exception to the rule that everything must be defined before it is used. The example above is a declaration of NumTurns rather than a definition. According to Meyers, “as long as you don’t take their address, you can declare them and use them without providing a definition.” A separate definition would be required if you were to take the address, such as:

const int GamePlayer::NumTurns;

How can this be a definition if no value is given? Because the value was already supplied at the declaration: “The initial value of class constants is provided where the constant is declared (e.g., NumTurns is initialized to 5 when it is declared),” so no initial value is permitted at the point of definition. This snippet would go in an implementation file rather than a header file.
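Putting the declaration and the definition side by side, a minimal sketch could look like this. NumTurns is made public here only so the sketch can be exercised from outside the class, and addressOfNumTurns is a hypothetical helper.

```cpp
#include <cassert>

class GamePlayer {
public:
    static const int NumTurns = 5;   // declaration, with the in-class initial value
    int scores[NumTurns];            // usable as a compile-time array size
};

// Definition: in a real project this line lives in exactly one .cpp file.
// There is no initializer, because the value was given at the declaration.
const int GamePlayer::NumTurns;

// Taking the address is what makes the definition above necessary.
inline const int* addressOfNumTurns() { return &GamePlayer::NumTurns; }

// Small self-check that can be called from main().
inline void checkNumTurns() {
    assert(*addressOfNumTurns() == 5);
}
```

Without the out-of-class definition, a program that calls addressOfNumTurns may fail to link, since nothing would provide storage for NumTurns.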

It’s important to note that #defines do not respect scope, so there is no way to create a class-specific constant using the #define directive. “Once a macro is defined, it’s in force for the rest of the compilation.” The lack of respect for scope also means that macros cannot provide any kind of encapsulation.
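The scope problem can be sketched as follows. Widget, MAX_ITEMS, and MaxItems are hypothetical names used only for illustration.

```cpp
#include <cassert>

class Widget {
public:
    #define MAX_ITEMS 10               // looks class-local, but macros ignore scope
    static const int MaxItems = 10;    // a real class-scoped constant
};

// Outside the class, the macro is still visible with no qualification...
inline int macroOutsideClass() { return MAX_ITEMS; }

// ...while the constant must be reached through the class name and
// would be inaccessible here if it were declared private.
inline int constantOutsideClass() { return Widget::MaxItems; }

// Small self-check that can be called from main().
inline void checkScope() {
    assert(macroOutsideClass() == 10);
    assert(constantOutsideClass() == 10);
}
```

The macro stays in force until the end of the translation unit unless it is explicitly removed with #undef.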

Meyers goes on to note that “in-class initialization is allowed only for integral types and only for constants. In cases where the above syntax can’t be used, you put the initial value at the point of definition:”

class CostEstimate{
    private:
        static const double FudgeFactor; //declaration goes in header
};

const double CostEstimate::FudgeFactor = 1.35; //definition goes in implementation file
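For completeness, here is a small sketch of the same pattern in use. The padded member function is a hypothetical addition so that the private constant can be observed from outside the class.

```cpp
#include <cassert>

class CostEstimate {
public:
    // Hypothetical member for illustration: pad a raw estimate.
    static double padded(double raw) { return raw * FudgeFactor; }
private:
    static const double FudgeFactor;   // declaration only (header file)
};

// Definition with the initial value (implementation file, exactly once).
const double CostEstimate::FudgeFactor = 1.35;

// Small self-check that can be called from main().
inline void checkFudgeFactor() {
    assert(CostEstimate::padded(1.0) == 1.35);
}
```

Note that the constant stays private: callers can use padded but cannot name FudgeFactor, which is the encapsulation a #define cannot give you.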

The Enum Hack

Most of the time, the above examples will suffice. However, an exception may arise when you need the value of a class constant during compilation of the class, for example to declare an array whose size is that constant. The enum hack is a way to compensate for compilers that forbid in-class specification of initial values for static integral class constants. This clever approach relies on the fact that the values of an enumerated type can be used where ints are expected.

class GamePlayer{
    private:
        enum {NumTurns = 5}; //NumTurns is symbolic name for 5
        int scores[NumTurns];
};

Meyers notes that “the enum hack behaves in some ways more like a #define than a const does, and sometimes that’s what you want.” It is legal to take the address of a const, but it is not legal to take the address of an enum, and it is typically not legal to take the address of a #define either. Using the enum hack therefore prevents people from getting a pointer or reference to your integral constants. Like #defines, enums also never cause the unnecessary memory allocation that a sloppy compiler may perform for a const object. Meyers concludes his discussion of the enum hack by noting that it is a fundamental technique of template metaprogramming, so it pays to be able to recognize it when you see it.
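Both points can be sketched briefly. The compile-time factorial below is a common textbook illustration of the enum hack in template metaprogramming, shown here as an assumption about what such code typically looks like rather than as something from this item.

```cpp
#include <cassert>

class GamePlayer {
public:
    enum { NumTurns = 5 };      // enum hack: a named compile-time integer
    int scores[NumTurns];
};

// const int* p = &GamePlayer::NumTurns;  // would not compile: enumerators have no address

// The same trick in template metaprogramming: a compile-time factorial.
template <unsigned N>
struct Factorial {
    enum { value = N * Factorial<N - 1>::value };
};

template <>
struct Factorial<0> {           // base case ends the recursion
    enum { value = 1 };
};

// Small self-check that can be called from main().
inline void checkEnumHack() {
    assert(Factorial<5>::value == 120);
}
```

Every value here is computed by the compiler, so no storage is set aside for NumTurns or for any Factorial<N>::value.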

Back to the Preprocessor

Implementing macros that look like functions is another common misuse of #define.

//call f with the maximum of a and b
#define CALL_WITH_MAX(a,b) f((a) > (b) ? (a) : (b))

Drawbacks include having to parenthesize all arguments in the macro body to avoid unexpected behavior. Meyers notes that even if that paradigm is followed, “weird things can happen.”

int a = 5, b = 0;
CALL_WITH_MAX(++a, b); //a is incremented twice
CALL_WITH_MAX(++a, b+10); //a is incremented once
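The surprise comes from the macro pasting its argument text into the body twice, so ++a can run once in the comparison and again when the result is selected. Here is a minimal sketch that makes this observable. The function f and the lastSeen variable are hypothetical stand-ins, since the original f is not shown.

```cpp
#include <cassert>

int lastSeen = 0;                        // hypothetical: records what f received
inline void f(int x) { lastSeen = x; }   // hypothetical stand-in for f

#define CALL_WITH_MAX(a, b) f((a) > (b) ? (a) : (b))

// Returns the value of a after one macro call in which ++a is the larger argument.
inline int incrementsWhenLarger() {
    int a = 5, b = 0;
    CALL_WITH_MAX(++a, b);        // expands to f((++a) > (b) ? (++a) : (b))
    return a;                     // ++a ran in the comparison and in the result
}

// Returns the value of a after one macro call in which ++a is the smaller argument.
inline int incrementsWhenSmaller() {
    int a = 5, b = 0;
    CALL_WITH_MAX(++a, b + 10);   // b + 10 is selected, so ++a runs only once
    return a;
}
```

Starting from a = 5 in each helper, the first returns 7 and the second returns 6. The number of increments depends on what a is being compared with.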

You can get the efficiency of a macro with the safety of a function by utilizing a template for an inline function.

template<typename T>
inline void callWithMax(const T& a, const T& b){
    f(a > b ? a : b);
}

In the above example, we pass by reference-to-const because we don’t know what T is. Because this function is a real function, scope and access rules are respected. There would simply be no way to implement a class-specific inline function with a macro.
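Here is a short sketch of the template version applied to the same kind of call. As before, f and lastSeen are hypothetical stand-ins for a function the original does not show.

```cpp
#include <cassert>

int lastSeen = 0;                        // hypothetical: records what f received
inline void f(int x) { lastSeen = x; }   // hypothetical stand-in for f

template <typename T>
inline void callWithMax(const T& a, const T& b) {
    f(a > b ? a : b);
}

// Returns the value of a after one call in which ++a is the larger argument.
inline int incrementsWithTemplate() {
    int a = 5, b = 0;
    callWithMax(++a, b);   // ++a is evaluated exactly once, before the call
    return a;
}
```

Because the arguments are evaluated once and then bound to the reference parameters, a ends up as 6 no matter which argument is larger, and there is no need to parenthesize anything inside the function body.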

Conclusion

In addition to being generally safer and more efficient, the methods above reduce the errors introduced by the preprocessor and save you time debugging. However, there are situations where using the preprocessor is unavoidable (#include, conditional compilation, etc.). Meyers lists two things to remember:

“For simple constants, prefer const objects or enums to #defines.”
“For function-like macros, prefer inline functions to #defines.”