If there is one thing I don't like, that's whining.
The second thing I don't like is quite naturally... whiners.
And guess what? Macros in C/C++ cause a LOT of whining.
I can find on the web a myriad of articles on why NOT to use C macros, on why C macros are evil/stupid/useless/dangerous, you name it, repeating the same old stuff and defying common sense.
But people still use macros. And I see in the stats of my blog that my previous post "Tips on writing C macros" is persistently popular. Clearly a lot of people are still writing C macros.
Are these people reckless or plain idiots? Of course not! There is great value in macros, real-world value not theoretical mumbo jumbo.
So I decided it is high time that I wrote an article describing some good real-world uses for C macros and at the same time debunking some of the anti-macro myths, or at least putting them in the proper perspective.
There are countless macro-related articles on the web, so I will try not to repeat too many things here :-)
#define POST_SIZE ULONGLONG // if you get my point ;-)
Let's start with the simplest macros of all, some lowly #defines:
#define DEB_CSR_ADR(ofs) (0xF0001020 + (ofs)*4)
#define WM_DEVICE_CHANGE (WM_USER+301)
This kind of usage doesn't really offer us much.
You can write a function for the first one, which will probably get inlined by the compiler, and you can declare a const for the second one, or even better an enum, if that's a 'family' of symbolic names that it belongs to.
The const declarations will normally not take up any space in your final program (unless you take the address of a constant), but if you have a lot of constants under the same 'family', like in the case of Windows WM_xxx constants, then better use an enum instead.
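For the record, here is roughly what the non-macro versions might look like. This is only a sketch: the lowercase function name and the MY_ prefixed enum names are made up for illustration, and MY_WM_USER is assumed to carry the value 0x0400 that Windows gives WM_USER.

```c
#include <stdint.h>

/* Function version of DEB_CSR_ADR: the argument is type-checked and
   evaluated exactly once. The lowercase name is hypothetical. */
static inline uint32_t deb_csr_adr(uint32_t ofs)
{
    return 0xF0001020u + ofs * 4u;
}

/* Enum version of a 'family' of message IDs. The MY_ names are
   hypothetical; MY_WM_USER stands in for the Windows WM_USER (0x0400). */
enum { MY_WM_USER = 0x0400 };

enum MyMessages
{
    MY_WM_DEVICE_CHANGE = MY_WM_USER + 301
};
```

Both give the same numbers as the macros, so deb_csr_adr(3) is 0xF000102C, but the names now obey the normal scoping rules of the language.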
But remember that these consts and enums will go into your compiler's symbol tables during compilation, regardless of whether you actually use these names or not, and that might slow down the compilation of large programs, especially if you have thousands of consts.
On the other hand the preprocessor has to build a 'symbol table' itself, but that's much simpler and probably much smaller and short-lived, used only during the preprocessor phase and then discarded.
So the macros above don't buy us that much, but on the other hand do we lose anything?
Some whiner might say "Oh but the 1st macro is not type safe; a function would be type safe!".
Dude... Are you the master of the obvious or what?
But hold on a second... If you accidentally used this macro incorrectly, in a way that the compiler still accepts, for example by passing as 'ofs' a C++ object with an overloaded operator*, and you didn't really mean to do that, it was just an 'accident', then I am sorry but you should get a job in insurance sales; they deal with stupid accidents all the time.
Just to make a little play with words... My keyboard, any keyboard, is inherently not "type safe" :-P If you accidentally 'type' the wrong key, then it will not output your intended character.
When was the last time you heard anyone complain about that?
Don't get me wrong here. Type safety is an important argument against macros, or to put it rightly against some macros.
If you try to write a MAX7(a,b,c,d,e,f,g) "macro" then I would say DON'T because it is not type-safe.
But you cannot indiscriminately use the 'type safety' argument against any and every macro. Use a little common sense instead.
A more subtle problem with the use of #defines as shown above is '#define namespace pollution'.
#defines are a preprocessor thing, so they don't obey C++ namespace rules, so essentially all #defines that you include end up in the same 'namespace' in the preprocessor.
Just for the sake of argument, given the above macros, suppose someone tries to write a function called WM_DEVICE_CHANGE.
Ooops... the preprocessor will replace this and then you will get one of those compilation errors (or an avalanche of compilation errors) that give you a nice blank stare at your screen while you are wondering what hit you.
Even worse, if you use two independent SDKs and they happen to #define the same name (#define SPEED 1000000), then that's not funny at all. C/C++ SDK providers should go to great lengths to avoid causing this kind of trouble to their customers.
But once again, use common sense:
- Don't include everything everywhere.
- Use 'sensibly unique' names for your macros. Avoid common names like VALIDATE, TEST, RETURN, and the like.
- It might help if all your macros follow some naming convention, for example have a small prefix ;-)
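Here is a minimal sketch of that advice. The two SDKs, their SDKA_ and SDKB_ prefixes, the value 9600 and the DS_SQUARE helper are all hypothetical; the point is only that prefixed names cannot collide, and that a macro needed in just one place can be removed with #undef as soon as you are done with it.

```c
/* Two hypothetical SDK headers, each using its own prefix,
   so both "SPEED" constants can live side by side. */
#define SDKA_SPEED 1000000
#define SDKB_SPEED 9600

/* A helper macro that is only needed locally: define it, use it,
   then #undef it so it does not leak into the rest of the program. */
#define DS_SQUARE(x) ((x) * (x))

static int area(int side)
{
    return DS_SQUARE(side);
}

#undef DS_SQUARE
```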
Let us now examine this byte-swap macro:
#define bswap(value) \
    (((ULONG) (value)) << 24 | \
    (((ULONG) (value)) & 0x0000FF00) << 8 | \
    (((ULONG) (value)) & 0x00FF0000) >> 8 | \
    ((ULONG) (value)) >> 24)
This is very much like the first macro we examined. You don't gain much by writing it this way instead of as an inline function, which would give us type safety. But type safety is not a strong argument in this case either, so please don't whine about it.
What would be a more serious problem with this macro?
Argument substitution.
Consider this invocation: bswap(GetHighResTickCount());
Uh-oh... That's trouble. You will get 4 invocations of the function that returns the high resolution timer, and if the return values happen to be radically different, for example because some wrapping takes place, then your result might be real garbage. Until you get that kind of 'luck' you might think all is well.
Similar things will happen if the macro argument has any side effects.
Last but not least, even if the function does not have side effects, 4 invocations are a performance degradation.
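To see the multiple evaluation with your own eyes, here is a small sketch. FakeTickCount() is a hypothetical stand-in for GetHighResTickCount() that simply counts how often it gets called, bswap_fn() is an assumed inline-function twin of the macro, and ULONG is assumed to be a 32-bit unsigned type.

```c
#include <stdint.h>

typedef uint32_t ULONG;   /* assumption: ULONG is 32 bits wide */

#define bswap(value) \
    (((ULONG) (value)) << 24 | \
    (((ULONG) (value)) & 0x0000FF00) << 8 | \
    (((ULONG) (value)) & 0x00FF0000) >> 8 | \
    ((ULONG) (value)) >> 24)

/* Function twin of the macro: its argument is evaluated exactly once. */
static inline ULONG bswap_fn(ULONG value)
{
    return bswap(value);
}

static unsigned g_calls = 0;   /* how many times the fake timer was read */

/* Hypothetical stand-in for GetHighResTickCount(): returns a fixed
   value and counts its own invocations. */
static ULONG FakeTickCount(void)
{
    ++g_calls;
    return 0x11223344u;
}
```

Starting from g_calls == 0, bswap(FakeTickCount()) leaves the counter at 4, while bswap_fn(FakeTickCount()) bumps it only once. Both return the same swapped value here only because the fake timer always returns the same number; a real timer would not be that kind.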
But once again, macro argument substitution is not rocket science, use common sense and don't whine about it.
The next point I have made in my previous post as well, but I think the 'argument' begs for a little 'expansion' :-)
A more subtle performance problem with the above macro would be this invocation: bswap(myArray[x]);
That's quite innocent looking, so why is it a problem?
Well, by definition myArray[x] == *(myArray+x), and that's the code the compiler generates.
This means a pointer addition AND a pointer dereference, times four.
Unless the optimizer can prove that neither x nor myArray[x] changes between the four uses, the address calculation and the value fetch are done 4 times.
You might think that's not a big deal really, but at some point, while optimizing some video image manipulation code, I found a call similar to this:
SOME_MACRO(image[x+1][y], image[x][y+1], image[x][y]);
I changed this to:
const ULONG & arg1 = image[x+1][y];
const ULONG & arg2 = image[x][y+1];
const ULONG & arg3 = image[x][y];
SOME_MACRO(arg1, arg2, arg3);
The result???
A 10% improvement in performance, simply by optimizing out the repeated array access operations.
Plus I let the compiler know the thing is a constant value, so the compiler can do further optimizations at will. Once again, that is not rocket science, but 101 stuff.
Now let's see what I consider a super cool usage for macros. How often did you have a list of error codes and also had to write a 'stringizing' function for them, one that gets the error code as argument and returns a string? That's pretty useful in debugging APIs, so I have done it several times myself.
Initially the bare bones code looks like this:
// MyErrors.h
#define ERROR_CODE_A 0x123456
...
#define ERROR_CODE_Z 0x987654
// MyErrorString.cpp
const char * MyErrorString(ULONG a_ErrorCode)
{
    switch (a_ErrorCode)
    {
        case ERROR_CODE_A: return "ERROR_CODE_A";
        ...
        case ERROR_CODE_Z: return "ERROR_CODE_Z";
        default: return "UNKNOWN_ERROR"; // never fall off the end without returning
    }
}
Now if we apply some of the rules we mentioned earlier and a little macro magic, we could slightly improve this as shown:
// MyErrors.h
enum MyErrors
{
    ERROR_CODE_A = 0x123456,
    ...
    ERROR_CODE_Z = 0x987654
};
// MyErrorString.cpp
#define MY_ERROR_CASE(a) case a: return #a
const char * MyErrorString(enum MyErrors a_ErrorCode)
{
    switch (a_ErrorCode)
    {
        MY_ERROR_CASE(ERROR_CODE_A);
        ...
        MY_ERROR_CASE(ERROR_CODE_Z);
        default: return "UNKNOWN_ERROR"; // code not in the list
    }
}
That may look fancier, but it is basically the same thing. What is the problem here? The problem is that every time you update the error list and add new constants, you have to remember to update the string conversion function. Did I mention that I don't like remembering things?
So I came up with this solution, which I am sure will #redefine your macro programming career :-P Well actually many people have done similar stuff one way or the other, so don't blame me for #redefining you!
Here goes:
// ErrorList.h
ERROR_DEF(ERROR_CODE_A, 0x123456)
...
ERROR_DEF(ERROR_CODE_Z, 0x987654)
// MyErrors.h
#define ERROR_DEF(a,b) a = b,
enum MyErrors
{
    #include "ErrorList.h"
};
#undef ERROR_DEF
// MyErrorString.cpp
#define MY_ERROR_CASE(a) case a: return #a
#define ERROR_DEF(a,b) MY_ERROR_CASE(a);
const char * MyErrorString(enum MyErrors a_ErrorCode)
{
    switch (a_ErrorCode)
    {
        #include "ErrorList.h"
        default: return "UNKNOWN_ERROR"; // code not in the list
    }
}
#undef ERROR_DEF
So ErrorList.h is a file that contains the error names and values but does not compile by itself. Instead it gets included by other files according to the way they intend to use the error list. Update ErrorList.h, even deleting some values, and the only thing you need to do is recompile! Cool huh? :-)
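If you prefer to keep everything in one file, a common variant of the same trick passes the list around as a macro instead of as an #include. The sketch below is that variant, not the two-file layout described above; MY_ERROR_LIST, ERROR_DEF_ENUM and ERROR_DEF_CASE are names invented for this example, and only the two error codes spelled out in the article are listed.

```c
/* The single list of errors. X is whatever macro the user of the
   list wants applied to every (name, value) pair. */
#define MY_ERROR_LIST(X) \
    X(ERROR_CODE_A, 0x123456) \
    X(ERROR_CODE_Z, 0x987654)

/* First use: generate the enum. The trailing comma after the last
   entry is accepted by C99 and C++11. */
#define ERROR_DEF_ENUM(a, b) a = b,
enum MyErrors
{
    MY_ERROR_LIST(ERROR_DEF_ENUM)
};
#undef ERROR_DEF_ENUM

/* Second use: generate one 'case' per entry, stringizing the name. */
#define ERROR_DEF_CASE(a, b) case a: return #a;
static const char * MyErrorString(enum MyErrors a_ErrorCode)
{
    switch (a_ErrorCode)
    {
        MY_ERROR_LIST(ERROR_DEF_CASE)
        default: return "UNKNOWN_ERROR"; /* code not in the list */
    }
}
#undef ERROR_DEF_CASE
```

With this in place MyErrorString(ERROR_CODE_A) returns the string "ERROR_CODE_A", and adding a new error is still a one-line change to the list.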
Have fun!
Dimitrios Staikos