Boost.Preprocessors Sequences

July 3, 2010 2 Comments by Georg Fritzsche

After recently answering a question on Stackoverflow using Boost.Preprocessor, i wondered how the sequences, e.g. SUM_MACRO((1)(2)(3)), actually work.

Implementing Macros operating on sequences

Sequences give you a convenient array-like data-structure wich can be passed to macros:

#define HEAD(sequence) \
    BOOST_PP_SEQ_ELEM(0, sequence)

std::cout << HEAD((0)(1)(2)); // prints "0"

Now how can that be implemented? First there are two characters that have special status for the preprocessor: commas and parentheses.
Parentheses can be used to

  • wrap expressions to be left alone by macro expansion, e.g. because they contain commas
  • can be bound to macro names as parameter lists

The binding of parameter lists doesn't have to happen immediately, so the following works fine:

#define IDENTITY(x) x
#define HEAD(sequence) IDENTITY sequence
std::cout << HEAD((42)); // prints "42"

The macro-expansion here is the following:

HEAD((42))
IDENTITY (42)
42

That already works fine for a one-element sequence, but what if it had more elements:

std::cout << HEAD((42)(23));
// results in:
std::cout << 42(23);

... which is obviously wrong. We need to get rid of the rest of the sequence by going through another macro:

#define FIRST(x) x, DOESNT_EXIST
#define HEAD(sequence) FIRST sequence
std::cout << HEAD((42)(23));

This now results in:

HEAD((42)(23))
FIRST (42)(23) // FIRST binds (42)
42, DOESNT_EXIST(23)

Passing this result through another macro that takes two arguments, we can ignore the second part:

#define HEAD(sequence) IGNORE_SECOND(FIRST sequence)
#define FIRST(x) x, FOO
#define IGNORE_SECOND(x) IGNORE_SECOND_I(x) // needed for expansion of x
#define IGNORE_SECOND_I(x, _) x

Which now expands to:

HEAD((42)(23))
IGNORE_SECOND(FIRST (42)(23)) // FIRST binds (42)
IGNORE_SECOND_I(42, DOESNT_EXIST(23)) 
42

Getting elements by index

Now the macros we pass the comma-separated expansion to don't have to throw away the second part:

#define SECOND(sequence) IGNORE_SECOND(TAIL sequence)
#define TAIL(x) FIRST // throw away first element

... with which we get something like the following expansion:

SECOND((1)(2)(3))
IGNORE_SECOND(TAIL (1)(2)(3))
IGNORE_SECOND_I(FIRST (2)(3))
IGNORE_SECOND_I(2, DOESNT_EXIST(3)) 
2

Now combining that with the concatenation operator ## we can define macros that operate index-based:

#define GET_BY_INDEX(index, sequence) IGNORE_SECOND( GET_##index sequence )
#define GET_0(x) x, DOESNT_EXIST
#define GET_1(_) GET_0
#define GET_2(_) GET_1
// ...

... which could lead to the following expansion:

GET_BY_INDEX(2, (1)(2)(3)(4))
IGNORE_SECOND(GET_2 (1)(2)(3)(4))
IGNORE_SECOND_I(GET_1 (2)(3)(4))
IGNORE_SECOND_I(GET_0 (3)(4))
IGNORE_SECOND_I(3, DOESNT_EXIST(4)) 
3

Counting elements

To get the size of a sequence the elements in it need to be counted. The trick here is that substitution for macros that take arguments stops if there are no parameter-lists left that could be bound, so the following:

#define SIZE(sequence) SIZE_0 sequence
#define SIZE_0(_) SIZE_1
#define SIZE_1(_) SIZE_2
#define SIZE_2(_) SIZE_3
// ...

would expand SIZE((1)(2)) to SIZE_2. All that is left then is to map that name to a number:

#define SIZE(sequence) \
    CONCAT(SIZE_, SIZE_0 sequence) // concatenate SIZE_ with SIZE_N
#define CONCAT(a, b) CONCAT_I(a, b) // allow for macro expansion
#define CONCAT_I(a, b) a ## b
#define SIZE_SIZE_0 0 
#define SIZE_SIZE_1 1
#define SIZE_SIZE_2 2
// ...

... which would lead to expansions like:

SIZE((1)(2))
CONCAT(SIZE_, SIZE_0 (1)(2))
CONCAT_I(SIZE_, SIZE_0 (1)(2))
CONCAT_I(SIZE_, SIZE_1 (2))
CONCAT_I(SIZE_, SIZE_2)
SIZE_ ## SIZE_2
SIZE_SIZE_2
2

- Georg