A Trick for Reflection in C++

2021-08-23

Yesterday I ran into the following problem. I want certain C++ struct definitions in my code to be reflectively inspectable. For example, if I define a struct S with two int fields a and b, other parts of my program should be able to find out that S contains those two fields, with those types, and act on that information.

Trivially, one can solve this problem by maintaining two pieces of code: the original definition and a map like { 'a': 'int', 'b': 'int' }. But then the two pieces must be kept in sync by hand, which is exactly what I want to avoid.
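To make the drawback concrete, here is a minimal sketch of the hand-maintained approach. The names FieldDescription, kFieldsOfS and kNumFieldsOfS are made up for illustration. Nothing ties the table to the struct, so adding a field c to S without touching the table still compiles and silently produces stale metadata.

```cpp
#include <cstddef>

// The "real" definition.
struct S {
    int a;
    int b;
};

// Hypothetical hand-written description of S. The compiler never checks it
// against the struct above, so the two can drift apart unnoticed.
struct FieldDescription {
    const char* name;
    const char* type;
};

constexpr FieldDescription kFieldsOfS[] = {
    {"a", "int"},
    {"b", "int"},
};

constexpr std::size_t kNumFieldsOfS = sizeof(kFieldsOfS) / sizeof(kFieldsOfS[0]);

static_assert(kNumFieldsOfS == 2, "the table lists two fields");
```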

This capability is known as reflection. Unfortunately, the C++ standard has no native support for it. There are proposals to add it, but none of the major compilers seems to have implemented them yet.

The problem can also be solved via a huge hack called “the C++ Type Loophole”. However, it’s unclear why the hack works, and it’s so hacky that even the C++ standards committee has decided that it should be prohibited. So I’m not brave enough to use it.

I eventually reached a less hacky (but of course, less powerful) solution. Since there is no native support for reflection, something has to be instrumented into the definitions of the structs. So in my solution, to make a struct reflective, one must use a special macro FIELD(type, name) to define its fields, which lets us automatically add some instrumentation alongside each field. An example is shown below.

struct S {
    BEGIN_FIELDS_LIST();
    FIELD(int, a);
    FIELD(double, b);
    END_FIELDS_LIST();
};

My trick is based on the __COUNTER__ macro, a non-standard preprocessor extension supported by GCC, Clang, and MSVC. Each time __COUNTER__ is encountered, it is replaced by the current value of an internal counter maintained by the compiler, and the counter is then incremented. So each occurrence of __COUNTER__ is replaced by a unique integer that increases monotonically through the program text.

Note that __COUNTER__ is replaced on the spot, so every occurrence yields a different value. For example, a ## __COUNTER__ = a ## __COUNTER__ + 1 will not increment a variable, since it is going to be expanded to something like a1 = a2 + 1. So the common usage pattern is to pass __COUNTER__ to another macro as a parameter, as shown below:

#define MY_MACRO_IMPL(a, b, counter) ... my impl ...
#define MY_MACRO(a, b) MY_MACRO_IMPL(a, b, __COUNTER__)

This way, the MY_MACRO_IMPL macro can use its counter parameter as a unique integer, as many times as it needs.
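As a concrete instance of this pattern, here is a small sketch. The macro names CONCAT and DECLARE_TAGGED are hypothetical and exist only for this illustration. The IMPL macro uses its counter parameter twice, and both uses see the same value because the argument is expanded once before substitution.

```cpp
// Token pasting needs one extra macro level so that arguments are expanded
// before ## is applied. CONCAT and DECLARE_TAGGED are hypothetical names.
#define CONCAT_IMPL(a, b) a##b
#define CONCAT(a, b) CONCAT_IMPL(a, b)

// The IMPL macro receives one already-expanded counter value and can use it
// as often as it likes: here twice, for two constants that must agree.
#define DECLARE_TAGGED_IMPL(name, counter)      \
    constexpr int CONCAT(name, _id) = counter;  \
    constexpr int CONCAT(name, _id_copy) = counter
#define DECLARE_TAGGED(name) DECLARE_TAGGED_IMPL(name, __COUNTER__)

DECLARE_TAGGED(first);
DECLARE_TAGGED(second);

static_assert(first_id == first_id_copy, "one expansion, one value");
static_assert(second_id == second_id_copy, "one expansion, one value");
static_assert(second_id > first_id, "later expansions get larger values");
```

The asserts deliberately avoid checking concrete values, since the starting value depends on how often __COUNTER__ was used earlier in the translation unit.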

The core of the trick is to use __COUNTER__ to specialize templates. As a simple example, suppose we want to define structs whose number of fields can be retrieved reflectively. Then BEGIN_FIELDS_LIST can expand to the following code:

template<int o> struct __internal : __internal<o - 1> { };
template<> struct __internal<counter> {
  constexpr static size_t N = 0;
};

And each FIELD macro will expand to the normal definition, as well as the following code:

template<> struct __internal<counter> {
  constexpr static size_t N = __internal<counter - 1>::N + 1;
};

And the END_FIELDS_LIST macro will expand to the following code:

constexpr static size_t __numFields = __internal<counter>::N;

To summarize, the idea is the following.

BEGIN_FIELDS_LIST defines the general case of a template parameterized by an integer o. The general definition simply inherits whatever information was computed by the template for o-1. It also defines the boundary condition of the recursion (in our example, since we want to count the fields, N = 0).
Each FIELD definition specializes __internal<__COUNTER__> and computes its information by merging the result for counter-1 with its own contribution (in our example, N = __internal<counter-1>::N + 1).
END_FIELDS_LIST retrieves the aggregated result from __internal<__COUNTER__>. Its fresh counter value has no specialization, so it falls through to the general case and inherits from the last FIELD.
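Putting the three macros together, a minimal compilable sketch might look as follows. It deviates from the snippets above in two ways, both of them my assumptions rather than part of the original design: the helper is called FieldInfo and the result kNumFields (identifiers containing a double underscore are reserved), and every specialization carries an extra Dummy type parameter so that it is a partial specialization, which compilers accept at class scope more uniformly than template<>.

```cpp
#include <cstddef>

// Hypothetical names throughout: FieldInfo, Dummy and kNumFields are
// illustrative stand-ins, not the names used in the snippets above.

// General case (inherits from counter - 1) plus the boundary case N = 0.
#define BEGIN_FIELDS_LIST() BEGIN_FIELDS_LIST_IMPL(__COUNTER__)
#define BEGIN_FIELDS_LIST_IMPL(counter)                                        \
    template <int O, typename Dummy = void>                                    \
    struct FieldInfo : FieldInfo<O - 1, Dummy> {};                             \
    template <typename Dummy>                                                  \
    struct FieldInfo<counter, Dummy> {                                         \
        static constexpr std::size_t N = 0;                                    \
    }

// The ordinary member declaration, then a specialization that adds one.
#define FIELD(type, name) FIELD_IMPL(type, name, __COUNTER__)
#define FIELD_IMPL(type, name, counter)                                        \
    type name;                                                                 \
    template <typename Dummy>                                                  \
    struct FieldInfo<counter, Dummy> {                                         \
        static constexpr std::size_t N = FieldInfo<counter - 1, Dummy>::N + 1; \
    }

// A fresh counter value hits the general case and inherits the last N.
#define END_FIELDS_LIST() END_FIELDS_LIST_IMPL(__COUNTER__)
#define END_FIELDS_LIST_IMPL(counter)                                          \
    static constexpr std::size_t kNumFields = FieldInfo<counter>::N

struct S {
    BEGIN_FIELDS_LIST();
    FIELD(int, a);
    FIELD(double, b);
    END_FIELDS_LIST();
};

// A second struct starts at whatever value the counter has reached by now.
struct T {
    BEGIN_FIELDS_LIST();
    FIELD(char, x);
    FIELD(long, y);
    FIELD(float, z);
    END_FIELDS_LIST();
};

static_assert(S::kNumFields == 2, "S declares two fields");
static_assert(T::kNumFields == 3, "T declares three fields");
```

Each struct gets its own nested FieldInfo template, so the counts of S and T do not interfere even though they share one global counter.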

As one can see, the correctness of the above approach relies only on each counter being replaced by a monotonically increasing integer. Neither the starting value nor any skipped integers in the sequence affect correctness. This matches exactly the semantics of the __COUNTER__ macro. So are we good?
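To see why gaps and the starting value do not matter, here is a hand-expanded sketch with no macros at all. The counter values 10, 13, 17 and 20 are arbitrary stand-ins for what __COUNTER__ might produce when other code also consumes it, and the names Gappy, FieldInfo and kNumFields are hypothetical. The Dummy parameter is only there to keep the specializations partial.

```cpp
#include <cstddef>

struct Gappy {
    // General case: inherit from the previous counter value.
    template <int O, typename Dummy = void>
    struct FieldInfo : FieldInfo<O - 1, Dummy> {};

    // As if BEGIN_FIELDS_LIST saw counter == 10.
    template <typename Dummy>
    struct FieldInfo<10, Dummy> {
        static constexpr std::size_t N = 0;
    };

    // As if FIELD saw counter == 13 (11 and 12 were consumed elsewhere).
    int a;
    template <typename Dummy>
    struct FieldInfo<13, Dummy> {
        static constexpr std::size_t N = FieldInfo<12, Dummy>::N + 1;
    };

    // As if FIELD saw counter == 17.
    double b;
    template <typename Dummy>
    struct FieldInfo<17, Dummy> {
        static constexpr std::size_t N = FieldInfo<16, Dummy>::N + 1;
    };

    // As if END_FIELDS_LIST saw counter == 20.
    static constexpr std::size_t kNumFields = FieldInfo<20>::N;
};

static_assert(Gappy::kNumFields == 2, "gaps in the counter do not change N");
```

Every skipped value (11, 12, 14, 15, 16, 18, 19, 20) instantiates the general case, which just forwards N from the value below it.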

One tricky problem arises from translation units. C/C++ compilers work on translation units (roughly, one per source file). So if a header file containing our definition is included by multiple source files, we may get different counter values in different translation units. In other words, the __internal struct is specialized differently in different translation units. This doesn’t affect the computed result. However, it does violate C++'s one-definition rule.

Fortunately, we are not doomed. In practice, a compiler emits a symbol for a constexpr entity only if it is used by non-constexpr code (strictly speaking, this is how compilers behave rather than something the C++ standard spells out). Since the __internal structs are only used to compute our final constexpr result __numFields, the compiler emits nothing for them, so no violation of the one-definition rule can be observed. And if we need to add non-constexpr functions to an __internal struct, we can mark them always_inline (a compiler attribute that forces the function to be inlined) to make sure nothing about the __internal structs is emitted.

So to conclude, as long as we make sure that the __internal structs are used for nothing other than computing the final results (which can be achieved by, for example, making all their members private and all their non-constexpr functions always_inline), we should be fine with respect to C++'s one-definition rule.
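One way to apply both precautions is sketched below, under several assumptions of mine: the helper template is hidden by declaring it in a private section of the enclosing struct (rather than by making its own members private), REFL_ALWAYS_INLINE is a hypothetical wrapper around the GCC/Clang attribute and MSVC's __forceinline, and Visit, ForEachField, SumVisitor and the other names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portability wrapper around the compiler-specific attribute.
#if defined(__GNUC__) || defined(__clang__)
#define REFL_ALWAYS_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#define REFL_ALWAYS_INLINE __forceinline
#else
#define REFL_ALWAYS_INLINE inline
#endif

// The helper template lives in a private section; fields stay public.
#define BEGIN_FIELDS_LIST() BEGIN_FIELDS_LIST_IMPL(__COUNTER__)
#define BEGIN_FIELDS_LIST_IMPL(counter)                                        \
  private:                                                                     \
    template <int O, typename Dummy = void>                                    \
    struct FieldInfo : FieldInfo<O - 1, Dummy> {};                             \
    template <typename Dummy>                                                  \
    struct FieldInfo<counter, Dummy> {                                         \
        static constexpr std::size_t N = 0;                                    \
        template <typename Self, typename F>                                   \
        static REFL_ALWAYS_INLINE void Visit(Self&, F&) {}                     \
    }

#define FIELD(type, name) FIELD_IMPL(type, name, __COUNTER__)
#define FIELD_IMPL(type, name, counter)                                        \
  public:                                                                      \
    type name;                                                                 \
  private:                                                                     \
    template <typename Dummy>                                                  \
    struct FieldInfo<counter, Dummy> {                                         \
        static constexpr std::size_t N = FieldInfo<counter - 1, Dummy>::N + 1; \
        /* Non-constexpr helper: visit earlier fields, then this one. */      \
        template <typename Self, typename F>                                   \
        static REFL_ALWAYS_INLINE void Visit(Self& self, F& f) {               \
            FieldInfo<counter - 1, Dummy>::Visit(self, f);                     \
            f(self.name);                                                      \
        }                                                                      \
    }

#define END_FIELDS_LIST() END_FIELDS_LIST_IMPL(__COUNTER__)
#define END_FIELDS_LIST_IMPL(counter)                                          \
  public:                                                                      \
    template <typename Self, typename F>                                       \
    static REFL_ALWAYS_INLINE void ForEachField(Self& self, F& f) {            \
        FieldInfo<counter>::Visit(self, f);                                    \
    }                                                                          \
    static constexpr std::size_t kNumFields = FieldInfo<counter>::N

struct Point {
    BEGIN_FIELDS_LIST();
    FIELD(int, x);
    FIELD(double, y);
    END_FIELDS_LIST();
};

static_assert(Point::kNumFields == 2, "Point declares two fields");

// Visitor that adds up every field it is shown.
struct SumVisitor {
    double total;
    template <typename T>
    void operator()(const T& value) { total += value; }
};

inline double SumFields(Point& p) {
    SumVisitor sum = {0.0};
    Point::ForEachField(p, sum);
    return sum.total;
}

inline void DemoVisit() {
    Point p;
    p.x = 3;
    p.y = 4.5;
    assert(SumFields(p) == 7.5);
}
```

Code outside Point can only reach kNumFields and ForEachField, so nothing outside the struct can name a counter-dependent specialization directly.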
