Yesterday I ran into the following problem. I want certain C++ struct definitions in my code to be reflectively inspectable. For example, if I define a struct S with two int fields a and b, other parts of my program should be able to discover that S contains these two fields, along with their types, and act upon this information.
Trivially, one can solve this problem by maintaining two pieces of code: the original definition and a map like { 'a': 'int', 'b': 'int' }. But then the two pieces must be kept in sync manually, which is exactly what I want to avoid.
This capability is known as reflection. Unfortunately, the C++ standard does not have native support for reflection. There are proposal papers to add it, but none of the major compilers seem to have implemented them yet.
The problem can also be solved via a huge hack called “the C++ Type Loophole”. However, it’s unclear why the hack works, and it’s so hacky that even the C++ standard committee has decided that it should be prohibited. So I’m not brave enough to use it.
I eventually arrived at a less hacky (but of course, less powerful) solution. Since there is no native support for reflection, some instrumentation has to be added to the definitions of the structs. So in my solution, to make a struct reflective, one must use a special macro FIELD(type, name) to define its fields: this lets us attach the instrumentation automatically. An example is shown below.
struct S {
My trick is based on the __COUNTER__ macro, a preprocessor extension supported by GCC, Clang, and MSVC rather than a part of the C or C++ standard. Each time __COUNTER__ is encountered, it is replaced by the current value of an internal counter maintained by the compiler, and the counter is then incremented. So each occurrence of __COUNTER__ is replaced by a unique integer, and these integers increase monotonically through the program text.
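As a quick illustration of this behavior, the following minimal sketch captures three consecutive expansions. The constant names are made up for this example, and it assumes a compiler that supports __COUNTER__.

```cpp
// Each occurrence of __COUNTER__ expands to the next value of the
// compiler's internal counter (GCC/Clang/MSVC extension).
constexpr int kFirst = __COUNTER__;
constexpr int kSecond = __COUNTER__;
constexpr int kThird = __COUNTER__;

// Only the relative order is checked: the absolute starting value depends on
// how many times __COUNTER__ was used earlier in the translation unit.
static_assert(kFirst < kSecond && kSecond < kThird,
              "values increase with each occurrence");
```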
Note that the __COUNTER__ macro is replaced on the spot, so two occurrences never yield the same value. For example, a ## __COUNTER__ = a ## __COUNTER__ + 1 will not increment a variable, since it’s going to be expanded to something like a1 = a2 + 1. So the common usage pattern is to write __COUNTER__ once and pass it to another macro, here called MY_MACRO_IMPL, as an argument. This way, the MY_MACRO_IMPL macro can use its counter parameter as a unique integer as many times as it needs.
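A minimal sketch of this pass-through pattern might look as follows. MY_MACRO_IMPL and its counter parameter come from the text above; the outer MY_MACRO, the CONCAT helpers, and the generated names are assumptions made for illustration.

```cpp
// Two-level concatenation, so that the counter argument is fully expanded
// to a number before it is pasted onto the prefix.
#define CONCAT_IMPL(a, b) a##b
#define CONCAT(a, b) CONCAT_IMPL(a, b)

// The implementation macro receives one counter value and can mention it
// several times; every mention refers to the same integer.
#define MY_MACRO_IMPL(name, value, counter)                    \
    constexpr int CONCAT(hidden_helper_, counter) = (value);   \
    constexpr int name = CONCAT(hidden_helper_, counter) * 2;

// The outer macro contains the single occurrence of __COUNTER__.
#define MY_MACRO(name, value) MY_MACRO_IMPL(name, value, __COUNTER__)

// Each use gets its own uniquely named helper, so the two do not collide.
MY_MACRO(kTwiceSeven, 7)
MY_MACRO(kTwiceTen, 10)

static_assert(kTwiceSeven == 14, "helper and result share one counter value");
static_assert(kTwiceTen == 20, "second use gets a distinct helper name");
```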
The core of the trick is to use this __COUNTER__ macro to specialize templates. For a simple example, let’s assume we want to define structs whose number of fields can be retrieved reflectively. Then BEGIN_FIELDS_LIST can expand to the following code:
template<int o> struct __internal : __internal<o - 1> { };
Each FIELD macro will expand to the normal field definition, as well as the following code:
template<> struct __internal<counter> {
And the END_FIELDS_LIST macro will expand to the following code:
constexpr static size_t __numFields = __internal<counter>::N;
To summarize, the idea is the following.
- BEGIN_FIELDS_LIST defines the general case of a template parameterized by an integer o. The general definition simply inherits whatever information is computed by template o-1. In addition, it defines the recursion boundary condition (in our example, since we want to count the number of fields, N = 0).
- Each FIELD definition specializes __internal<__COUNTER__> and computes its information by merging the results in counter-1 with its own (in our example, N = __internal<counter-1>::N + 1).
- END_FIELDS_LIST retrieves the aggregated results from __internal<__COUNTER__>.
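The three steps above can be put together as the following sketch. It is not the exact code from this post: the helper is named Internal instead of __internal (identifiers with double underscores are reserved), the *_IMPL macros are hypothetical names, and an extra Dummy template parameter is an assumption added so that every specialization is a partial one, which compilers generally accept at class scope.

```cpp
#include <cstddef>

// BEGIN_FIELDS_LIST: the general case inherits from the previous counter
// value, and the specialization for the current counter is the boundary.
#define BEGIN_FIELDS_LIST BEGIN_FIELDS_LIST_IMPL(__COUNTER__)
#define BEGIN_FIELDS_LIST_IMPL(counter)                            \
    template <int o, typename Dummy = void>                        \
    struct Internal : Internal<o - 1, Dummy> {};                   \
    template <typename Dummy>                                      \
    struct Internal<counter, Dummy> {                              \
        static constexpr std::size_t N = 0; /* no fields yet */    \
    };

// FIELD: declare the field, then record "one more field than before".
#define FIELD(type, name) FIELD_IMPL(type, name, __COUNTER__)
#define FIELD_IMPL(type, name, counter)                            \
    type name;                                                     \
    template <typename Dummy>                                      \
    struct Internal<counter, Dummy> {                              \
        static constexpr std::size_t N =                           \
            Internal<counter - 1, Dummy>::N + 1;                   \
    };

// END_FIELDS_LIST: read the aggregated result at the latest counter value.
#define END_FIELDS_LIST END_FIELDS_LIST_IMPL(__COUNTER__)
#define END_FIELDS_LIST_IMPL(counter)                              \
    static constexpr std::size_t numFields = Internal<counter>::N;

struct S {
    BEGIN_FIELDS_LIST
    FIELD(int, a)
    FIELD(int, b)
    END_FIELDS_LIST
};

static_assert(S::numFields == 2, "S declares two reflected fields");
```

The fields remain ordinary data members, so code such as S s; s.a = 1; keeps working unchanged.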
As one can see, the correctness of the above approach relies only on each counter being replaced by a monotonically increasing integer. Neither the starting value nor any integers skipped in the sequence affect correctness. This matches exactly the semantics of the __COUNTER__ macro. So are we good?
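To see why gaps and the starting value do not matter, here is a hand-expanded sketch of what the macros might produce. The values 10, 13, 20, and 25 are arbitrary stand-ins for whatever __COUNTER__ happened to yield, and the Internal name and Dummy parameter are assumptions of this example rather than the post's exact code.

```cpp
#include <cstddef>

struct Demo {
    // General case: a counter value without its own specialization simply
    // inherits from the previous value, which bridges any gaps.
    template <int o, typename Dummy = void>
    struct Internal : Internal<o - 1, Dummy> {};

    // Boundary condition, as if BEGIN_FIELDS_LIST had received counter 10.
    template <typename Dummy>
    struct Internal<10, Dummy> {
        static constexpr std::size_t N = 0;
    };

    int a;
    // First field at counter 13; values 11 and 12 were used elsewhere.
    template <typename Dummy>
    struct Internal<13, Dummy> {
        static constexpr std::size_t N = Internal<12, Dummy>::N + 1;
    };

    int b;
    // Second field at counter 20, after a larger gap.
    template <typename Dummy>
    struct Internal<20, Dummy> {
        static constexpr std::size_t N = Internal<19, Dummy>::N + 1;
    };

    // As if END_FIELDS_LIST had received counter 25.
    static constexpr std::size_t numFields = Internal<25>::N;
};

static_assert(Demo::numFields == 2, "gaps in the sequence do not change the count");
```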
One tricky problem arises from translation units. A C/C++ compiler processes one translation unit (a source file together with the headers it includes) at a time. So if a header file containing our definition is included by multiple source files, we may get different counter values in different translation units. In other words, the __internal struct is specialized differently in different translation units. This doesn’t affect the computed results. However, the important thing is that it violates C++'s one-definition rule.
Fortunately, we are not doomed. In practice, a compiler only emits a symbol for a constexpr entity when non-constexpr code needs it at run time. Since the __internal structs are only used to compute our final constexpr result __numFields, the compiler has nothing to emit for them, so no violation of the one-definition rule can be observed. And if we need to add non-constexpr functions to the __internal struct, we can mark them as always_inline (a compiler attribute that tells the compiler the function must be inlined at every call site) to make sure nothing about the __internal structs is emitted.
So to conclude, as long as we make sure that the __internal structs are not used anywhere other than in computing the final results (which can be achieved by, for example, making all its members private and all its non-constexpr functions always_inline), we should be fine with C++'s one-definition rule requirement.
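One possible way to arrange this is sketched below, again hand-expanded with 0, 1, and 2 standing in for __COUNTER__ values. Instead of making the helpers' members private, this variant makes the helper template itself a private member of the struct, which has a similar effect. The ALWAYS_INLINE macro, the sizeSum function, and the Point struct are all hypothetical, and the attribute syntax assumes GCC or Clang.

```cpp
#include <cstddef>

// always_inline is a GCC/Clang attribute; elsewhere this sketch falls back
// to plain inline, which does not give the same guarantee.
#if defined(__GNUC__) || defined(__clang__)
#define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE inline
#endif

struct Point {
    int x;
    int y;

private:
    // The helper template is private, so code outside Point cannot name it.
    template <int o, typename Dummy = void>
    struct Internal : Internal<o - 1, Dummy> {};

    template <typename Dummy>
    struct Internal<0, Dummy> {
        static constexpr std::size_t N = 0;
        static ALWAYS_INLINE std::size_t sizeSum() { return 0; }
    };
    template <typename Dummy>
    struct Internal<1, Dummy> {
        static constexpr std::size_t N = Internal<0, Dummy>::N + 1;
        static ALWAYS_INLINE std::size_t sizeSum() {
            return Internal<0, Dummy>::sizeSum() + sizeof(int);
        }
    };
    template <typename Dummy>
    struct Internal<2, Dummy> {
        static constexpr std::size_t N = Internal<1, Dummy>::N + 1;
        static ALWAYS_INLINE std::size_t sizeSum() {
            return Internal<1, Dummy>::sizeSum() + sizeof(int);
        }
    };

public:
    // Only the aggregated results are exposed to the rest of the program.
    static constexpr std::size_t numFields = Internal<2>::N;
    static std::size_t totalFieldBytes() { return Internal<2>::sizeSum(); }
};
```

With this layout, the only names visible outside Point are numFields and totalFieldBytes, so nothing outside the struct can depend on the particular counter values.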