Think Like A Warlo 1/3
why use a better language than c when you can break c 5head
Table of Contents
all errors in this blog post are my own. this is my first post to The Writing Gaggle (Which Still Doesn't Have A Button, Yet). i was going to add a bunch of explanation as to how safe concurrency with decent syntax can be achieved with , but unfortunately as soon as i began doing that it became an unsurmountable task of adding new bells and blood sacrifices and metametaprogramming. so, instead, here is the first ~third of it.
the c++, lisp, and rust programmers are really having you believe that c is
some sort of… Archaic monstrocity from '72 that hasn't got an update in 51
years, and can't handle fun concepts like Generic Programming, or Concurrency,
or Functional Programming. i object to all of that and present to you the
definitive guide on how to go from normal c to that will get you fired and
aused of insanity promoted and raise your salary because they can't get
anyone to maintain your ode.
1 The Correct Solution To Switch Statements
[prerequisites: understanding c78's switch
and #define
]
let's begin with a simple header.
I feel a combination of pride and revulsion at this discovery.
[…]
Many people (even [Brian Kernighan]?) have said that the worst feature of C is that switches don't break automatically before each case label. This code forms some sort of argument in that debate, but I'm not sure whether it's for or against.
- Tom Duff, on Duff's Device.
the most controversial choice in c, hands down, is that cases in a switch
statement fall-through automatically, and you have to insert a break in order
to get them to NOT fall-through. so, lets fix that!
the basic idea is that behind every case ...: ...
, we can just insert a
break
statement.
#define case break; case
this macro in fact does not expand to an infinite loop, because macros in C only expand once.
however there are two critical issues with this macro. first: we forgot about
default
.
#define case break; case #define default break; default
second, this macro causes Unexpected Behaviour. case X: Y
is, according to
the c78+ standards, one statement. our replacement> two statements.
switch (outer) { default: if (cond) case foo: { /* ...executes if outer != foo || cond is FALSE. */ } } switch (outer) { break; default: if (cond) break; case foo: { /* ...executes if outer != foo || cond is FALSE. */ } }
so, we need to use a hack to get it to actually be one statement:
#define case if (1) break; else case #define default if (1) break; else default
looks stupid, right? that's the type of macro you will be writing over and over again as you read this post.
on to our next header…
2 Iteration, Anyone?
[prerequisites: understanding c99's for
loops with initial variable decls]
it is time to do what chess players call, a ‼.
typedef struct int_list { int car; struct int_list *cdr; } int_list; int_list l = ...; for (int_list *lp = &l; lp; lp = lp->cdr) /* truly, a ⁇ moment */ printf("%d\n", lp->car);
one thing that is a tad annoying about this is it's cluttered! c is being too bare for our purposes. the correct solution is to use Literally Any Other Programming Language, however there is a correcter solution:
/* if only... */ FOREACH (int i IN l) printf("%d\n", i);
issue #1: we need to declare a new int i
in scope.
it's clear that we're going to have to define […a macro that…] must expand to a statement prefix: that is, something which you can put on the front of a valid C statement to yield another valid C statement.
- Simon Tatham, Metaprogramming custom control structures in C.
lets look at our options…
if (cond) { ... } while (cond) { ... } for (init; cond; inc) { ... } switch (control) { ... }
i see exactly one candidate: for
. specifically, init
can be a variable
declaration since c99, so all we need to do is make sure that cond
evaluates
to 1
the first time, then 0
onwards. the easiest way to do that is just
for (int temp = 1; temp; temp = 0)
hey remember what i said about the whole looking stupid thing?
#define WITH(declaration) \ for (int temp = 1; temp; temp = 0) \ for (declaration; temp; temp = 0) #define FOREACH_(declaration, list) \ for (int_list *lp = &(list); lp; lp = lp->cdr) \ WITH(declaration = lp->car) \
now, we can write:
FOREACH_ (int i, l) printf("%d\n", i);
…nah. not doing it for me. it looks too foreign. compare it to C++:
for (int i : l) printf("%d\n", i);
or python:
for i in l: print(i)
however, there is a solution: make the in
keyword expand to a comma!
#define IN , FOREACH_ (int i IN l) ... /* error: macro "FOREACH_" requires 2 arguments, but only 1 given */
"…huh?"
this is the part where we learn about Macro Expansion Order. macros are
expanded Lazily, so FOREACH_
"sees" (int i IN l)
, not (int i , l)
.
this is why i named the macro FOREACH_
, and not FOREACH
:
#define FOREACH(tokenstream) FOREACH_(tokenstream) FOREACH (int i IN l) ...
for the record, i stole this technique from cello.h.
new problem: break
now means continue
.
FOREACH (int i IN l) { printf("%d\n", i); /* 1 2 3 */ } FOREACH (int i IN l) { if (i == 2) continue; printf("%d\n", i); /* 1 3 */ } FOREACH (int i IN l) { if (i == 2) break; printf("%d\n", i); /* 1 3 */ } /* expanded: */ for (int_list *lp = &l; lp; lp = lp->cdr) for (int temp = 1; temp; temp = 0) /* ^^^^^^^^ jumps here when =break=ing */ for (int i = lp->car; temp; temp = 0) { /* ^^^^^^^^ jumps here when =continue=ing */ printf("%d\n", i); }
both break
and continue
do the exact same thing, however i see an easy
fix:
for (int_list *lp = &l; lp; lp = lp->cdr) for (int temp = 1; temp; temp = 0, lp = (void *) 0) /* force loop to end ^^^^^^^^^^^^^^^^^^^^^^^^^ */ for (int i = lp->car; temp; temp = 0) { printf("%d\n", i); }
so lets reimplement the macro, from the ground up:
#define IN , #define FOREACH_(declaration, list) \ for (int_list *lp = &(list); lp; lp = lp->cdr) \ for (int temp = 1; temp; temp = 0, lp = (void *) 0) \ for (declaration = lp->car; temp; temp = 0) #define FOREACH(tokenstream) FOREACH_(tokenstream) FOREACH (int i IN l) /* Gotcha! */ printf("%d\n", i);
$ cc -o main main.c $ ./main Segmentation Fault (core dumped) $
"…what? we shouldn't ever even have the chance to dereference NULL!"
turns out, we accidentally did. think about operations are executed if we
DON'T break
:
int_list *lp = &l; lp; int temp = 1; temp; int i = lp->car; temp; printf("%d\n", i); temp = 0; temp = 0, lp = (void *) 0; lp = lp->cdr; /* SEGMENTATION FAULT */
2 problems. problem #1: we accidentally set lp
to a null pointer even though
we never broke, because we forgot to check if temp
was already 0
.
here i am abusing the fact that &&
in C short-circuits, meaning that in
lhs && rhs
, if lhs
evaluates to 0
, then rhs
isn't even evaluated. so
you can use lhs && rhs
as a if (lhs) rhs
if you discard the result.
#define FOREACH_(declaration, list) \ for (int_list *lp = &(list); lp; lp = lp->cdr) \ for (int temp = 1; temp; temp && (lp = (void *) 0), temp = 0) \ for (declaration = lp->car; temp; temp = 0)
problem #2: we forgot to check in lp = lp->cdr
whether lp
was already set
to a null pointer.
#define FOREACH_(declaration, list) \ for (int_list *lp = &(list); lp; lp && (lp = lp->cdr)) \ for (int temp = 1; temp; temp && (lp = (void *) 0), temp = 0) \ for (declaration = lp->car; temp; temp = 0)
$ cc -o main main.c $ ./main 1 2 3 $
FOREACH (int i IN l) { if (i == 2) continue; printf("%d\n", i); /* 1, 3 */ } FOREACH (int i IN l) { if (i == 2) break; printf("%d\n", i); /* 1 */ }
we have now successfully implemented FOREACH
, however i am not content
enough with it– i want to be able to write both regular for loops AND
FOREACH
loops with the same syntax:
for (int i = 0; i < foo; i++) { ... } for (int i in bar) { ... }
2.1 Counting Arguments
[prerequisites: understanding c99's variadic macros]
for our case it's simple. int i in bar
expands to two arguments, but
int i = 0; i < foo; i++
expands to one. we just need to tell those two
cases apart, and i have just the trick:
#define DISPATCH(unary, binary, ...) \ THIRD(__VA_ARGS__, binary, unary, ~)(__VA_ARGS__) #define THIRD(a, b, c, ...) c
THIRD(foo, binary, unary, ~)
is always unary
, but
THIRD(foo, bar, binary, unary, ~)
is always binary
! neat.
now we can fully implement our custom syntax sugar:
#define for(...) DISPATCH(for, FOREACH, __VA_ARGS__) #define in IN for (int i in l) /* Gotcha! */ printf("%d\n", i);
$ cc -o main main.c main.c: In function ‘main’: main.c:32:20: error: macro "FOREACH" passed 2 arguments, but takes just 1 $
whoops. turns out, that in
already expanded to a comma by the time it
reached FOREACH
. we should just use FOREACH_
, instead.
#define for(...) DISPATCH(for, FOREACH_, __VA_ARGS__) #define in IN for (int i in l) printf("%d\n", i);
$ cc -o main main.c $ ./main 1 2 3 $
2.2 Racing The Macro Expander
for (int i = 0, j = 0; i + j < 10; i++) { ... }
$ cc -o main main.c main.c: In function ‘main’: main.c:21:20: error: ‘j’ undeclared (first use in this function) for (int i = 0, j = 0; i + j < 10; i++) { main.c:21:25: error: expected ‘)’ before ‘;’ token for (int i = 0, j = 0; i + j < 10; i++) { main.c:10:21: error: lvalue required as left operand of assignment for (declaration = lp->car; temp; temp = 0) $
Wow, That Is A Horrible Error Message.
what actually happening is that the preprocessor thinks we called for
with
two arguments: int i = 0
, and j = 0; i + j < 10; i++
. therefore it tries
to treat it as if it was a FOREACH
statement. which causes Mayhem. this is
because the preprocessor doesn't completely understand C's syntax, however,
it does understand parenthesis.
the simplest workaround is rediculously stupid:
step 1. change the definition of in
to rely on it being expanded within
a pair of parenthesis
#define in ), (
step 2. delay the evaluation of for
's arguments + surround them in parens
to prevent in
from Exploding
#define for(...) for_((__VA_ARGS__)) #define for_(...) DISPATCH(for_1, for_2, __VA_ARGS__)
step 3. get rid of the parens
#define for_1(parenthesized) for parenthesized #define for_2(left, right) FOREACH_(ID left, ID right) #define ID(...) __VA_ARGS__
$ cc -o main main.c $
it now does exactly what you think it does.
2.3 Variable Argument Macros
while variadic macros seem useful, in practice you run into a very annoying
limitation: it's hard to mess with the number of arguments. i already
mentioned a significantly less complicated example of how to do this with
DISPATCH
, however, it can be generalized. introducing: PP_NARG
:
#define PP_ARG_N(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, \ _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, \ _21, _22, _23, _24, _25, _26, _27, _28, _29, _30, \ _31, _32, _33, _34, _35, _36, _37, _38, _39, _40, \ _41, _42, _43, _44, _45, _46, _47, _48, _49, _50, \ _51, _52, _53, _54, _55, _56, _57, _58, _59, _60, \ _61, _62, _63, N, ...) N #define PP_RSEQ_N() \ 63, 62, 61, 60, \ 59, 58, 57, 56, 55, 54, 53, 52, 51, 50, \ 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, \ 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, \ 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, \ 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, \ 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 #define PP_NARG_(...) PP_ARG_N(__VA_ARGS__) #define PP_NARG(...) PP_NARG_(__VA_ARGS__, PP_RSEQ_N())
same idea, extended to 64 arguments, and instead of unary
and binary
you
get back 1
and 2
. we can now use this to make constructing int_list
s
significantly less painful.
here's the code i've been hiding from you:
int_list l = (int_list) { .car = 1, .cdr = &((int_list) { .car = 2, .cdr = &((int_list) { .car = 3, .cdr = (void *) 0 }) }) };
now it would be neat if we could just write, like…
int_list l = new_int_list(1, 2, 3);
and, as it turns out, this is very possible.
we will need three new macros though:
#define PP_PASTE3(x, y, z) x ## y ## z #define PP_CAT3(x, y, z) PP_PASTE3(x, y, z)
again, the "looks stupid" thing.
#define MACRO EXPANDED PP_PASTE3(foo, _, MACRO) /* foo_MACRO */ PP_CAT3(foo, _, MACRO) /* foo_EXPANDED */
now for the most useful macro:
#define PP_OVERLOAD(f, ...) PP_CAT3(f, _, PP_NARG(__VA_ARGS__))(__VA_ARGS__) #define foo(...) PP_OVERLOAD(foo, __VA_ARGS__) #define foo_1(a) ... #define foo_2(a, b) ...
now, the code we're about to write is going to be Extremely Painful, so in order to minimize that pain, let's write a small C program to automatically generate most of it.
the code is going to look like this:
#define new_int_list_1(x) \ ((int_list) { .car = (x), .cdr = (void *) 0 }) #define new_int_list_2(x, ...) \ ((int_list) { \ .car = (x), \ .cdr = &new_int_list_1(__VA_ARGS__) \ }) #define new_int_list_3(x, ...) \ ((int_list) { \ .car = (x), \ .cdr = &new_int_list_2(__VA_ARGS__) \ })
i specifically designed this to be easy to generate automatically.
#include <stdio.h> int main(void) { for (int i = 2; i < 64; i++) printf("#define new_int_list_%d(x, ...) " "((int_list) { .car = (x)," " .cdr = &new_int_list_%d(__VA_ARGS__) })\n", i, i - 1 ); }
aaand let's just yoink that output… next we can define a couple lines before it all…
#define new_int_list(...) PP_OVERLOAD(new_int_list, __VA_ARGS__) #define new_int_list_1(x) \ ((int_list) { .car = (x), .cdr = (void *) 0 })
aaand we are done. we can now do this:
for (int i in new_int_list(5, 1, 9, 420, 69)) printf("%d\n", i);
which is Pretty Cool, but Not Cool Enough.
3 Generics?
[prerequisites: understanding c11's =Generic=]
if you played around with the examples above (which i encourage you to do!),
you might have noticed yet another annoying limitation: int_list
s ONLY
store int
s. whenever we want to add another type of list, say a
float_list
, we need to manually repeat all of the steps, and then
FOREACH_
breaks, silently, and ends up reading garbage data! the garbage
data is actually just the integer representation of a float, but it's still
extremely annoying. there's a simple way to fix that with GNU C extensions:
#define FOREACH_(declaration, ls) \ for (typeof(ls) *lp = &(ls); lp; lp && (lp = lp->cdr)) \ for (int temp = 1; temp; temp && (lp = (void *) 0), temp = 0) \ for (declaration = lp->car; temp; temp = 0)
however, this doesn't fix the problem of new_int_list
s being everywhere.
3.1 The X In X-Macro Stands For Moonshine
Any problem in computer science can be solved with another layer of indirection, except of course for the problem of too many indirections.
- Bjarne Stroustrup, 1st law of computing.
x-macros are a very powerful tool in C. the basic idea is simple, let's demonstrate it with a tagged union.
typedef struct astnode { enum { INTEGER, ADDITION, /* say we just added a revloutionary new operation: */ SUBTRACTION, } type; union { int INTEGER; struct { struct astnode *left; struct astnode *right} ADDITION; /* but forgot to update it here... */ } data; } astnode;
…then we will get a very annoying problem. the invalid code is possible to write at all. this is what X-macros are designed for.
#define AST_TYPES \ X(int, INTEGER) \ X(struct { struct astnode *left; struct astnode *right }, ADDITION) \ X(struct { struct astnode *left; struct astnode *right }, SUBTRACTION) typedef struct astnode { enum { #define X(type, name) name, AST_TYPES #undef X } type; union { #define X(type, name) type name; AST_TYPES #undef X } data; } astnode;
they eliminate repetition by reprocessing the same data in a different way. this is Macros-As-A-Data-Structure. however, now there is a critical improvement we can make that will allow us to do things never meant to be possible in C: seperate the macro into a different file.
/* main.c */ typedef struct astnode { enum { #define X(type, name) name, #include "AST_TYPES.def" } type; union { #define X(type, name) type name; #include "AST_TYPES.def" } data; } astnode; /* AST_TYPES.def */ X(int, INTEGER) X(struct { struct astnode *left; struct astnode *right }, ADDITION) X(struct { struct astnode *left; struct astnode *right }, SUBTRACTION) #undef X
we use the file extension .def
instead of .h
to mark this as an x-macro.
you only ever include .h
files once, including them again does nothing.
they are idempotent. .def
files are not idempotent.
why is this so powerful? because ordinary macros cannot #include
other
files. they can't do any #ifdef
s either. you have to use rediculously
flaky crocks for either of those, and you're lucky if they work in the first
place.
here's a short and simple example:
/* iota.h */ #ifndef IOTA #define IOTA 0 #define IOTA_RESET "iota-0.def" #define IOTA_NEXT "iota-1.def" #endif /* iota-0.def */ #undef IOTA #define IOTA 0 #undef IOTA_NEXT #define IOTA_NEXT "iota-1.def" /* iota-1.def */ #undef IOTA #define IOTA 1 #undef IOTA_NEXT #define IOTA_NEXT "iota-2.def" /* iota-2.def */ #undef IOTA #define IOTA 2 #undef IOTA_NEXT #define IOTA_NEXT "iota-3.def" /* ... */
well. i don't feel like writing the same code over and over again for different files, and i can't use a macro to do it for me, so…
#include <stdio.h> int main(void) { char buf[64]; for (int i = 0; i < 256; i++) { sprintf(buf, "iota-%d.def", i); FILE *f = fopen(buf, "w"); fprintf(f, "#undef IOTA\n" "#define IOTA %d\n" "#undef IOTA_NEXT\n" "#define IOTA_NEXT \"iota-%d.def\"\n", i, i + 1); fclose(f); } }
$ cc -w -o main main.c $ ./main $
wow. that is… a lot of files.
now, we can do this:
$ cat main.c #include "iota.h" int main(void) { printf("a: %d\n", IOTA); #include IOTA_NEXT printf("a: %d\n", IOTA); #include IOTA_NEXT printf("a: %d\n", IOTA); #include IOTA_NEXT printf("a: %d\n", IOTA); #include IOTA_NEXT printf("a: %d\n", IOTA); #include IOTA_RESET printf("b: %d\n", IOTA); #include IOTA_NEXT printf("b: %d\n", IOTA); #include IOTA_NEXT printf("b: %d\n", IOTA); #include IOTA_NEXT printf("b: %d\n", IOTA); #include IOTA_NEXT } $ cc -w -o main main.c $ ./main a: 0 a: 1 a: 2 a: 3 a: 4 b: 0 b: 1 b: 2 b: 3 $
yes. you can do that with #include
.
so, let's apply the idea of x-macros to our linked list. we will call the
type T
instead of X
beacuse obviously/.
/* list.def */ #ifndef T #error "T not defined when declaring list(T)" #endif #ifdef PP_PASTE3 #undef PP_PASTE3 #endif #define PP_PASTE3(x, y, z) x ## y ## z #ifdef PP_CAT3 #undef PP_CAT3 #endif #define PP_CAT3(x, y, z) PP_PASTE3(x, y, z) #ifdef list #undef list #endif #define list(T) struct PP_CAT3(_list_, T, _) struct PP_CAT3(_list_, T, _) { T car; struct PP_CAT3(_list_, T, _) *cdr; }; #undef T
however we are definitely not done yet. remember new_list
? we need to
somehow get that working in our new system. we want to be able to write
something along the lines of:
#ifndef new_list #define new_list (void) 0 #endif #define FIRST(car, ...) car #define new_list(...) \ _Generic((FIRST(__VA_ARGS__,~)), \ T: ..., \ default: LAST_new_list(__VA_ARGS__))
if you're wondering why i defined a second macro FIRST
specifically to
get the first argument, and didn't just write #define new_list(car, ...)
,
that's because if __VA_ARGS__
was empty on the call to LAST_new_list
,
it would produce a trailing comma, and incorrectly detect 2 arguments. the
, ~
part is also there to cope with this particular problem, because the
invocation FIRST(foo)
is nonstandard.
however all of that won't work for two reasons.
problem #1:
#define T int #define foo T #undef T #define T float foo /* expands to =float=. not =int=
remember. macros expand lazily. foo
expands to T
, and then T
expands to float
.
the solution is simple but took me a while to realize:
#ifndef new_list #define new_list (void) 0 #endif typedef T new_list_T; #define FIRST(car, ...) car #define new_list(...) \ _Generic((FIRST(__VA_ARGS__,~)), \ new_list_T: ..., \ default: LAST_new_list(__VA_ARGS__))
always remember that you can usually typedef y x;
, enum { x = y };
,
const T x = y;
, or static inline x(...) { y }
instead of
#define x y
or #define x(...) y
.
however this reinforces the second problem. we can't use this twice. both
because we will redefine new_list_T
and because LAST_new_list
can't
work. by the time we redefine new_list
the last definition is Gone.
…there is a workaround.
3.1.1 Slots, Galore!
Any coding standard which insists on syntactic clarity at the expense of algorithmic clarity should be rewritten. If your employer fires you for using this trick, tell them that repeatedly as the security staff drag you out of the building.
- Simon Tatham, on Coroutines in C.
it's time for me to introduce my notion of slots. it is a hack i created specifically to cope with this situation. instead of a single macro that refers to it's previous iterations, we have several macros in a chain that each refer to the previous macro in that chain.
#define new_list_0(...) (void) 0 #define new_list_1(...) new_list_0(__VA_ARGS__) #define new_list_2(...) new_list_1(__VA_ARGS__) #define new_list_3(...) new_list_2(__VA_ARGS__) #define new_list_4(...) new_list_3(__VA_ARGS__) #define new_list(...) new_list_4(__VA_ARGS__) #define new_list_N 1
new_list_N
's job is to tell us which macro to overwrite next. we want to
overwrite new_list_1
, then new_list_2
, then new_list_3
, then
new_list_4
, then report an error message.
now, we can't write #define PP_PASTE2(new_list_, new_list_N) ...
no matter
how hard we try, because the preprocessor doesn't recognize directives in
that order.
instead, we must do something significantly more mind numbing.
#if new_list_N == 1 typedef T new_list_1_T; typedef list(T) new_list_1_listT; #undef new_list_1 #define new_list_1(...) \ _Generic((FIRST(__VA_ARGS__,~)), \ new_list_1_T: \ new_list_T(new_list_1_listT, __VA_ARGS__), \ default: \ new_list_0(__VA_ARGS__)) #undef new_list_N #define new_list_N 2 #elif new_list_N == 2 typedef T new_list_2_T; typedef list(T) new_list_2_listT; #undef new_list_2 #define new_list_2(...) \ _Generic((FIRST(__VA_ARGS__,~)), \ new_list_2_T: \ new_list_T(new_list_2_listT, __VA_ARGS__), \ default: \ new_list_1(__VA_ARGS__)) #undef new_list_N #define new_list_N 3 #elif new_list_N == 3 typedef T new_list_3_T; typedef list(T) new_list_3_listT; #undef new_list_3 #define new_list_3(...) \ _Generic((FIRST(__VA_ARGS__,~)), \ new_list_3_T: \ new_list_T(new_list_3_listT, __VA_ARGS__), \ default: \ new_list_2(__VA_ARGS__)) #undef new_list_N #define new_list_N 4 #elif new_list_N == 4 typedef T new_list_4_T; typedef list(T) new_list_4_listT; #undef new_list_4 #define new_list_4(...) \ _Generic((FIRST(__VA_ARGS__,~)), \ new_list_4_T: \ new_list_T(new_list_4_listT, __VA_ARGS__), \ default: \ new_list_3(__VA_ARGS__)) #undef new_list_N #define new_list_N 5 #else #error "ran out of slots for =new_list_N=" #endif
horrific. we're writing the preprocessor equivalent of is_even
.
hey, remember that program we used to generate the cases for new_list
?
we're gonna have to alter it a tad bit.
#include <stdio.h> int main(void) { for (int i = 2; i < 64; i++) printf("#define new_int_list_T_%d(T, x, ...) " "((T) { .car = (x)," " .cdr = &new_int_list_T_%d(T, __VA_ARGS__) })\n", i, i - 1 ); }
then we can just add…
#define new_list_T(T, ...) PP_OVERLOAD(new_list_T, T, __VA_ARGS__) #define new_list_T_1(...) *(void *) 0
and we are Done!
#define T int #include "list.def" list(int) l = new_list(1, 2, 3); for (int i in l) /* Gotcha! */ printf("%d\n", i);
$ cc -w -o main main.c main.c:8:3: error: use of undeclared identifier 'int_list' for (int i in l) $
whoops. our for
macro still expects the only type it accepts to be
int_list
.
we unfortunately need to use More Slots, and we can't do this naively:
_Generic((1), int: if (1) printf("no statements allowed!\n"))
$ cc -w -o main main.c main.c:7:22: error: expected expression _Generic((1), int: if (1) printf("no statements allowed!\n"))
so. we need a workaround. again. i choose… void *
.
for (void *lp = &(list); lp; lp && (lp = lp->cdr)) \
now the code doesn't work, because lp->cdr
doesn't make any sense.
however:
for (void *lp = &(list); lp; lp && (lp = CAST_AS_ALIKE_LIST(lp, (list))->cdr))
now, CAST_AS_ALIKE_LIST(...)
is an expression. and can use _Generic
.
we need More Slots
#ifndef CAST_AS_ALIKE_LIST #define CAST_AS_ALIKE_LIST_0(from, to) from #define CAST_AS_ALIKE_LIST_1(...) CAST_AS_ALIKE_LIST_0(__VA_ARGS__) #define CAST_AS_ALIKE_LIST_2(...) CAST_AS_ALIKE_LIST_1(__VA_ARGS__) #define CAST_AS_ALIKE_LIST_3(...) CAST_AS_ALIKE_LIST_2(__VA_ARGS__) #define CAST_AS_ALIKE_LIST_4(...) CAST_AS_ALIKE_LIST_3(__VA_ARGS__) #define CAST_AS_ALIKE_LIST(...) CAST_AS_ALIKE_LIST_4(__VA_ARGS__) #define CAST_AS_ALIKE_LIST_N 1 #endif #if CAST_AS_ALIKE_LIST_N == 1 typedef list(T) CAST_AS_ALIKE_LIST_1_T; #undef CAST_AS_ALIKE_LIST_1 #define CAST_AS_ALIKE_LIST_1(from, to) \ _Generic((to), \ CAST_AS_ALIKE_LIST_1_T: \ ((CAST_AS_ALIKE_LIST_1_T *) from), \ default: CAST_AS_ALIKE_LIST_0(from, to)) #undef CAST_AS_ALIKE_LIST_N #define CAST_AS_ALIKE_LIST_N 2 #elif CAST_AS_ALIKE_LIST_N == 2 typedef list(T) CAST_AS_ALIKE_LIST_2_T; #undef CAST_AS_ALIKE_LIST_2 #define CAST_AS_ALIKE_LIST_2(from, to) \ _Generic((to), \ CAST_AS_ALIKE_LIST_2_T: \ ((CAST_AS_ALIKE_LIST_2_T *) from), \ default: CAST_AS_ALIKE_LIST_1(from, to)) #undef CAST_AS_ALIKE_LIST_N #define CAST_AS_ALIKE_LIST_N 3 #elif CAST_AS_ALIKE_LIST_N == 3 typedef list(T) CAST_AS_ALIKE_LIST_3_T; #undef CAST_AS_ALIKE_LIST_3 #define CAST_AS_ALIKE_LIST_3(from, to) \ _Generic((to), \ CAST_AS_ALIKE_LIST_3_T: \ ((CAST_AS_ALIKE_LIST_3_T *) from), \ default: CAST_AS_ALIKE_LIST_2(from, to)) #undef CAST_AS_ALIKE_LIST_N #define CAST_AS_ALIKE_LIST_N 4 #elif CAST_AS_ALIKE_LIST_N == 4 typedef list(T) CAST_AS_ALIKE_LIST_4_T; #undef CAST_AS_ALIKE_LIST_4 #define CAST_AS_ALIKE_LIST_4(from, to) \ _Generic((to), \ CAST_AS_ALIKE_LIST_4_T: \ ((CAST_AS_ALIKE_LIST_4_T *) from), \ default: CAST_AS_ALIKE_LIST_3(from, to)) #undef CAST_AS_ALIKE_LIST_N #define CAST_AS_ALIKE_LIST_N 5 #else #error "ran out of slots for =CAST_AS_ALIKE_LIST_N=" #endif
okay. we're done. i think. i hope.
now we just need to rewrite the FOREACH_
macro a bit…
#define FOREACH_(declaration, list) \ for (void *lp = &(list); \ lp; \ lp && (lp = CAST_AS_ALIKE_LIST(lp, (list))->cdr)) \ for (int temp = 1; \ temp; \ temp && (lp = (void *) 0), \ temp = 0) \ for (declaration = CAST_AS_ALIKE_LIST(lp, (list))->car; \ temp; \ temp = 0)
now, we can do this:
#define T int #include "list.def" for (int i in new_list(1, 2, 3)) printf("%d\n", i);
let's make a small header file to just include some common lists…
/* list.h */ #define T char #include "list.def" #define T short #include "list.def" #define T long #include "list.def" #define T char * #include "list.def" /* Gotcha! */
$ cc -w -o main main.c ./list.def:47:8: error: pasting formed '*_', an invalid preprocessing token struct PP_CAT3(_list_, T, _) {
whoops. we forgot about types that aren't just a single identifier. … too bad!
/* list.h */ #define T char #include "list.def" #define T short #include "list.def" #define T long #include "list.def" typedef char *str; #define T str #include "list.def" /* main.c */ for (str s in new_list("Hell", "o, Wo", "rld!\n")) printf("%s", s);
$ cc -w -o main main.c $ ./main Hello, World! $
excellent…whoops forgot about int
/* list.h */ #define T char #include "list.def" #define T short #include "list.def" #define T int #include "list.def" #define T long #include "list.def" typedef char *str; #define T str #include "list.def" /* Gotcha! */
$ cc -w -o main main.c In file included from ./main.c:13: ./list.def:176:8: error: "ran out of slots for =new_list_N=" ./list.def:229:8: error: "ran out of slots for =CAST_AS_ALIKE_LIST_N=" $
okay. we're done. i think. i hope.
- Otesunki, on Think Like A Warlo.
great, now we need to think about automatically generating Slots Hell.
here's what i came up with:
#include <stdio.h> #define SLOTS 64 int main(void) { printf("#ifndef new_list\n"); printf("#define new_list_N 1\n"); printf("#define new_list(...) new_list_%d(__VA_ARGS__)\n", SLOTS); for (int i = SLOTS; i > 0; i--) printf("#define new_list_%d(...) new_list_%d(__VA_ARGS__)\n", i, i - 1); printf("#define new_list_0(...) (void) 0\n"); printf("#define new_list_T(T, ...) " "PP_OVERLOAD(new_list_T, T, __VA_ARGS__)\n"); printf("#define new_list_T_1(...) *(void *) 0\n"); for (int i = 2; i < SLOTS; i++) printf("#define new_list_T_%d(T, x, ...) " "((T) { .car = (x)," " .cdr = &new_list_T_%d(T, __VA_ARGS__) })\n", i, i - 1 ); printf("#endif\n"); printf("\n"); printf("#if 0\n"); for (int i = 1; i <= SLOTS; i++) printf("#elif new_list_N == %d\n", i), printf("#undef new_list_N\n"), printf("#define new_list_N %d\n", i + 1), printf("typedef T new_list_%d_T;\n", i), printf("typedef list(T) new_list_%d_listT;\n", i), printf("#undef new_list_%d\n", i), printf("#define new_list_%d(...) \\\n", i), printf(" _Generic((FIRST(__VA_ARGS__,~)), \\\n"), printf(" new_list_%d_T: \\\n", i), printf(" new_list_T(new_list_%d_listT, __VA_ARGS__), \\\n", i), printf(" default: \\\n"), printf(" new_list_%d(__VA_ARGS__)) \n", i - 1); printf("#else\n"); printf("#error \"ran out of slots for =new_list_N=\"\n"); printf("#endif\n"); printf("\n"); printf("#ifndef CAST_AS_ALIKE_LIST\n"); printf("#define CAST_AS_ALIKE_LIST_N 1\n"); printf("#define CAST_AS_ALIKE_LIST(...) " "CAST_AS_ALIKE_LIST_%d(__VA_ARGS__)\n", SLOTS); for (int i = SLOTS; i > 0; i--) printf("#define CAST_AS_ALIKE_LIST_%d(...) " "CAST_AS_ALIKE_LIST_%d(__VA_ARGS__)\n", i, i - 1); printf("#define CAST_AS_ALIKE_LIST_0(from, to) (from)\n"); printf("#endif\n"); printf("\n"); printf("#if 0\n"); for (int i = 1; i <= SLOTS; i++) printf("#elif CAST_AS_ALIKE_LIST_N == %d\n", i), printf("#undef CAST_AS_ALIKE_LIST_N\n"), printf("#define CAST_AS_ALIKE_LIST_N %d\n", i + 1), printf("typedef list(T) CAST_AS_ALIKE_LIST_%d_T;\n", i), printf("#undef CAST_AS_ALIKE_LIST_%d\n", i), printf("#define CAST_AS_ALIKE_LIST_%d(from, to) \\\n", i), printf(" _Generic((to), \\\n"), printf(" CAST_AS_ALIKE_LIST_%d_T: \\\n", i), printf(" ((CAST_AS_ALIKE_LIST_%d_T *) (from)), \\\n", i), printf(" default: \\\n"), printf(" CAST_AS_ALIKE_LIST_%d(from, to)) \n", i - 1); printf("#else\n"); printf("#error \"ran out of slots for =CAST_AS_ALIKE_LIST=\"\n"); printf("#endif\n"); }
$ cat main.c #include <stdio.h> #include "list.h" int main(void) { for (char *s in new_list("Hell", "o, Wo", "rld!\n")) printf("%s", s); return 0; } $ cc -w -o main main.c $ ./main Hello, World!
3.2 The Double-Generic Hack
when compiling the same code without warnings disabled, gcc
spits out
a sea of diagnostics repeating "integer from pointer without a cast". this
is because C's _Generic
happens to be Really Stupid And Dumb.
typedef struct { int _ : 1 } A; typedef struct { int _ : 1 } B; void accepts_a(A a) { (void) a; } void accepts_b(B b) { (void) b; } A a; _Generic(a, A: accepts_a(a), B: accepts_b(a));
$ cc -w -o main main.c ./main.c:11:45: error: passing 'A' to parameter of incompatible type 'B'
yes. _Generic
still type checks the failing branch. it's not like c++'s
if constexpr
, not even remotely.
however, there IS a workaround:
_Generic(a, A: accepts_a(_Generic(a, A: a, default: (A){0})), B: accepts_b(_Generic(a, B: a, default: (B){0})))
just. the amount. of brain-damage on display here. is. immeasurable.
hey, remember that program we used to generate the cases for new_list
?
we're gonna have to alter it a tad bit.
#include <stdio.h> int main(void) { for (int i = 3; i < 64; i++) printf("#define new_int_list_T_%d(T, U, x, ...) " "((T) { .car = _Generic((x), U: (x), default: (U){0})," " .cdr = &new_int_list_T_%d(T, U, __VA_ARGS__) })\n", i, i - 1 ); }
we also need to change the code that calls this function to pass both T
and list(T)
through typedef
s.
there are a couple of other warnings that are emitted by this code, fixing
the rest is left as an exercise to the reader
4 Conclusion
when writing cursed c preprocessor hacks, the best hacks use the preprocessor as little as possible, and offload everything else to the later stages of the compiler.