Anyone who's written C knows that fully ISO C standard-conforming code is an impractical rarity. Most real-world C code out there relies on non-standard behaviors and language extensions to varying extents, and a lot of this isn't for extra features, but just to work around bugs and gaps in different compilers and libraries. A lot of codebases make some attempt to support various environments, mostly through preprocessor checks and guards, but these attempts are finicky at best and straight up broken at worst.
I have run into many of these situations while working on my C compiler, so here's a short list of some of them.
The system's C library headers are the first 'obstacle' for a C compiler
aspiring to be useful. If you can't preprocess and parse <stdio.h>, you
won't get past hello world. Because I use GNU/Linux, that means glibc. Now, to
their credit, glibc does try to retain compatibility of its headers on non-GCC
compilers. In the monstrosity that is sys/cdefs.h, which is indirectly
included by every libc header, they use all kinds of preprocessor checks for
compiler-predefined macros to determine what kinds of compiler extensions are
supported, and #define things away when they aren't.
Unfortunately, this is just broken sometimes. For example, on Linux, struct epoll_event from sys/epoll.h is a packed struct,
which uses the GNU __attribute__((packed)). Because this changes the struct
layout (on 64 bits), you can't ignore it without breaking ABI. So okay, say you implement
support for __attribute__((packed)) in your compiler. But this isn't enough,
because the aforementioned sys/cdefs.h contains this code:
/* GCC, clang, and compatible compilers have various useful declarations
that can be made with the '__attribute__' syntax. All of the ways we use
this do fine if they are omitted for compilers that don't understand it. */
#if !(defined __GNUC__ || defined __clang__ || defined __TINYC__)
# define __attribute__(xyz) /* Ignore */
#endif
If you aren't gcc, clang, or tcc, tough luck.
That said, the epoll header is Linux-specific, so you could argue
that holding it to standard C portability criteria is unfair.
Some C headers are supposed to be provided by the compiler because they should
be present even on freestanding implementations, and rely on
compiler-internal definitions. On my computer, for example, these live in
/usr/lib/gcc/x86_64-pc-linux-gnu/16.1.1/include/ for GCC and
/usr/lib/clang/22/include/ for clang. These builtin headers include
stddef.h, stdint.h, limits.h, float.h and more. However, POSIX requires
limits.h to define some POSIX-specific constants in addition to the standard
C constants. So you still need a platform-specific limits.h on top of the compiler's.
glibc's <limits.h> looks like this (abridged):
...
/* If we are not using GNU CC we have to define all the symbols ourself.
Otherwise use gcc's definitions (see below). */
#if !defined __GNUC__ || __GNUC__ < 2
/* We only protect from multiple inclusion here, because all the other
#include's protect themselves, and in GCC 2 we may #include_next through
multiple copies of this file before we get to GCC's. */
# ifndef _LIMITS_H
# define _LIMITS_H 1
/* We don't have #include_next. Define ANSI <limits.h> for standard 32-bit words. */
/* These assume 8-bit `char's, 16-bit `short int's, and 32-bit `int's and `long int's. */
# define CHAR_BIT 8
...
# endif /* limits.h */
#endif /* GCC 2. */
#endif /* !_LIBC_LIMITS_H_ */
/* Get the compiler's limits.h, which defines almost all the ISO constants.
We put this #include_next outside the double inclusion check because
it should be possible to include this file more than once and still get
the definitions from gcc's header. */
#if defined __GNUC__ && !defined _GCC_LIMITS_H_
/* `_GCC_LIMITS_H_' is what GCC's file defines. */
# include_next <limits.h>
#endif
/* The <limits.h> files in some gcc versions don't define LLONG_MIN, LLONG_MAX,
and ULLONG_MAX. */
#if defined __USE_ISOC99 && defined __GNUC__
# ifndef LLONG_MIN
# define LLONG_MIN (-LLONG_MAX-1)
# endif
...
#endif
#ifdef __USE_POSIX
/* POSIX adds things to <limits.h>. */
# include <bits/posix1_lim.h>
#endif
...
To work correctly, it depends on macros defined by the gcc-specific builtin limits.h (such as _GCC_LIMITS_H_),
on top of the use of the #include_next extension. Even clang has to work around this silliness.
SDL_endian.h has a goofy bit of feature detection for its byteswapping functions. The purpose is to use compiler builtins or inline assembly whenever possible, and only fall back to portable generic bitwise operations as a last resort. But the way it goes about this is with the following logic:
- if (GCC or clang) and __has_builtin(__builtin_bswapX) -> use builtins
- else if (msvc >= v8.0) -> use msvc intrinsic #pragma
- else if defined (ISA-specific macro like __x86_64__) -> use inline assembly
- else -> use generic impl with regular bitwise operations
This means that if you aren't GCC or clang, but you define the ISA-specific
predefined macro (for good reasons), it will try to use (extended) inline
assembly, even if you have the bswap builtins and provide the __has_builtin
special operator. Seems a little odd to expect an unknown compiler to support
GCC-style extended inline assembly.
Some OpenBSD headers include inline function definitions intended
to be used optionally by the compiler when optimizing. These are defined with the
macro __only_inline, e.g.:
__only_inline int sigemptyset(sigset_t *__set)
{
*__set = 0;
return (0);
}
and are supposed to fall back to the 'real' external symbol when/if the compiler
doesn't actually inline it. In other words, an inline function with external
linkage. These are, generally, a mess: though they are specified in C99, the
standard behavior conflicts with the pre-C99 non-standard GCC behavior (default
prior to 4.2). In short, under the old GNU semantics, the definition in the header uses extern inline with the function body, which does not emit the actual exported
function, and one translation unit declares the function with
just inline to export its definition there. C99 swaps the roles of the two spellings. To further add to the confusion,
the meaning of inline differs between C++ and C. Refer to this good article by
Youtao Guo.
So, OpenBSD relies on GCC's inline semantics and to paper over the GCC version
discrepancy, the __only_inline macro in sys/cdefs.h uses an
explicit __attribute__ on newer GCC versions to specify the old gnu89 inline
semantics. But on non-GNU compilers, it gets defined to static linkage,
which breaks because it ends up declaring/defining functions with conflicting linkage.
Luckily, they respect a macro, _ANSI_LIBRARY, that when defined entirely omits
the use of these broken __only_inline definitions in standard headers like
signal.h. So you don't get the "optimized versions" (not that it probably
makes a big difference), but at least it works.
I also ran across Gnulib's compatibility code
for extern inline when building Guile and nano, which highlights all the broken
and weird implementations of this corner case of C. See extern-inline.m4 for comments
explaining it, but here's an excerpt:
#if (((defined __APPLE__ && defined __MACH__) \
|| defined __DragonFly__ || defined __FreeBSD__) \
&& (defined HAVE___HEADER_INLINE \
? (defined __cplusplus && defined __GNUC_STDC_INLINE__ \
&& ! defined __clang__) \
: ((! defined _DONT_USE_CTYPE_INLINE_ \
&& (defined __GNUC__ || defined __cplusplus)) \
|| (defined _FORTIFY_SOURCE && 0 < _FORTIFY_SOURCE \
&& defined __GNUC__ && ! defined __cplusplus))))
# define _GL_EXTERN_INLINE_STDHEADER_BUG
#endif
#if ((__GNUC__ \
? (defined __GNUC_STDC_INLINE__ && __GNUC_STDC_INLINE__ \
&& !defined __PCC__) \
: (199901L <= __STDC_VERSION__ \
&& !defined __HP_cc \
&& !defined __PGI \
&& !(defined __SUNPRO_C && __STDC__))) \
&& !defined _GL_EXTERN_INLINE_STDHEADER_BUG)
# define _GL_INLINE inline
# define _GL_EXTERN_INLINE extern inline
# define _GL_EXTERN_INLINE_IN_USE
#elif (2 < __GNUC__ + (7 <= __GNUC_MINOR__) && !defined __STRICT_ANSI__ \
&& !defined __PCC__ \
&& !defined _GL_EXTERN_INLINE_STDHEADER_BUG)
# if defined __GNUC_GNU_INLINE__ && __GNUC_GNU_INLINE__
/* __gnu_inline__ suppresses a GCC 4.2 diagnostic. */
# define _GL_INLINE extern inline __attribute__ ((__gnu_inline__))
# else
# define _GL_INLINE extern inline
# endif
# define _GL_EXTERN_INLINE extern
# define _GL_EXTERN_INLINE_IN_USE
#else
# define _GL_INLINE _GL_UNUSED static
# define _GL_EXTERN_INLINE _GL_UNUSED static
#endif
...okay. How lovely.
bionic is Android's
libc. In an original twist, its headers heavily assume clang instead of gcc.
They're full of clang-specific extensions like _Nonnull and
_Null_unspecified for nullability checks, among other things. Luckily, it's
not difficult to #define those away with command-line flags.
Also, the only reason I've run into this is because I've been using my Android phone with Termux as a native aarch64 development environment (lol), and that has bionic headers.
I realize I've mostly ended up writing about libc headers. I could go on with countless other examples, but I have exams soon and I have procrastinated enough.
While it is very annoying to deal with the way many open source projects rely on compiler-specific non-standard extensions and behaviors for non-essential things, it's not fair either to ask every developer to test their C code with different compilers, including obscure or small ones. C portability is hard enough as is. From the point of view of someone writing a compiler, the possible solutions are:
1. Try to patch these incompatibilities upstream.
2. Amass enough popularity to warrant developers adding dedicated #ifdef checks and testing on your compiler by default.
3. Deal with these "downstream", perhaps distributing your patches.
4. Pretend to be (some version of) GCC and implement its extensions.
(1) seems like a losing battle, (3) is the easiest, (4) is the realistic
(though arduous) way to support a large number of codebases with minimal
disruption to users of your compiler and developers of said codebases. For
example, clang defines __GNUC__=4 (and __GNUC_MINOR__=2,
__GNUC_PATCHLEVEL__=1) to claim compatibility with GCC 4.2.1. Although by
this point clang is at (2), it took significant effort to e.g. get clang to
compile the Linux kernel (needing patches in both projects).
Of course, a problem with (4) is that there are also many codebases that will
check for #ifdef __GNUC__ and freely use all kinds of newer GCC extensions if that
macro is defined, without version checks. So you end up playing catch-up, which is
one reason why clang doesn't bump its __GNUC__ macros even though it supports
GNU extensions newer than those of 4.2.1 (see this discussion).
Ideally, feature test macros like __has_builtin, __has_feature,
__has_attribute, and even the standard ones like __STDC_NO_VLA__
would be more widely used instead of compiler-specific guards and version checks.
For now, the GCC/clang quasi-duopoly is the status quo in *NIX land, for better or worse. Kudos to the developers of smaller independent C compilers: tcc, cproc, scc, vbcc, nwcc, kefir, and more.