C programming style

Daniel Andersson
2024-08-06

After using C as my primary programming language for a few more years since modern C, I have made some more changes to my coding style and habits.

Historically, the size of int and the other basic types depended on the platform the code was compiled for. Nowadays pretty much all relevant platforms have 4-byte ints, but C99 added the stdint.h header, whose fixed-width types make the bit size explicit. For convenience I shorten those type names with typedefs.

#include <stdint.h>
typedef uint64_t u64; typedef uint32_t u32; typedef uint16_t u16; typedef uint8_t u8;
typedef int64_t s64; typedef int32_t s32; typedef int16_t s16; typedef int8_t s8;
typedef float f32; typedef double f64;
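The stdint.h types have their widths by definition, but f32 and f64 only have the sizes their names claim if the platform uses 32-bit floats and 64-bit doubles. A minimal sketch, assuming a C11 compiler, of checking those assumptions at compile time:

```c
#include <stdint.h>

typedef uint64_t u64; typedef uint32_t u32; typedef uint16_t u16; typedef uint8_t u8;
typedef int64_t s64; typedef int32_t s32; typedef int16_t s16; typedef int8_t s8;
typedef float f32; typedef double f64;

// C11 compile-time checks: the build stops if one of these is false.
// The integer checks also assume 8-bit bytes (CHAR_BIT == 8).
_Static_assert(sizeof(u8)  == 1 && sizeof(s8)  == 1, "8-bit types must be 1 byte");
_Static_assert(sizeof(u16) == 2 && sizeof(s16) == 2, "16-bit types must be 2 bytes");
_Static_assert(sizeof(u32) == 4 && sizeof(s32) == 4, "32-bit types must be 4 bytes");
_Static_assert(sizeof(u64) == 8 && sizeof(s64) == 8, "64-bit types must be 8 bytes");
_Static_assert(sizeof(f32) == 4, "f32 assumes a 4-byte float");
_Static_assert(sizeof(f64) == 8, "f64 assumes an 8-byte double");
```

The checks cost nothing at run time, and a port to an unusual platform fails loudly at build time.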

One of the most important changes I’ve made is to stop using null-terminated strings. Using them improperly is the cause of many bugs and exploits. I instead use something like

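#include <string.h> // strlen, used by str8c

typedef struct str8 str8;
struct str8 {
    u8 *str;   // not necessarily null-terminated
    u64 count; // length in bytes
};
// Wraps a null-terminated C string without copying it. Note: evaluates cstr twice.
#define str8c(cstr) ((str8){.str=(u8 *)(cstr), .count=strlen(cstr)})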

You can even allocate an extra byte at the end of the string for a terminating zero and stay compatible with functions like printf that expect null-terminated strings. Using str8 in combination with arena/bump, frame and/or scratch allocators makes strings a lot nicer to use in C.
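A minimal sketch of the extra-byte idea. The arena struct, arena_push and str8_copy are hypothetical stand-ins invented for this example, not the allocator or API actually used:

```c
#include <stdint.h>
#include <string.h>

typedef uint8_t u8; typedef uint64_t u64;
typedef struct str8 str8;
struct str8 { u8 *str; u64 count; };

// Hypothetical fixed-size bump arena (assumes used <= cap).
typedef struct arena arena;
struct arena { u8 *base; u64 used; u64 cap; };

// Hands out the next `size` bytes, or NULL when the arena is full.
// Only bytes are allocated here, so alignment is not handled.
static void *arena_push(arena *a, u64 size) {
    if (size > a->cap - a->used) return NULL;
    void *p = a->base + a->used;
    a->used += size;
    return p;
}

// Copies `s` into the arena and appends a 0 byte that is NOT part of count.
// Returns an empty str8 if the arena is out of space.
static str8 str8_copy(arena *a, str8 s) {
    str8 out = {0};
    if (s.count == UINT64_MAX) return out;      // count + 1 would wrap around
    u8 *mem = arena_push(a, s.count + 1);
    if (!mem) return out;
    if (s.count) memcpy(mem, s.str, (size_t)s.count);
    mem[s.count] = 0;                           // terminator, not counted
    out.str = mem;
    out.count = s.count;
    return out;
}
```

A string made this way can be handed straight to C APIs, for example printf("%s\n", (char *)s.str), while the rest of the code keeps using s.count and never scans for the terminator.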

str8 structs can just be passed around by value. Treating strings backed by an arena as immutable makes substringing practically free. You just set the pointer and count to where they need to be, with no need to allocate anything. A caveat is that a substring does not end in its own zero byte, so the null-terminated fallback is lost, and passing one to a function that expects a C string may lead to confusing bugs.
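As a sketch of how cheap that is, here is a hypothetical str8_substr helper (the name and the clamping behavior are assumptions of this example):

```c
#include <stdint.h>

typedef uint8_t u8; typedef uint64_t u64;
typedef struct str8 str8;
struct str8 { u8 *str; u64 count; };

// Returns a view of the bytes [first, first + len) of s, clamped to s.
// Nothing is allocated or copied; the result points into s's memory.
static str8 str8_substr(str8 s, u64 first, u64 len) {
    if (first > s.count) first = s.count;
    if (len > s.count - first) len = s.count - first;
    return (str8){ .str = s.str + first, .count = len };
}
```

Because the view shares memory with its parent, printing it with "%s" would run on to the parent's terminator. With printf, something like printf("%.*s", (int)sub.count, (char *)sub.str) prints only the bytes that belong to the view.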

If someone else were to work on a C codebase I lead, I would go so far as to ban all the C string functions, like strncpy. I would also ban the use of C locales. They are legacy and should not be used; everything is UTF-8 now. setlocale() sets global state, which is bad for threading. It may also break libraries, since setting a locale changes how some standard functions behave. One example is whether numbers use “,” or “.” as the decimal separator, which breaks basic string conversions, number parsing, and file formats like CSV.

To enforce the ban, you would put the BANNED macro in its own header file together with the banned functions. Shortened and simplified a little, it would look like

#include <string.h>
#include <stdio.h>

#define BANNED(func) func##_is_a_banned_function

#undef strcpy
#define strcpy(x,y) BANNED(strcpy)

int main(void) {
  char *hello = "Hello World!";
  char tmp[64] ={0};
  strcpy(tmp, hello);
  printf("%s\n", tmp);
  return 0;
}

When that is compiled, you get error: use of undeclared identifier 'strcpy_is_a_banned_function' and some additional notes about the macro expansion.
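With strcmp and friends banned, comparisons have to go through the counted type. A minimal sketch of one such replacement; str8_eq is a hypothetical name, and it leans on memcmp, which works on a byte count and never looks for a terminator:

```c
#include <stdint.h>
#include <string.h>

typedef uint8_t u8; typedef uint64_t u64;
typedef struct str8 str8;
struct str8 { u8 *str; u64 count; };

// Byte-wise equality. Lengths are compared first, so neither string
// needs a terminator and no byte past count is ever read.
static int str8_eq(str8 a, str8 b) {
    if (a.count != b.count) return 0;
    if (a.count == 0) return 1;   // skip memcmp: pointers may be NULL here
    return memcmp(a.str, b.str, (size_t)a.count) == 0;
}
```

Since this compares raw bytes, it is unaffected by whatever locale the process happens to be in.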

If files are to be read from disk in their entirety, I’ll still use the str8 struct to store the contents, even if it is binary data. This is in order to minimize the number of structs used.

str8 os_read_file(arena *a, str8 path);

The alternative is to have another struct for a memory buffer, but I do not think it is worth it.

typedef struct memory_buffer memory_buffer;
struct memory_buffer {
    u8 *buffer;
    u64 size;
};
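For illustration, a sketch of what an os_read_file with that signature could look like. This is not the real implementation: it uses portable stdio where an os_ layer would more likely call the platform API, the arena is a hypothetical stand-in, and the 4096-byte path limit is arbitrary.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint8_t u8; typedef uint64_t u64;
typedef struct str8 str8;
struct str8 { u8 *str; u64 count; };

// Hypothetical fixed-size bump arena (assumes used <= cap).
typedef struct arena arena;
struct arena { u8 *base; u64 used; u64 cap; };

static void *arena_push(arena *a, u64 size) {
    if (size > a->cap - a->used) return NULL;
    void *p = a->base + a->used;
    a->used += size;
    return p;
}

// Reads the whole file into the arena, text or binary alike.
// Returns an empty str8 (str == NULL) on any failure.
static str8 os_read_file(arena *a, str8 path) {
    str8 out = {0};
    char cpath[4096];                         // fopen needs a terminated path
    if (path.count == 0 || path.count >= sizeof cpath) return out;
    memcpy(cpath, path.str, (size_t)path.count);
    cpath[path.count] = 0;

    FILE *f = fopen(cpath, "rb");
    if (!f) return out;
    u64 saved = a->used;                      // to roll back on a short read
    long size = -1;
    // Assumes seeking to the end works for binary streams, as it does on
    // common desktop platforms.
    if (fseek(f, 0, SEEK_END) == 0) size = ftell(f);
    if (size >= 0 && fseek(f, 0, SEEK_SET) == 0) {
        u8 *mem = arena_push(a, (u64)size + 1);   // +1 for an uncounted 0 byte
        if (mem && fread(mem, 1, (size_t)size, f) == (size_t)size) {
            mem[size] = 0;
            out.str = mem;
            out.count = (u64)size;
        } else {
            a->used = saved;
        }
    }
    fclose(f);
    return out;
}
```

The count carries the real length, so zero bytes inside binary data are harmless, and the extra terminator keeps text files usable with C APIs.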

I also have my own assert macro. Having your own is much better for ergonomics: the debugger stops at the assert itself and not 5 function calls down in the standard library.
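A minimal sketch of such a macro. The names ASSERT and debug_trap are made up for this example, and the trap intrinsics are compiler-specific, with abort() as the fallback:

```c
#include <stdio.h>

// Pick a trap that stops execution right here, in the caller's frame.
#if defined(_MSC_VER)
  #include <intrin.h>
  #define debug_trap() __debugbreak()
#elif defined(__GNUC__) || defined(__clang__)
  #define debug_trap() __builtin_trap()
#else
  #include <stdlib.h>
  #define debug_trap() abort()
#endif

// Evaluates cond exactly once. On failure it reports the location and
// traps inline, so the debugger halts on the line containing the ASSERT.
#define ASSERT(cond) \
    do { \
        if (!(cond)) { \
            fprintf(stderr, "%s:%d: assertion failed: %s\n", \
                    __FILE__, __LINE__, #cond); \
            debug_trap(); \
        } \
    } while (0)
```

The do/while (0) wrapper makes the macro behave like a single statement, so it is safe inside an if/else without braces.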

I mark all functions static, except for entry points in binaries and API functions in libraries. This saves link time and doesn’t pollute the exported symbols.

If the codebase is less than 500K lines of code, and I don’t really know where the limit is, unity builds are preferable. Just including everything in one place in the correct order simplifies things a lot and speeds up compiling and iterating. C++23’s vector header is about 28000 lines of code. Having 50 translation units all include the common vector header slows down compilation and iteration. I’m ignoring the time for template instantiations here and I’m just talking about ‘compiling’ the same code over and over again.

#include <stdio.h>
//...
#include <math.h>

#include "base.h"
#include "os.h"
// ...
#include "project.h"

#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

#include "base.c"
#include "os.c"
// ...
#include "project.c"

int main(int argc, char *argv[])
{
  project_entrypoint();
  return 0;
}