C Growable Arrays: In Depth

by: Ethan McCue

An extremely common question people have when using C for the first time is how to make an array of elements that can grow over time.

I know this is a common question because one older post on this website where I explained the concept (badly) gets tons of organic traffic.

It's not a bad question either. Nearly every language you might be coming at C from has an equivalent.

And, ignoring primacy, C classes often have students make data structures for their assignments.

So I figure it might be useful to at least one person to give a walkthrough of how that data structure works in the C world.

Just keep in mind that I am not a professional C programmer. If I get anything wrong or there is something you wish I mentioned, feel free to mention it in the comments below or wherever. I'll make corrections.

Arrays

An array is a fixed size collection of data.

int numbers[] = {
    1,
    2,
    3
};

Being fixed size means that if an array starts out with enough space for 3 elements, there is no way to make space for a 4th.

C arrays are also, more or less, equivalent to pointers that just so happen to point to the start of a chunk of memory. So whenever you see something like int[] in code you can mentally translate that to int*.

Most languages' array-equivalents can have their size queried at runtime. C is a bit special in that there is no way to recover the number of elements in an array after you make it. It is just a pointer to a chunk of memory after all.

This means you have two basic options for being able to figure out the size of an array.

1. Have a sentinel terminate the array

One way to be able to figure out the size of an array is to put a special sentinel value as its last element. Code working with the array can then proceed forward until that special value is reached.

This may or may not be an option depending on the kind of data being stored in an array. The most common use of this actually comes from how C stores strings.

#include <stdio.h>

int main()
{
    char* hello1 = "Hello";
    char hello2[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
    printf("%s\n", hello1);
    printf("%s\n", hello2);

    return 0;
}

"C style strings" are an array of characters terminated by a null character. If you want to find the length of something like this, just keep looping until you get to that terminator.

#include <stdio.h>

int main()
{
    char* hello = "Hello";
    
    int i = 0;
    while (hello[i] != '\0') {
        printf("%c\n", hello[i]);
        i++;
    }
    return 0;
}

An upside to this approach is that its simple to understand. A downside is that you need to go through every element in the array to find its size, which is a pain.

2. Store it when you make the Array

If we don't want to have the null terminator we need to store a number.

One way to do this is to just manually count out how big an array is.

#include <stdio.h>

int main()
{
    int numbers[] = {
        6,
        4,
        7
    };
    
    // I counted it with my eyes
    int numbers_size = 3;
    
    int i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }
    return 0;
}

If when you initialize your array you write the number of elements directly, you can make use of sizeof to calculate the size.

#include <stdio.h>

int main()
{
    int numbers[3] = {
        6,
        4,
        7
    };
    
    // Because of the literal [3] above, C can figure out
    // how many elements there are by dividing the total
    // size of the array by the size of an individual
    // element.
    int numbers_size = sizeof(numbers) / sizeof(numbers[0]);
    
    int i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }
    return 0;
}

You then need to handle passing that size around with the array whenever you give it to a function that takes an array as an argument. A good deal of the C standard library works like this.

size_t

Small digression. When you make a variable to store an index into array or that stores the size of an array you are intended to use size_t.

If you don't it seems like its usually "fine," but I wouldn't risk the wrath of the undefined behavior demons.

To have size_t be available you should put #include <stddef.h> at the top of your program.

#include <stddef.h>
#include <stdio.h>

int main()
{
    int numbers[3] = {
        6,
        4,
        7
    };
    
    size_t numbers_size = sizeof(numbers) / sizeof(numbers[0]);
    
    size_t i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }
    return 0;
}

Heap Allocation

At runtime, you can get an arbitrarily large block of memory in various ways.

The most commonly known is malloc. You give it a size then it gives you a pointer to the start of that memory. You need #include <stdlib.h> to use it.

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h> 

int main()
{
    size_t numbers_size = 3;
    int* numbers = malloc(sizeof(int) * numbers_size);
    numbers[0] = 6;
    numbers[1] = 4;
    numbers[2] = 7;
    
    
    size_t i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }
    return 0;
}

The one we will use is calloc. It works mostly the same as its cousin malloc with two major differences.

The first is that you don't give it the full size of the array you want. You give it the size of each element and the number of elements you want seperately.

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h> 

int main()
{
    size_t numbers_size = 3;
    int* numbers = calloc(sizeof(int), numbers_size);
    numbers[0] = 6;
    numbers[1] = 4;
    numbers[2] = 7;
    
    
    size_t i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }
    return 0;
}

The second is that the memory returned is already "zeroed." This means that you know that every element is in its zero-valued state. So for int it will literally be 0, _Bools will be false, pointers will be NULL, etc.

Often that doesn't matter but, because there isn't "uninitialized" memory with random data in it, it feels more predictable.

For both approaches you need to later free that allocated memory. You will be technically exempt from needing to do this if your program doesn't run for long enough to run out of memory. I think it is best to be a "good citizen" and free your memory regardless.

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h> 

int main()
{
    size_t numbers_size = 3;
    int* numbers = calloc(sizeof(int), numbers_size);
    numbers[0] = 6;
    numbers[1] = 4;
    numbers[2] = 7;
    
    
    size_t i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }
    
    free(numbers);
    
    return 0;
}

Technically speaking malloc, calloc, etc. can fail if the system is out of memory. We are going to ignore that possibility for the rest of this, but the lower level software you write the larger chance you will need to care about very limited memory scenarios.

Single Type Growable Arrays

The basic concept of a growable array is to group three pieces of information. A pointer to an array of things, the number of elements allocated for that array, and the number of elements "actually" in the array.

struct GrowableIntArray {
    int* data;
    size_t allocated;
    size_t size;
}

So for a new array with nothing in it, the data pointer would be null and both numbers would be 0.

struct GrowableIntArray growable_int_array_empty() {
    struct GrowableIntArray empty = {
       .data = NULL,
       .allocated = 0,
       .size = 0
    };
    
    return empty;
}

To add an element to the array, you check if adding the element would make the size of the array larger than what was allocated.

If it won't, you set an element in your data array and bump the size.

If it will, you need to make a new array that is bigger than the last one. How much bigger is more art than science, but generally people find success allocating around twice as many elements as were there before.

That sounds crazy, but at worst you are only wasting half your memory. That's not that bad in the scope of things.

Then you copy over all the elements from the last array and free the old one.

void growable_int_array_add(
    struct GrowableIntArray* array, 
    int value
) {
    // If we wouldn't have enough room
    if (array->size + 1 > array->allocated) {
       // Double the size of the last array
       size_t new_allocated;
       if (array->size == 0) {
           new_allocated = 2;
       }
       else {
           new_allocated = array->size * 2;
       }
       
       // Make a new array that size
       int* new_data = calloc(sizeof(int), new_allocated);
       int* old_data = array->data;
       
       // Copy all the old elements to it
       if (old_data != NULL) {
           for (size_t i = 0; i < array->size; i++) {
               new_data[i] = old_data[i];
           }
       }
       
       // Then free the old array
       free(old_data);
       
       // And patch up the pointers 
       array->data = new_data;
       array->allocated = new_allocated;
    }
    
    // And put in the new element
    array->data[array->size] = value;
    array->size++;
}

Which is a chunky function, but now you should be good to go on making something which you can use as an array but which dynamically grows as elements are added.

int main()
{
    struct GrowableIntArray numbers 
        = growable_int_array_empty();
        
    growable_int_array_add(&numbers, 6);
    growable_int_array_add(&numbers, 4);
    growable_int_array_add(&numbers, 7);

    
    size_t i = 0;
    while (i < numbers.size) {
        printf("%d\n", numbers.data[i]);
        i++;
    }
    return 0;
}

From there it's all a matter of personal taste. Many would want to implement their own growable_int_array_size and growable_int_array_get. Both of these are relatively straight forward and useful if your goal is to avoid accessing struct members directly

size_t growable_int_array_size(
    struct GrowableIntArray* array
) {
    return array->size;
}


int growable_int_array_get(
    struct GrowableIntArray* array, 
    size_t i
) {
    // You can do precondition checks and crash early if someone
    // tries to out of bounds if you want.
    return array->data[i];
}
int main()
{
    struct GrowableIntArray numbers 
        = growable_int_array_empty();
        
    growable_int_array_add(&numbers, 6);
    growable_int_array_add(&numbers, 4);
    growable_int_array_add(&numbers, 7);

    
    size_t i = 0;
    while (i < growable_int_array_size(&numbers)) {
        printf("%d\n", growable_int_array_get(&numbers, i));
        i++;
    }
    return 0;
}

But all of this has a major flaw. Do you see it?

It only works with ints! If you want to have a growable array of longs or Positions or whatever, you need to copy and paste all of this code, change the types around, and make brand-new functions.

What we want is the ability to write code for a growable array once and then have that work for any kind of data we want to store. That gives leaves us with two options.

  1. Make a growable array that can be used for anything at runtime
  2. Make a growable array that can be specialized for anything at compile-time.

Runtime Generic Growable Arrays

What do an int, a char and a struct Position have in common? Nothing. Save some really strange layout choices by a compiler, all of these data types require different amounts of memory.

What do an int*, a char*, and a struct Position* have in common? Turns out all of them can be safely converted to and from a void*.

#include <stdio.h>

int main()
{
    int eight = 8;
    
    int* eightPointer = &eight;
    void* voidPointer = (void*) eightPointer;
    eightPointer = (int*) voidPointer;
    
    printf("%d\n", *eightPointer);

    return 0;
}

A void* is a pointer to "something." The C compiler forgets what kind of information is actually stored in it. All pointers in C have the same size, so now we have our way of storing anything.

struct GrowableArray {
    void** data;
    size_t allocated;
    size_t size;
}

At first, it might seem like we can just do that and find+replace int with void* in the code from before. And you'd be right. Just be aware that things which were once int* will become void**. A pointer to an array of void pointers.

void growable_array_add(
    struct GrowableArray* array, 
    void* value
) {
    // If we wouldn't have enough room
    if (array->size + 1 > array->allocated) {
       // Double the size of the last array
       size_t new_allocated;
       if (array->size == 0) {
           new_allocated = 2;
       }
       else {
           new_allocated = array->size * 2;
       }
       
       // Make a new array that size
       void** new_data = (void**) calloc(
            sizeof(void*), 
            new_allocated
       );
       void** old_data = array->data;
       
       // Copy all the old elements to it
       if (old_data != NULL) {
           for (size_t i = 0; i < array->size; i++) {
               new_data[i] = old_data[i];
           }
       }
       
       // Then free the old array
       free(old_data);
       
       // And patch up the pointers 
       array->data = new_data;
       array->allocated = new_allocated;
    }
    
    // And put in the new element
    array->data[array->size] = value;
    array->size++;
}

Usability

The first problems that will arise are around usability.

To pass a pointer in, it can't be an rvalue. An rvalue is something that should go on the right hand size of an equals sign. That's where the r comes from.

This means that you can't just directly pass in a pointer to an int.

growable_array_add(&numbers, &6);

&6 doesn't have a meaning to C. You need to have constant values first assigned to a variable.

int n = 6;
growable_array_add(&numbers, &n);

This can be annoying to write out, but you might get used to it. Even harder to come to terms with is needing to recover the type of a pointer whenever you get it out.

You need to both convert the void* to an int* or whatever actual type you stored and, if it's something like int, dereference that pointer to get at the actual value.

int value = *((int*) growable_array_get(&numbers, i));

The C compiler doesn't take kindly to mishandled void*s. If you get this wrong you get teleported to Florida.

int main()
{
    struct GrowableArray numbers 
        = growable_array_empty();
      
    int a = 6;
    int b = 4;
    int c = 7;
    
    growable_array_add(&numbers, &a);
    growable_array_add(&numbers, &b);
    growable_array_add(&numbers, &c);

    
    size_t i = 0;
    while (i < growable_array_size(&numbers)) {
        printf("%d\n", *((int*) growable_array_get(&numbers, i)));
        i++;
    }
    return 0;
}

Pointer Lifetimes

Pointers don't all "live" the same amount of time. You can take a pointer to a local variable, but that pointer is only valid so long as you are still within that function.

int* example() {
   int x = 5;
   int* xPointer = &x;
   
   // Can use xPointer freely
   
   // But if you return the pointer out it won't
   // be valid
   return xPointer;
}

You can make pointers that live longer with calloc, but you later need to call free on them.

int* example() {
   int* xPointer = calloc(sizeof(int), 1);
   *xPointer = 5;
   
   // Valid to return, but something eventually
   // should free it.
   return xPointer;
}

This presents a problem for our array of void*s. If all the pointers are pointing to local variables on the stack then your cleanup should just be to call free on array.data.

void growable_array_cleanup(struct GrowableArray array) {
    free(array.data);
}

But if the pointers are pointing to heap allocated memory then someone needs to clean them up later.

void growable_array_cleanup(struct GrowableArray* array) {
    for (size_t i = 0; i < array->size; i++) {
        free(array->data[i]);
    }
    free(array->data);
}

Even worse than that, some things don't just need to be free-ed. They might have been allocated outside the calloc/free system or, like our GrowableArray, they might have some custom cleanup process.

To deal with the sheer variety of situations we need to store how we want to clean up elements in the array itself.

struct GrowableArray {
    void** data;
    size_t allocated;
    size_t size;
    void (*cleanup)(void*);
}

This syntax - void (*cleanup)(void*) - is how you declare a pointer to a function. In this case a function whose return type is void and whose sole argument is a void*. If it looks confusing to you don't worry. It confuses me too.

struct GrowableIntArray growable_array_empty(
    void (*cleanup)(void*)
) {
    return growable_array_empty(NULL);
}

struct GrowableArray growable_array_empty(
    void (*cleanup)(void*)
) {
    struct GrowableArray empty = {
       .data = NULL,
       .allocated = 0,
       .size = 0
       .cleanup = cleanup
    };
    
    return empty;
}

Once you have the cleanup function stored you can make a general cleanup function for the growable array itself.

void growable_array_cleanup(struct GrowableArray* array) {
    if (array->cleanup != NULL) {
        for (size_t i = 0; i < array->size; i++) {
            array->cleanup(array->data[i]);
        }
    }

    free(array->data);
}

If their data is on the stack, you skip trying to free it. If it needs to be free-ed that can be done same as if you need to call special_framework_destroy.

While this might seem like we've solved the problem, notice that no matter what we need to now track when to call a special growable_array_cleanup, each array has an extra pointer of memory, and has to check if array->cleanup != NULL at close. Everything comes at a cost.

Memory Locality

Following the same theme, this sort of structure is forced to have subpar memory locality.

If you were to make an int array, the memory would be laid out like this with each int being directly next to the others.

-------------
| 5 | 4 | 3 |
-------------

When we make an array of int pointers the memory layout looks like this.

-------------------
| ptr | ptr | ptr |
---|-----|-----|---
   V     |     |
   5     V     |
         4     V
               3

Modern CPUs love going through arrays in order. They hate following pointers. This memory layout is almost guaranteed to lead to subpar performance compared to the tightly packed array.

Notice also that we didn't choose this memory layout because we wanted to. We chose it because we didn't want to write out the data structure more than once.

Compile-Time Generic Growable Arrays

If we don't want everything behind a void* we need a perfect vinaigrette of clever and stupid.

Template Headers

The only things that need to change between a growable int array and a growable struct Position array are the struct names, function names, return types, and arguments.

struct GrowableIntArray {
    int* data;
    size_t allocated;
    size_t size;
};

struct GrowablePositionArray {
    struct Position* data;
    size_t allocated;
    size_t size;
}

But we don't need to do that by hand. We have the C preprocessor.

#define GROWABLE_ARRAY_STRUCT struct GrowableIntArray
#define GROWABLE_ARRAY_DATA_POINTER int*

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};

If you've made a C header file before you've probably seen a prelude like this.

#ifndef SOME_FILE_H
#define SOME_FILE_H

// ... CODE FOR HEADER HERE ...

#endif

The purpose of this is so that if more than one file includes the header the code for it only shows up once.

Here we don't want to do that. We want it to be able to be included multiple times in one compilation.

In our growable_array.h we want to assume that GROWABLE_ARRAY_STRUCT, GROWABLE_ARRAY_DATA_POINTER, and whatever else we need defined are already defined by whatever code is including the header. #ifndef and #error can give some basic guardrails for that.

#ifndef GROWABLE_ARRAY_STRUCT
#error "GROWABLE_ARRAY_STRUCT not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA_POINTER
#error "GROWABLE_ARRAY_DATA_POINTER not defined"
#endif

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};

Then we make one header file for each "specialization" we want of the growable array. So for ints we would make growable_int_array.h and put something like the following in it.

#ifndef GROWABLE_INT_ARRAY_H
#define GROWABLE_INT_ARRAY_H

#define GROWABLE_ARRAY_STRUCT struct GrowableIntArray
#define GROWABLE_ARRAY_DATA_POINTER int*

#include "growable_array.h"

#undef GROWABLE_ARRAY_STRUCT
#undef GROWABLE_ARRAY_DATA_POINTER

#endif

First, the normal header prelude. We don't want the growable int array to be defined more than once. Then we define the variables needed for our template header, include that header, and #undef those variables afterward.

The reason we bother with #undef is the same reason this works in the first place. The C preprocessor just does text replacements. When we include growable_array.h it literally spits the contents of that file in place. If we don't #undef a variable we defined it can lead to some head-scratchers compiling some other file.

But now all other code needs to do is include growable_int_array.h to get a growable array for ints. All we need to do to get a growable array for a specific type is do some #defines. Rinse and repeat for any other kind of growable array we want.

Pointer Lifetimes

Using void* might have forced us to always handle the lifetimes of those pointers and using int or whatever else without indirection lets us skip over memory management.

Unfortunately memory management is a fact of life in C. If the kind of thing we are storing needs to be cleaned up we need to track that.

#include <stddef.h>
#include <stdlib.h> 

#ifndef GROWABLE_ARRAY_STRUCT
#error "GROWABLE_ARRAY_STRUCT not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA_POINTER
#error "GROWABLE_ARRAY_DATA_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_STRUCT_POINTER
#error "GROWABLE_ARRAY_STRUCT_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME
#error "GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME not defined"
#endif

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};

void GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME(
    GROWABLE_ARRAY_STRUCT_POINTER array
) {
    #ifdef GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME
        for (size_t i = 0; i < array->size; i++) {
            GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME(array->data[i]);
        }
    #endif

    free(array->data);
}

The good news is that with the template approach that tracking doesn't need to happen at runtime. The bad news is that it needs to happen in the C preprocessor.

Implementor Experience

You might notice that there wouldn't be a warning if GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME was not defined. There also is a dearth of good names to give these things. It's understandable to get tripped up by the difference between GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME and GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME.

Best case scenario if you fill in one of these preprocessor defines wrongly is that your code doesn't compile. Worst case is that you get some insane and hard to debug behavior.

There will also end up being more than a few #defines you need to make. I'm not making use of concatenation for clarity, but even that doesn't trim the number down that far. If we write out some of the other functions you will see how this can be burdensome.

#ifndef GROWABLE_ARRAY_STRUCT
#error "GROWABLE_ARRAY_STRUCT not defined"
#endif

#ifndef GROWABLE_ARRAY_STRUCT_POINTER
#error "GROWABLE_ARRAY_STRUCT_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA
#error "GROWABLE_ARRAY_DATA not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA_POINTER
#error "GROWABLE_ARRAY_DATA_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_EMPTY_FUNCTION_NAME
#error "GROWABLE_ARRAY_EMPTY_FUNCTION_NAME not defined"
#endif

#ifndef GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME
#error "GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME not defined"
#endif

#ifndef GROWABLE_ARRAY_ADD_FUNCTION_NAME
#error "GROWABLE_ARRAY_ADD_FUNCTION_NAME not defined"
#endif

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};

GROWABLE_ARRAY_STRUCT GROWABLE_ARRAY_EMPTY_FUNCTION_NAME() {
    GROWABLE_ARRAY_STRUCT empty = {
       .data = NULL,
       .allocated = 0,
       .size = 0
    };
    
    return empty;
}

void GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME(
    GROWABLE_ARRAY_STRUCT_POINTER array
) {
    #ifdef GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME
        for (size_t i = 0; i < array->size; i++) {
            GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME(array->data[i]);
        }
    #endif

    free(array->data);
}

void GROWABLE_ARRAY_ADD_FUNCTION_NAME(
    GROWABLE_ARRAY_STRUCT_POINTER array, 
    GROWABLE_ARRAY_DATA value
) {
    if (array->size + 1 > array->allocated) {
       size_t new_allocated;
       if (array->size == 0) {
           new_allocated = 2;
       }
       else {
           new_allocated = array->size * 2;
       }
       
       GROWABLE_ARRAY_DATA_POINTER new_data 
            = (GROWABLE_ARRAY_DATA_POINTER) calloc(sizeof(void*), new_allocated);
       GROWABLE_ARRAY_DATA_POINTER old_data = array->data;
       
       // Copy all the old elements to it
       if (old_data != NULL) {
           for (size_t i = 0; i < array->size; i++) {
               new_data[i] = old_data[i];
           }
       }
       
       free(old_data);

       array->data = new_data;
       array->allocated = new_allocated;
    }
    
    array->data[array->size] = value;
    array->size++;
}

All of which still needs to be handled in each specialization.

#ifndef GROWABLE_INT_ARRAY_H
#define GROWABLE_INT_ARRAY_H

#define GROWABLE_ARRAY_STRUCT struct GrowableIntArray
#define GROWABLE_ARRAY_STRUCT_POINTER struct GrowableIntArray*
#define GROWABLE_ARRAY_DATA int
#define GROWABLE_ARRAY_DATA_POINTER int*
#define GROWABLE_ARRAY_EMPTY_FUNCTION_NAME growable_int_array_empty
#define GROWABLE_ARRAY_ADD_FUNCTION_NAME growable_int_array_add
#define GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME growable_int_array_cleanup

#include "growable_array.h"

#undef GROWABLE_ARRAY_STRUCT
#undef GROWABLE_ARRAY_DATA_POINTER
#undef GROWABLE_ARRAY_DATA
#undef GROWABLE_ARRAY_DATA_POINTER
#undef GROWABLE_ARRAY_EMPTY_FUNCTION_NAME
#undef GROWABLE_ARRAY_ADD_FUNCTION_NAME
#undef GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME

#endif

While this is all technically less work than making all the logic for a growable array from scratch ten times, It's certainly not pretty.

If you've ever had the life lesson of working with C++ templates, this is the sort of thing that language feature is intended to replace.

template <typename T>
struct GrowableArray {
   T* data;
   size_t allocated;
   size_t size;  
}

If you haven't, don't get too excited. There lie demons also.

Conclusion

And that is basically it.

To grow an array you allocate a new array and copy data into it. To be efficient you allocate more memory than you need each time you grow.

If you want to make that data structure for more than one specific data type you either need to rely on runtime indirection and pointers or you need to dive into the C preprocessor and make template headers.

If you are a student who has a question you can ask below. You can find complete examples of all these approaches in this GitHub repo.

Corrections welcome.

Corrections

realloc

Instead of calloc for everything it is more efficient to use realloc. When malloc and co. give you a chunk of memory that memory might secretly be larger than you requested. If it is, you can avoid having to do much of the work of the allocator.

Efficient Runtime Generic Growable Arrays

One thing that was pointed out to me is that using a void** for the runtime is a naive strategy. We can avoid the memory indirection implied by having an array of pointers by storing the byte size of each element in the struct.

struct GrowableArray {
    void* data;
    size_t allocated;
    size_t size;
    void (*cleanup)(void*);
    size_t element_size;
}

Then when we allocate data we get void* instead of a void** for our storage. Functions like growable_array_get will still have to return a void* as a result, but those can be cast dereferenced. What is important is that the data behind the void* will have the ideal memory layout.

calloc behavior

Small but important in some contexts point: calloc doesn't give you "the zero" for every type. It does fill the memory with 0 bytes, but I have been informed that for floats and similar all zero bytes might not be a zero value.


<- Index