It generates much more code, and much slower code (and more fragile code), than just using a fixed key size would have done ~ Linus Torvalds

VLA is an acronym of variable-length array, which is an array (actual array, not just block of memory which can act like one) that have size (at least one dimension) determined during runtime (instead of at compile time).

VLAs were introduced with the revision C99 of the C standard. At first glance they seem convenient and efficient, but it's just an illusion. In reality they are just sources of constant issues. Truly a stain on otherwise really good revision.

Allocation on stack

VLAs are usually allocated on stack and this is the source of the most of the problems. Let's consider a painfully simple, very favourable to VLA, example:

#include <stdio.h>

int main(void) {
    int n;
    scanf("%d", &n);
    long double arr[n];
    printf("%Lf", arr[0]);
    return 0;
}

As we can see, it takes a number from user then makes array of that size. Compile and try it. Check how big values you can input before getting segfault caused by stack overflow. In my case, it was around half a million. With primitive type! Imagine what would be the limit for structure! Or what if it wasn't just main()? Maybe a recursive function? The limit shrinks tremendously.

And you don't have any (portable, standard) way to react after a stack overflow - the program already crashed, you lost control. So you either need to make elaborate checks before declaring an array or betting that user won't input too large values (I think we can already guess the outcome of such gamble).

So the programmer must ensure that VLA size doesn't exceed some safe maximum, but in reality, if you know safe maximum, there is rarely any reason for not using it always.

Worst of it is…

… that segfault is actually one of the best outcomes of improperly handled VLA. The worst case is an exploitable vulnerability, where attacker may choose a value that causes an array to overlap with other allocations, giving them control over those values as well. A security nightmare.

So how to fix this example?

What if I need to let user define size and creating ridiculously large fixed array would be too wasteful? It's simple - use malloc()!

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n;
    scanf("%d", &n);
    long double* arr = malloc(n * sizeof(*arr));
    printf("%Lf", arr[0]);
    free(arr);
    return 0;
}

In this case I was able to input over 1.3 billion on my machine before segfault. Almost 2500 times larger size! But I still got the segfault, right? Well, the difference is in possibility of checking the value returned by malloc() and thus being able to, for example, inform the user about the error:

    long double* arr = malloc(n * sizeof(*arr));
    if (arr == NULL) {
        perror("malloc()"); // output: "malloc(): Cannot allocate memory"
    }

Creation by accident

Unlike most other dangerous C functionality, VLA doesn't have the barrier of being not known. Many newbies learn to use them via trial and error, but don't learn about the pitfalls. Sometimes even an experiences programmer will make an mistake and create an VLA when not intended. The following will silently create a VLA when it's clearly not necessary:

const int n = 10;
int A[n];

Thankfully, any half-decent compiler would notice and optimize VLA away, but… what if it doesn't notice? Or what if for some reason (safety?) the optimizations were not turned on? But it surely isn't so much worse, right? Well…

Slower than fixed size

Without compiler optimizations a function with VLA from previous example will result in 7 times more Assembly instructions than its fixed size counterpart before moving past the array definition (look at the body before jmp .L5). But it's without optimizations - with them the produced Assembly is exactly the same.

So an example where VLA isn't by mistake:

#include <stdio.h>
void bar(int*, int);

#if 1 // 1 for VLA, 0 for VLA-free

void foo(int n) {
    int A[n];
    for (int i = n; i--;) {
        scanf("%d", &A[i]);
    }
    bar(A, n);
}

#else

void foo(int n) {
    int A[1000];  // Let's make it bigger than 10! (or there won't be what to examine)
    for (int i = n; i--;) {
        scanf("%d", &A[i]);
    }
    bar(A, n);
}

#endif

int main(void) {
    foo(10);
    return 0;
}

void bar(int* B, int n) {
    for (int i = n; i--;) {
        printf("%d %d", i, B[i]);
    }
}

For our educational purposes in this example, -O1 level of optimisation will work best (as Assembly will be more clear and -O2 won't help VLA's case here really much).

When we compile VLA version, before instructions corresponding to for loop, we get:

push    rbp
mov     rbp, rsp
push    r14
push    r13
push    r12
push    rbx
mov     r13d, edi
movsx   r12, edi       ; here VLA "starts"...
sal     r12, 2         ;
lea     rax, [r12+15]  ;
and     rax, -16       ;
sub     rsp, rax       ;
mov     r14, rsp       ; ... and there "ends"

VLA-free version on the other hand generates:

push    r12
push    rbp
push    rbx
sub     rsp, 4000      ; this is caused by array definition
mov     r12d, edi

So not only fixed array spawns less code, but also way simpler code. Why, VLA even causes more overhead at the beginning of the function. It's not so much more in the grand scheme of things, but it still isn't just a pointer bump.

But are those differences significant enough to care? Yes, they are.

No initialization

To add more to the issue with inadvertent VLA, the following isn't allowed:

int n = 10;
int A[n] = { 0 };

Even with optimizations, initialisation isn't allowed for VLAs. So despite wanting fixed size array and compiler being technically able to provide one, it's won't work.

Mess for compiler writers

Few months ago I saved a comment on Reddit listing problems encountered with VLA from compiler writer perspective. I will cite it:

  • A VLA applies to a type, not an actual array. So you can create a typedef of a VLA type, which "freezes" the value of the expression used, even if elements of that expression change at the time the VLA type is applied
  • VLAs can occur inside blocks, and inside loops. This means allocating and deallocating variable-sized data on the stack, and either screwing up all the offsets, or needing to do things indirectly via pointers.
  • You can use goto into and out of blocks with active VLAs, with some things restricted and some not, but the compiler needs to keep track of the mess.
  • VLAs can be used with multi-dimensional arrays.
  • VLAs can be used as pointer targets (so no allocation is done, but it still needs to keep track of the variable size).
  • Some compilers allow VLAs inside structure definitions (I really have no idea how that works, or at what point the VLA size is frozen, so that all instances have the same VLA(s) sizes.)
  • A function can have dozens of VLAs active at any one time, with some being created or destroyed at different times, or conditionally, or in loops.
  • sizeof needs to be specially implemented for VLAs, and all the necessary info (for actual VLAs, VLA-types, and hybrid VLA/fixed-size types and arrays and pointed-to VLAs).
  • 'VLA' is also the term used for multi-dimensional array parameters, where the dimensions are passed by other parameters.
  • On Windows, with some compilers (GCC at least), declaring local arrays which make the stack frame size over 4 KiB, mean calling a special allocator (__chkstk()), as the stack can only grow a page at a time. When a VLA is declared, since the compiler doesn't know the size, it needs to call __chkstk for every such function, even if the size turns out to be small.

And believe me, if you take a stroll around some C forums you will see even more different complaints.

Reduced portability

Due to all previously presented problems, some compiler providers decided to not fully support C99. The primary example is Microsoft with its MSVC. The C Standard's Committee also noticed the problem and with C11 revision VLAs were made optional (many would prefer deprecated).

That means code using a VLA won't necessarily be compiled by a C11 compiler, so you need to check whether it is supported with __STDC_NO_VLA__ macro and make version without VLA as fallback. Wait… if you need to implement VLA-free version either way then what's the point of doubling the code and creating VLA in the first place?!

(nitpick) Breaking conventions

This one is more of a nitpick, but still another reason to dislike VLA. There is a widely used convention of first passing object then its parameters, what in terms of arrays means:

void foo(int** arr, int n, int m) { /* arr[i][j] = ... */ }

C99 specified that array sizes need to be parsed immediately when encountered within a function definition's parameter list, what means that when using VLA you cannot do an equivalent of the above:

void foo(int arr[n][m], int n, int m) { /* arr[i][j] = ... */ } // INVALID!

You either need to:

  • break up with the convention:
    void foo(int n, int m, int arr[n][m]) { /* arr[i][j] = ... */ }
    
  • or make use of the obsolescent syntax:
    void foo(int[*][*], int, int);
    void foo(arr, n, n)
        int n;
        int m;
        int arr[n][m]
    {
        // arr[i][j] = ...
    }
    

… kinda useful in one case

There is one use case where VLA could come in handy: dynamically allocating multi-dimensional arrays where the inner dimensions are not known until runtime. It isn't even so unsafe as there's no arbitrary stack allocation.

int (* A)[m] = malloc(n * sizeof(*A)); // `n` and `m` are variables with dimensions
if (A) {
    // A[i][j] = ...;
    free(A);
}

The VLA-free alternatives are either:

  • piecemeal allocation with malloc()
    int** A = malloc(n * sizeof(*A));
    if (A) {
        for (int i = 0; i < m; ++i) {
            A[i] = malloc(m * sizeof(*A[i]));
        }
        // A[i][j] = ...
        for (int i = 0; i < m; ++i) {
            free(A[i]);
        }
        free(A);
    }
    
  • 1D array with offsets
    int* A = malloc(n * m * sizeof(*A));
    if (A) {
        // A[i*n + j] = ...
        free(A);
    }
    
  • big fixed array
    int A[SAFE_SIZE][SAFE_SIZE]; // SAFE_SIZE must be safe for SAFE_SIZE*SAFE_SIZE
    // A[i][j] = ...;
    

Conclusion

In short, avoid VLA. It poses dangers without giving anything really useful in return (beside in one situation). If you really need to use one, remember about its limitations.

And for the very end, an example of vla lacking all those problems: Chocolate vla