You spin me right round: why unsigned integer overflow is still relevant

There are some vulnerabilities that most developers have heard about at this point. SQL injection is a good example – everyone knows little Bobby Tables. But what if I told you there was an under-the-radar category of vulnerabilities that have been with us ever since the inception of programming languages, causing critical safety and security problems time and time again, with no universal solution on the horizon? The vulnerability in question is integer overflow.

While not nearly as dominant as SQL injection, integer overflow still boasts a respectable number of new vulnerabilities discovered each year – and more importantly, the CVSS scores tend to fall in the ‘high’ and ‘critical’ categories, indicating that the consequences of this vulnerability are quite dire indeed. Unlike SQL injection, this issue exists outside the security world as well. The integer overflow behind the 1996 disaster of the Ariane 5 rocket’s maiden flight is perhaps the worst example – it caused a loss of hundreds of millions of dollars and seriously harmed trust in the European Space Agency’s rockets for over a decade. In a less critical but certainly newsworthy context, the view count of the ‘Gangnam Style’ video on YouTube threatened to overflow the 32-bit view counter in 2014, forcing YouTube’s developers to rewrite the backend software on the fly. There are many more examples too, of course.

The cause of all these problems is deceptively simple – but ultimately unsolved in most platforms, even today. In this article, we’ll dive into the whys and wherefores.

Nothing to see here – the standard says so!

Whenever we use a fixed-width integer type to store a numeric value, we will have to ask ourselves: what happens if we perform an arithmetic operation, and its result doesn’t fit into that type? There is no way around it – we are going to trigger what is known as the integer overflow problem and the result is going to be incorrect in some way. This applies to 8-bit signed types just the same as 64-bit unsigned types.

Let’s say we are dealing with a 16-bit unsigned integer and we add 1 (0x0001) to 65535 (0xFFFF). The result (65536, or 0x10000) doesn’t fit into a 16-bit integer anymore, so it gets truncated to 16 bits, which gives us 0 (0x0000), and the CPU sets an appropriate flag (‘Carry’).

Similarly, if we’re dealing with a 16-bit signed integer and we add 1 to 32767 (0x7FFF), the result will be -32768 (0x8000 in two’s complement representation) and the CPU will set another flag (‘Overflow’). But let’s set signed integer overflow aside for later – in this article, we’ll be focusing on the unsigned variant (which is also called a wraparound).
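
In C, a minimal sketch of the unsigned case looks like this (note that the addition is actually performed in int because of integer promotion; the wraparound happens when the result is converted back to uint16_t):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t u = 65535;     /* 0xFFFF, the largest 16-bit unsigned value */
    u = (uint16_t)(u + 1);  /* 65536 doesn't fit, so it wraps around to 0 */
    printf("%u\n", u);      /* prints 0 */
    return 0;
}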

In theory, our compiled code could do something with these flags and handle integer overflow as an error or an exception – but unfortunately, that’s not how it works in practice. C compilers in particular just completely ignore these flags (somewhat justifiably, since the flags are architecture-specific), the integer overflow happens silently, and our code continues with the bad data. And it’s not the fault of the compilers, either.

If we look at the most recent C17/C18 standard (though this particular part has basically not changed since ANSI C), section 6.2.5 says that “A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.” This means that such an overflow in the code will just quietly wrap the result around and give us a wrong value without even telling us about it!

A recent integer overflow example: the consequences of an evil tiled TIFF image

We can just take a recent critical vulnerability as an example: CVE-2022-3970, an unsigned integer overflow in LibTIFF (a library dealing with images that use the Tag Image File Format, especially popular in mapping and geographical use cases). Let’s look at the source code of tif_getimage.c, and specifically TIFFReadRGBATileExt().

int TIFFReadRGBATileExt(TIFF* tif, uint32_t col, uint32_t row, uint32_t * raster, int stop_on_error )
{
    char     emsg[1024] = "";
    TIFFRGBAImage img;
    int      ok;
    uint32_t tile_xsize, tile_ysize;
    uint32_t read_xsize, read_ysize;
    uint32_t i_row;
    ...

The vulnerability is at the end of the function, in the special case where a full tile’s worth of data could not be read and the buffer contents have to be shifted into place.

    ...
    /*
     * If our read was incomplete we will need to fix up the tile by
     * shifting the data around as if a full tile of data is being returned.
     *
     * This is all the more complicated because the image is organized in
     * bottom to top format.
     */

    if( read_xsize == tile_xsize && read_ysize == tile_ysize )
        return( ok );

    for( i_row = 0; i_row < read_ysize; i_row++ ) {
        memmove( raster + (tile_ysize - i_row - 1) * tile_xsize,
                 raster + (read_ysize - i_row - 1) * read_xsize,
                 read_xsize * sizeof(uint32_t) );
        _TIFFmemset( raster + (tile_ysize - i_row - 1) * tile_xsize+read_xsize,
                     0, sizeof(uint32_t) * (tile_xsize - read_xsize) );
    }

    for( i_row = read_ysize; i_row < tile_ysize; i_row++ ) {
        _TIFFmemset( raster + (tile_ysize - i_row - 1) * tile_xsize,
                     0, sizeof(uint32_t) * tile_xsize );
    }
    return (ok);
}

The culprits are the various memory manipulation operations that use dangerously calculated offsets into the raster buffer – for example, the first memmove() call in the loop above. In the example image used in the disclosure of the vulnerability, tile_xsize was 926430463 and tile_ysize was 32. Thus, as long as tile_xsize was multiplied by a value of at least 5 in the loop (which would happen multiple times, since tile_ysize - i_row - 1 >= 5 until the 27th iteration), the 32-bit multiplication would overflow, wrap around, and cause an erroneous memmove() operation on a huge number of bytes (note that the third parameter of memmove() did not overflow, since 926430463 * sizeof(uint32_t) = 3705721852 < UINT_MAX).

Of course, at this point we are still pretty far from achieving code execution, but it’s not surprising that NVD gave this vulnerability a 9.8 CVSS score – any vulnerability that can corrupt memory has the potential to turn into an RCE vulnerability, and in a widely used image processing library, that could easily become a disaster! Thankfully, in this case the issue was fixed before bad actors could exploit it in the wild.
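
To make these numbers concrete, here’s a small standalone calculation (using values from the disclosure) that compares the wrapped 32-bit offset from the first loop iteration against the offset the code actually intended to compute:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t tile_xsize = 926430463; /* from the proof-of-concept image */
    uint32_t tile_ysize = 32;
    uint32_t i_row = 0;              /* first iteration of the loop */

    /* Performed in 32 bits, exactly like the vulnerable code: wraps around. */
    uint32_t wrapped = (tile_ysize - i_row - 1) * tile_xsize;
    /* Widening one operand first yields the mathematically correct offset. */
    uint64_t intended = (uint64_t)(tile_ysize - i_row - 1) * tile_xsize;

    printf("wrapped offset:  %" PRIu32 "\n", wrapped);   /* 2949540577 */
    printf("intended offset: %" PRIu64 "\n", intended);  /* 28719344353 */
    return 0;
}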

How to wrangle your (unsigned) integers

Integer overflow problems, especially the unsigned variant, are notoriously difficult to deal with in C despite decades of work on solutions. Of course, it is always possible to write overflow-proof code by doing manual pre- or post-condition checks around each integer operation, but retrofitting an entire codebase like that is probably more trouble than it’s worth.
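
For illustration, a manual precondition check for an unsigned 32-bit multiplication might look like this (the helper name is ours, not taken from any library):

#include <stdint.h>

/* Returns 1 and stores a * b in *result if the product fits into 32 bits;
   returns 0 without touching *result if the multiplication would overflow. */
int checked_mul_u32(uint32_t a, uint32_t b, uint32_t *result)
{
    if (b != 0 && a > UINT32_MAX / b)
        return 0; /* a * b would exceed UINT32_MAX */
    *result = a * b;
    return 1;
}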


Some compilers implement functions that perform overflow checking – for example, bool __builtin_add_overflow(type1 a, type2 b, type3 *c) and similar functions in GCC internally perform the operation at infinite precision, store the truncated result through the pointer, and return true if an overflow occurred. The situation is much better in C++ with overflow-proof classes such as SafeInt and Boost’s Safe Numerics, but those rely on tools we don’t have in C: operator overloading, class templates and exceptions. However, for a platform-independent alternative to GCC’s builtins, the SafeInt C++ class library also contains safe_math.h for C programs, which provides safe implementations of basic arithmetic, e.g. via int32_t safe_add_uint32_int64(uint32_t a, int64_t b).
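
The same builtin family also includes __builtin_mul_overflow(), which maps directly onto the kind of multiplication that overflowed in LibTIFF. A minimal sketch (with values borrowed from the CVE) of guarding an offset calculation with it:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t rows = 31;              /* tile_ysize - i_row - 1 */
    uint32_t tile_xsize = 926430463;
    uint32_t offset;

    /* Returns true if the mathematically correct product doesn't fit. */
    if (__builtin_mul_overflow(rows, tile_xsize, &offset)) {
        fprintf(stderr, "offset calculation would overflow, rejecting input\n");
        return 1;
    }
    printf("offset = %u\n", offset);
    return 0;
}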

There are also quick-and-dirty solutions – and that’s actually what the developers of LibTIFF applied in this example: just do the operations in a larger integer type (in this case, size_t) where they can’t overflow – at least on platforms where size_t is 64 bits wide.

for( i_row = 0; i_row < read_ysize; i_row++ ) {
    memmove( raster + (size_t)(tile_ysize - i_row - 1) * tile_xsize,
             raster + (size_t)(read_ysize - i_row - 1) * read_xsize,
             read_xsize * sizeof(uint32_t) );
    _TIFFmemset( raster + (size_t)(tile_ysize - i_row - 1) * tile_xsize+read_xsize,
                 0, sizeof(uint32_t) * (tile_xsize - read_xsize) );
}

for( i_row = read_ysize; i_row < tile_ysize; i_row++ ) {
    _TIFFmemset( raster + (size_t)(tile_ysize - i_row - 1) * tile_xsize,
                 0, sizeof(uint32_t) * tile_xsize );
}

This can work in some cases, though not necessarily for every operation or combination of operations – exponentiation comes to mind – and is a bit more resource-hungry than using safe integer functions. Using arbitrary-precision arithmetic (‘big integers’) can also work, though it is even more expensive.
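
As a sketch of the big-integer route (assuming the GMP library is installed; build with -lgmp), the same multiplication can be carried out at arbitrary precision and range-checked before converting back to a machine type:

#include <gmp.h>
#include <stdio.h>

int main(void)
{
    mpz_t result;
    mpz_init_set_ui(result, 926430463);
    mpz_mul_ui(result, result, 31);    /* exact product, no wraparound */

    if (mpz_fits_uint_p(result))
        printf("fits: %lu\n", mpz_get_ui(result));
    else
        gmp_printf("doesn't fit in unsigned int: %Zd\n", result);

    mpz_clear(result);
    return 0;
}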

As for finding these overflows, a combination of fuzzing and dynamic testing is quite effective – in particular, UBSan (the Undefined Behavior Sanitizer) can find many different types of integer overflow. Note, though, that since unsigned wraparound is well-defined behavior rather than undefined behavior, detecting it requires a dedicated check (-fsanitize=unsigned-integer-overflow in Clang) that is not available on every compiler. This bug, like many other similar vulnerabilities, was found via fuzzing as part of Google’s OSS-Fuzz project in combination with the aforementioned UBSan checker.
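
For example, compiling a distilled version of the bug with that Clang-specific check makes the wraparound visible at runtime (a minimal sketch):

/* wrap.c - build and run with:
     clang -fsanitize=unsigned-integer-overflow wrap.c -o wrap && ./wrap
   UBSan then reports the multiplication below as an unsigned overflow. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* volatile keeps the compiler from folding the product at compile time */
    volatile uint32_t rows = 31, tile_xsize = 926430463;
    printf("%u\n", rows * tile_xsize);
    return 0;
}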

Following secure coding best practices can prevent integer overflow vulnerabilities. Learn more about input validation, the subtypes of integer overflow and truncation vulnerabilities, and most importantly, the best practices to deal with them – to make sure the next world-shaking vulnerability doesn’t pop up in your code!