Just An Application

August 25, 2014

Anatomy Of A PDF: Part Seven — How To Overflow An Unsigned Integer Using Nothing But Bytes !

Filed under: BMP, BMP Run Length Encoding, Image Formats, Security — Tags: , , , — Simon Lewis @ 8:33 am

The theory of overflowing unsigned integers is, fortunately, well understood.

You fill them up with ‘F’s, you can use ‘f’s instead if you prefer, and then you add 1, like so.

    uint32_t i = 0xFFFFFFFF;

    i += 1;

and i is now zero.

Nothing to it.

Alternative combinations of ‘F’s not to mention other hex digits and increments are possible but its best to start off with the base case and make sure you’ve got the hang of it before moving on to the more advanced stuff.

One obvious application of this is when you have an excess of large unsigned integers and a shortage of zeroes. You can use this technique to turn the former into the latter.

Another application is this

    0000000 42 4d 00 00 00 00 00 00 00 00 00 00 00 00 40 00
    0000010 00 00 2e 01 00 00 01 00 00 00 01 00 08 00 01 00
    0000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00
    0000030 00 00 00 00 00 00 52 47 42 41 52 47 42 41 00 02
    0000040 ff 00 00 02 ff 00 00 02 ff 00 00 02 ff 00 00 02
    4040440 f8 00 00 08 01 00 00 00 00 00 27 05 00 02 00 ff
    4040450 00 02 00 ff 00 02 00 ff 00 02 00 ff 00 02 00 ff
    4040470 00 02 00 ff 00 0a 58 58 58 58 58 58 58 58 58 58

which is the really really big BMP that came free with the PDF.

The result of Base64 decoding the image data in the XML clocks in at 67,372,160 bytes, which is surprisingly large for an image which is only 302×1.

The reason that you can fit 67MB of data on nine lines of hexdump output is that the data is intensively repetitive.

The vast majority of the file comprises the four bytes

    00 02 ff 00

The first occurence is at 0x00003E and the last occurence ends at 0x404043E which gives a grand total of 0x4040400 bytes or 0x1010100 repetitions.

If you multiply 0xFF by 0x1010100 you get 0xFFFFFF00 which is a mere 0xFF+1 away from being 0 in certain circumstances.

This is interesting because that multiplication is equivalent to the effect of the four bytes

    00 02 ff 00

being repeated 16843008 times in the BMP image data.

As we know the header specifies that the image is run length encoded which means that the image data is a series of commands which are evaluated at runtime to produce the bytes that comprise the image.

The image is built up left to right, bottom to top as the commands are evaluated.

If the result of evaluating a command is a sequence of bytes, those bytes are appended at the offset within the current line specified by the current x position.

Both the current x position and the current y position, which specifies the current line, can be changed using a delta command.

The four bytes

    00 02 ff 00

are an example of a delta command the effect of which is to add 0xFF to the current x position.

If that command is successfully evaluated 16843008 times then you know

  1. that whatever is doing the evaluating is not doing any kind of bounds checking in this particular case, and

  2. that the current x position is now 0xFFFFFF00

The command at 0x404043E which immediately follows the repetitions is another delta command

    00 02 f8 00

except that this time it adds 0xF8 to the current x position which brings it up 0xFFFFFFf8 as well as creating the expectation that the next command is going to feature the number 8.

The next command is at 0x4040442 and it is this

    00 08 01 00 00 00 00 00 27 05

and lo and behold there is the number 8.

The effect of the command is to add the last eight bytes of the command as image data at the current x position

Clearly at this point it would be a very irresponsible of whatever it is that is evaluating the commands not to perform a bounds check, which means that we are going to discover exactly how the current x position is being represented.

The width and height fields of this particular BMP are signed 32-bit integers so it would not be unreasonable for the current x position to be represented using an unsigned 32 bit integer.

If this command succeeds than either

  • there is no bounds checking


  • the current x position is represented by the command evaluator as an unsigned 32 bit integer

Whichever it is, those eight bytes are going to end up somewhere where they are going to make no useful contribution to the image at all.

After all the excitement of adding some actual bytes, albeit in the wrong place, it all seems to go horribly wrong.

Starting at 0x404044C another delta command

    00 02 00 ff

which adds 0xFF to the current y position, is repeated ten times.

The final command at 0x4040474

    00 0a 58 58 58 58 58 58 58 58 58 58

then adds ten bytes at the current x position, except of course that it doesn’t.

The current y position is out of bounds, the image is supposed to be 302×1 after all, and nothing like large enough to benefit from any kind of unsigned integer overflow and that is intentional.

Causing the image load to fail results in the memory allocated for the image being freed.

If the eight bytes that actually got written have ended up in the right wrong place then presumably their effect is to cause the heap management code to free the wrong object at this point.

Copyright (c) 2014 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.


Blog at WordPress.com.

%d bloggers like this: