Just An Application

August 28, 2014

Anatomy Of A PDF Continued: #4 — Part Two: What Is It With This Obfuscation Lark ?

Javascript Obfuscation Considered …

What IS the point of doing this

    ...
    
    var sekritVar0001 = [0x30,0x74,0x67,0x72,0x6E,0x63,0x65,0x67, 1, 10, 40];
    
    ...

and then this somewhere else ?

    ...
    
    for(var q = 0; q < sekritVar0001.length-3; q++)
    sekritVar0002 += String.fromCharCode(sekritVar0001[q]-2);

    ...

Anybody who has managed to lever open a malformed PDF and get to the point where they can actually see it is hardly likely to give up at this juncture just because they are going to have to iterate over an array, subtract 2 from every element, and turn each one into a character, now are they ?

It says

    .replace

for pity’s sake.
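
For anyone who would rather not do the subtraction in their head, the loop can be replayed on its own. This is only a sketch: the array is the one from the snippet above, the names `encoded` and `decoded` are mine rather than the PDF’s, and it assumes the elided code starts the accumulator off as an empty string.

```javascript
// The array as it appears in the obfuscated script.
const encoded = [0x30, 0x74, 0x67, 0x72, 0x6E, 0x63, 0x65, 0x67, 1, 10, 40];

// Assumption: the accumulator starts out as an empty string.
let decoded = "";

// The last three elements are skipped, exactly as in the original loop.
for (let q = 0; q < encoded.length - 3; q++) {
    decoded += String.fromCharCode(encoded[q] - 2);
}

console.log(decoded); // prints .replace
```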

And then this.

    ...
    
    var p1 = "(\/[^\\/";
    
    ...
    
    var p2 = "(\/[\\/";
    var sekritVar0003 = "x" + sekritVar0002 + p1 + "\\d]\/g,'')";
    var sekritVar0004 = "z" + sekritVar0002 + p2 + "]\/g,',')";
    
    ...

Oh no the regular expressions are in two halves ! Oh woe is me !

The first one says strip out everything that isn’t a digit or the character ‘/’.

The second one says turn all the ‘/’s into ‘,’s.

If you apply the second to the result of the first you end up with a lot of comma-separated integers, and here’s the start of some non-Base64 encoded image data.

    fpo10/t10/hmA32/Ac32/nCK32/XX32/yCO32/R32 ...

What a coincidence.
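
Gluing the halves back together is easy enough to check. A minimal sketch, assuming the recovered method name is `.replace` as worked out above; the names `method`, `first` and `second` are mine, and the input is only the prefix of the image data quoted above, not the whole thing.

```javascript
// The method name recovered from the character-code array.
const method = ".replace";

// The two halves of each pattern, copied from the snippet above.
const p1 = "(\/[^\\/";
const p2 = "(\/[\\/";

const first  = "x" + method + p1 + "\\d]\/g,'')";
const second = "z" + method + p2 + "]\/g,',')";

console.log(first);  // prints x.replace(/[^\/\d]/g,'')
console.log(second); // prints z.replace(/[\/]/g,',')

// The same two steps written out directly and applied to the quoted prefix.
const x = "fpo10/t10/hmA32/Ac32/nCK32/XX32/yCO32/R32";
const z = x.replace(/[^\/\d]/g, "").replace(/[\/]/g, ",");

console.log(z); // prints 10,10,32,32,32,32,32,32
```

For what it is worth, 10 is a newline and 32 is a space, which is not an unreasonable way for a script to start.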

As for this

    ...
    
    function sekritFun0001(x)
    {
        var s = [];
    
        var z = sekritFun0002(sekritVar0003);
        z = sekritFun0002(sekritVar0004);
    
        var ar = sekritFun0002("[" + z + "]");
    
        for (var i = 0; i < ar.length; i ++)
        {
            var j = ar[i];
            if ((j >= 33) && (j <= 126))
            {
                s[i] = String.fromCharCode(33 + ((j + 14) % 94));
            }
            else
            {
                s[i] = String.fromCharCode(j);
            }
        }
        return s.join('');
    }
    
    ...

A variable z, which we have worked out is a string that looks like a comma-separated list of integers, is topped and tailed with what look suspiciously like the delimiters of a Javascript array literal. It is then passed to sekritFun0002 and the result is assigned to the variable ar.

Then there is a for loop which is under the mistaken impression that ar references an array.

Now let me guess. The function sekritFun0002 is really the well-known turn-strings-into-arrays-as-if-by-magic function, for which Javascript fortunately defines a four-letter abbreviation.
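
When replaying this outside a PDF reader there is no need to hand anything to that function. A sketch, assuming the string really does contain nothing but digits and commas by this point: `JSON.parse` will build the array without executing anything. The sample value of `z` is what the two replacements produce from the prefix of the image data quoted earlier.

```javascript
// What z looks like after both replacements have been applied
// to the quoted prefix of the image data.
const z = "10,10,32,32,32,32,32,32";

// JSON.parse builds the array without evaluating arbitrary code.
// Unlike the function the script uses, it rejects empty elements
// and trailing commas, so it is only a stand-in for clean input.
const ar = JSON.parse("[" + z + "]");

console.log(Array.isArray(ar)); // prints true
console.log(ar.length);         // prints 8
```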

Then there is the body of the loop.

To cut a long story short, here is some Java code, written in the increasingly popular no-regexp idiom, which does the same thing as the Javascript function

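    private static String decode(byte[] theBytes)
    {
        StringBuilder builder   = new StringBuilder();
        int           nBytes    = theBytes.length;
        int           v         = 0;
        boolean       hasDigits = false;

        for (int i = 0; i < nBytes; i++)
        {
            byte b = theBytes[i];

            if (b == '/')
            {
                // end of number
                builder.append(rotate(v));
                v         = 0;
                hasDigits = false;
            }
            else
            if ((b >= '0') && (b <= '9'))
            {
                v *= 10;
                v += b - '0';
                hasDigits = true;
            }
            // anything else is padding and is skipped
        }
        if (hasDigits)
        {
            // a final number with no '/' after it still counts
            builder.append(rotate(v));
        }
        return builder.toString();
    }

    // Codes 33 to 126 are rotated, everything else is passed through as is
    private static char rotate(int v)
    {
        if ((v >= 33) && (v <= 126))
        {
            return (char)(33 + ((v + 14) % 94));
        }
        return (char)v;
    }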

If you run it on the two non-images, needless to say, you get Javascript.
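
The arithmetic in the loop body, 33 + ((j + 14) % 94), is the rotation usually called ROT47: every printable ASCII code from 33 to 126 moves 47 places round a ring of 94 and everything else is left alone, so applying it twice gets you back where you started. Here is a minimal sketch of the same decoder in Javascript with nothing evaluated. The function names and the sample input are made up for illustration and do not come from the PDF.

```javascript
// Rotate one character code; codes outside 33..126 pass through unchanged.
function rotate(j) {
    return (j >= 33 && j <= 126) ? 33 + ((j + 14) % 94) : j;
}

// Decode '/'-separated numbers, ignoring the padding characters around them.
// Empty segments are simply dropped here, which is a simplification.
function decode(data) {
    return data
        .replace(/[^\/\d]/g, "")
        .split("/")
        .filter(function (n) { return n.length > 0; })
        .map(function (n) { return String.fromCharCode(rotate(Number(n))); })
        .join("");
}

// Hypothetical input: 71, 50 and 67 are the rotated codes for 'v', 'a' and 'r'.
console.log(decode("ab71/c50/Zq67")); // prints var

// The rotation undoes itself.
console.log(rotate(rotate(118)) === 118); // prints true
```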

… Pointless ?

If the point of the obfuscation is that re-obfuscation will result in the hash of the PDF changing then, as I have already observed, there are much, much easier ways to achieve the same effect.

If the point of the obfuscation is to conceal the existence of certain strings that could be used to identify the PDF as malicious, then it is only necessary on the assumption that something has

  1. parsed the PDF file

  2. extracted Object 1 into a usable state, which so far has always required inflating it twice

  3. parsed the XML of the XFA form and identified the different elements

After step 1 the structure of the PDF is apparent and can be considered characteristic of this particular PDF.

After step 2 the ratio of the original to the final size of Object 1 is apparent and can also be considered to be characteristic of this particular PDF.
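
The size ratio from step 2 costs next to nothing to compute. A rough sketch using Node’s `zlib` module, assuming that both layers are plain zlib streams, which is my assumption and not something established above. The payload is a made-up stand-in for the contents of Object 1, not data from the PDF.

```javascript
const zlib = require("zlib");

// Stand-in for the usable contents of Object 1; the real thing is XML.
const original = Buffer.from("<image>" + "0123456789/".repeat(2000) + "</image>");

// Stand-in for what is stored in the file: the payload deflated twice.
const stored = zlib.deflateSync(zlib.deflateSync(original));

// Inflate twice to get back to a usable state.
const usable = zlib.inflateSync(zlib.inflateSync(stored));

// Ratio of the stored size to the final size.
const ratio = stored.length / usable.length;

console.log(usable.equals(original)); // prints true
console.log(ratio < 1);               // prints true for this sample
```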

After step 3 another characteristic of this particular PDF is apparent without even looking at the Javascript elements.

To misquote Mae West: is that an image in your form, or are you just pleased to see me ?

The content of the form is dominated by one thing, an image. To all intents and purposes that’s all it is.

If whatever it is has got this far it can be about 99% certain that it is looking at an instance of this particular malicious PDF.

It can increase the certainty still further by inspecting the image without even decoding it.

In short, unless you can conceal the elephant in the form, there is no need to obfuscate the Javascript other than for amusement.

Postscript (Not The Language)

Even if it was a valid PDF I don’t think this one would do what it was intended to do either.


Copyright (c) 2014 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.

Anatomy Of A PDF Continued: #4 — Part One: Now What ?

My collection of a single PDF continues to grow apace whether I want it to or not.

I got three ‘invoices’ in one go the other day but they were ZIPs, and ZIPs usually mean .exes, and so it proved.

Then today a PDF arrived which not only supposedly originated in a completely different continent to the other three, but is four times as big as the others. That has got to be good hasn’t it ?

It turned out to be a bit of a disappointment taken as a PDF because technically it isn’t one. I know I got it for nothing and everything, but honestly it comes to something when people can’t even be bothered to get the format right.

Still, what’s the point of writing a whacking great chunk of Objective-C to read well-formed PDFs if you can’t hack lumps out of it until it can read things that are not well-formed PDFs ? After some judicious hard-wiring of this and that I managed to extract Object 1 once again and inflate it twice as per usual.

And ?

And the XML is exactly the same as all the others ?

Yes and no. The overall structure is pretty much the same but the data for the first two images isn’t.

It does not look like Base64 encoded data and Base64 decoding it definitely does not produce Javascript.

A quick look for the Base64 alphabet construction kit reveals that it has gone walkabout, but both images are referenced in the Javascript, so the supposition has got to be that it is still really Javascript but it’s not Base64 encoded. It’s a bit of a disappointment but you have got to move with the times I suppose. Base64 encoding was so last week.

Time to find out what this week’s fashionable Javascript encoding technology is, I suppose.



August 26, 2014

Anatomy Of A PDF: Afterword

Filed under: CVE, CVE-2013-2729, PDF, PDF Vulnerability, Security. Posted by Simon Lewis @ 7:14 am

Since I started writing these posts anonymous benefactors have very kindly presented me with two further versions of the original PDF to add to my collection.

I say versions because although they both possess exactly the same structure as the original, the size of Object 1 is slightly different in each one and the binary sludge is different.

This of course means that the hash of the file will be different in each case, which in turn means it is very likely that any hash based AV scanner will miss these slightly different versions unless they are kept updated with the hashes of these new versions as they appear.
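
To see how little it takes, here is a sketch using Node’s `crypto` module. The two buffers are made-up stand-ins that differ in a single byte, and SHA-256 is just my choice of hash for the example; I am not claiming it is what any particular scanner uses.

```javascript
const crypto = require("crypto");

// Hex digest of a buffer.
function sha256(bytes) {
    return crypto.createHash("sha256").update(bytes).digest("hex");
}

// Two hypothetical files that differ in a single byte.
const versionA = Buffer.from("%PDF-1.5 binary sludge 0001");
const versionB = Buffer.from("%PDF-1.5 binary sludge 0002");

console.log(sha256(versionA) === sha256(versionB)); // prints false
```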

Looking at the actual XML it is apparent that the obfuscation of the Javascript has resulted in different variable names in each version but that there is no difference between what is obfuscated and what is not in any of them.

One additional thing all three versions have in common is that I suspect they won’t actually work.

There is no question what they are trying to do and how they are trying to do it, and at least one version that works has been seen in the wild by someone else, but it is a distinct possibility that the versions I have do not.

I have no way of proving this one way or another as I do not have access to the appropriate environment.

If I am wrong and they will in fact do what they are intended to do I would obviously be interested in knowing why I am wrong. It is all grist to the mill.

If I am right then it is all to the good: at least these particular versions cannot cause any damage. For obvious reasons I am not going to say why they will not work as intended, other than that it is a very simple mistake.

