Just An Application

August 28, 2014

Anatomy Of A PDF Continued: #4 — Part Two: What Is It With This Obfuscation Lark ?

Javascript Obfuscation Considered …

What IS the point of doing this

    var sekritVar0001 = [0x30,0x74,0x67,0x72,0x6E,0x63,0x65,0x67, 1, 10, 40];

and then this somewhere else ?

    for(var q = 0; q < sekritVar0001.length-3; q++)
        sekritVar0002 += String.fromCharCode(sekritVar0001[q]-2);


Anybody who has managed to lever open a malformed PDF and get to the point where they can actually see this is hardly likely to give up at this juncture just because they are going to have to iterate over an array, subtract 2 from every element, and turn it into a character, now are they ?

It says

    .replace

for pity’s sake.
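
For anyone who would rather not do the subtraction by hand, here is a minimal sketch of the same loop run on its own. The array is copied from above and the three trailing elements are skipped just as the original loop skips them. The name `decoded` is made up for the example and stands in for sekritVar0002.

```javascript
// Array copied from the obfuscated script; the last three elements
// (1, 10, 40) are never read by the loop.
var sekritVar0001 = [0x30,0x74,0x67,0x72,0x6E,0x63,0x65,0x67, 1, 10, 40];

// 'decoded' is a hypothetical name standing in for sekritVar0002.
var decoded = "";
for (var q = 0; q < sekritVar0001.length - 3; q++) {
    // Subtract 2 from each element and turn it into a character.
    decoded += String.fromCharCode(sekritVar0001[q] - 2);
}

console.log(decoded); // prints ".replace"
```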

And then this.

    var p1 = "(\/[^\\/";
    var p2 = "(\/[\\/";
    var sekritVar0003 = "x" + sekritVar0002 + p1 + "\\d]\/g,'')";
    var sekritVar0004 = "z" + sekritVar0002 + p2 + "]\/g,',')";

Oh no the regular expressions are in two halves ! Oh woe is me !

The first one says strip out everything that isn’t a digit or the character ‘/’.

The second one says turn all the ‘/’s into ‘,’s.

If you apply the second to the result of the first you end up with a lot of comma separated integers, and here’s the start of some non-base64 encoded image data.

    fpo10/t10/hmA32/Ac32/nCK32/XX32/yCO32/R32 ...

What a coincidence.
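
As a rough sketch, here are the two expressions written out as ordinary regular expression literals and applied to the visible start of that data. The variable names are made up for the example, and only the characters quoted above are used because the rest is elided.

```javascript
// The visible start of the "image" data quoted above.
var data = "fpo10/t10/hmA32/Ac32/nCK32/XX32/yCO32/R32";

// First expression: remove everything that is not a digit or '/'.
var digitsAndSlashes = data.replace(/[^\/\d]/g, '');

// Second expression: turn every '/' into ','.
var commaSeparated = digitsAndSlashes.replace(/[\/]/g, ',');

console.log(digitsAndSlashes); // prints "10/10/32/32/32/32/32/32"
console.log(commaSeparated);   // prints "10,10,32,32,32,32,32,32"
```

Character code 10 is a line feed and 32 is a space, so the start of the result looks a lot more like the start of a script than the start of an image.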

As for this

    function sekritFun0001(x) {
        var s = [];
        var z = sekritFun0002(sekritVar0003);
        z = sekritFun0002(sekritVar0004);
        var ar = sekritFun0002("[" + z + "]");
        for (var i = 0; i < ar.length; i ++) {
            var j = ar[i];
            if ((j >= 33) && (j <= 126)) {
                s[i] = String.fromCharCode(33 + ((j + 14) % 94));
            } else {
                s[i] = String.fromCharCode(j);
            }
        }
        return s.join('');
    }

A variable z, which we have worked out is a string that looks like a comma separated list of integers, is topped and tailed with what look suspiciously like the delimiters of a Javascript array literal. It is then passed to sekritFun0002 and the result is assigned to the variable ar.

Then there is a for loop which is under the mistaken impression that ar references an array.

Now let me guess. The function sekritFun0002 is really the well known turn-strings-into-arrays-as-if-by-magic function for which Javascript fortunately defines a four letter abbreviation.
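
If that guess is right and sekritFun0002 is eval, the step amounts to something like the following sketch. The sample string is invented for the example and is not taken from the PDF.

```javascript
// A made-up sample of the comma separated integers.
var z = "10,10,32,72,105";

// Wrapping the string in [ and ] makes it the text of an array literal,
// which eval then evaluates to an actual array of numbers.
var ar = eval("[" + z + "]");

console.log(Array.isArray(ar)); // prints true
console.log(ar.length);         // prints 5
```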

Then there is the body of the loop.

To cut a long story short here is some Java code which does the same thing as the Javascript function written in the increasingly popular no-regexp idiom

    private static String decode(byte[] theBytes) {
        StringBuilder builder = new StringBuilder();
        int           nBytes  = theBytes.length;
        int           v       = 0;

        for (int i = 0; i < nBytes; i++) {
            byte b = theBytes[i];
            if (b == '/') {
                // end of number
                int c = 0;
                if ((v >= 33) && (v <= 126)) {
                    c = 33 + ((v + 14) % 94);
                } else {
                    c = v;
                }
                // without this append the builder would stay empty
                builder.append((char)c);
                v = 0;
            } else if ((b >= '0') && (b <= '9')) {
                // accumulate the next decimal digit of the current number
                v *= 10;
                v += b - '0';
            }
        }
        return builder.toString();
    }

If you run it on the two non-images, needless to say, you get Javascript.
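
The arithmetic in the loop, 33 + ((v + 14) % 94), is the rotation commonly called ROT47: the 94 printable ASCII characters from 33 to 126 are rotated by 47 places, and since 14 is 47 - 33 the two forms are the same. One consequence is that it is its own inverse, which the sketch below checks. It is in Javascript to match the original function, and the name `rot47` and the sample string are made up for the example.

```javascript
// Rotate printable ASCII (33..126) by 47 places, leave everything else alone.
function rot47(text) {
    var out = [];
    for (var i = 0; i < text.length; i++) {
        var j = text.charCodeAt(i);
        if ((j >= 33) && (j <= 126)) {
            out[i] = String.fromCharCode(33 + ((j + 14) % 94));
        } else {
            out[i] = String.fromCharCode(j);
        }
    }
    return out.join('');
}

// Invented sample; the space and the line feed pass through unchanged.
var sample = "var x = 1;\n";
var encoded = rot47(sample);

console.log(encoded !== sample);        // prints true
console.log(rot47(encoded) === sample); // prints true
```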

… Pointless ?

If the point of the obfuscation is that re-obfuscation will result in the hash of the PDF changing then, as I have already observed, there are much much easier ways to achieve the same effect.

If the point of the obfuscation is to conceal the existence of certain strings that could be used to identify the PDF as malicious it is only necessary if the assumption is that something has

  1. parsed the PDF file

  2. extracted Object 1 into a usable state which so far has always required inflating it twice

  3. parsed the XML of the XFA form and identified the different elements

After step 1 the structure of the PDF is apparent and can be considered characteristic of this particular PDF.

After step 2 the ratio of the original to the final size of Object 1 is apparent and can also be considered to be characteristic of this particular PDF.

After step 3 another characteristic of this particular PDF is apparent without even looking at the Javascript elements.
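
To give a feel for the size ratio from step 2, here is a minimal sketch using the zlib module that ships with Node.js. It assumes the stream is ordinary zlib (deflate) data applied twice, which is an assumption on my part rather than something shown above, and the sample content is invented: it is just repetitive text standing in for Object 1.

```javascript
var zlib = require('zlib');

// Invented stand-in for the usable content of Object 1.
var content = Buffer.from("32/".repeat(10000), 'ascii');

// Deflate twice to mimic a stream that has to be inflated twice.
var stored = zlib.deflateSync(zlib.deflateSync(content));

// Inflate twice to get back to a usable state.
var usable = zlib.inflateSync(zlib.inflateSync(stored));

// Ratio of the inflated size to the stored size.
var ratio = usable.length / stored.length;

console.log(usable.equals(content)); // prints true
console.log(ratio);
```

The exact number depends entirely on the content, which is the point: for a given file it is a cheap measurement to take once the stream has been inflated.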

To misquote Mae West, is that an image in your form or are you just pleased to see me ?

The content of the form is dominated by one thing, an image. To all intents and purposes that’s all it is.

If whatever it is has got this far it can be about 99% certain that it is looking at an instance of this particular malicious PDF.

It can increase the certainty still further by inspecting the image without even decoding it.

In short, unless you can conceal the elephant in the form, there is no need to obfuscate the Javascript other than for amusement.

Postscript (Not The Language)

Even if it was a valid PDF I don’t think this one would do what it was intended to do either.

Copyright (c) 2014 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.

