Just An Application

November 16, 2014

Swift vs. The Compound File Binary File Format (aka OLE/COM): Part Three — Now Read Your Header

The 512 byte header of a compound file can be represented as a Swift struct like this

    struct FlatFileHeader {
        let signature               : EightBytes
        let clsid                   : CLSID
        let minor                   : UInt16
        let major                   : UInt16
        let byteOrder               : UInt16
        let sectorShift             : UInt16
        let miniSectorShift         : UInt16
        let reserved                : SixBytes
        let nDirSectors             : UInt32
        let nFATSectors             : UInt32
        let firstDirSector          : UInt32
        let xactionSig              : UInt32
        let miniStreamCutoffSize    : UInt32
        let firstMiniFATSector      : UInt32
        let nMiniFATSectors         : UInt32
        let firstDIFATSector        : UInt32
        let nDIFATSectors           : UInt32
        let difat                   : DIFAT
    }

It is effectively a straight transcription from the specification, with four exceptions.

In the specification the first field is defined as

    Header Signature (8 bytes): ... 

the second field is defined as

    Header CLSID (16 bytes): ... 

the eighth field is defined as

    Reserved (6 bytes): ... 

and the last field is defined as

    DIFAT (436 bytes): ... 

In all these cases the field could be represented as


but that fails to capture the exact size of each field, so we do this instead.

We represent the ‘Header Signature’ using the struct EightBytes which looks something like this

    struct EightBytes {
        let b0 : UInt8
        let b1 : UInt8
        let b2 : UInt8
        let b3 : UInt8
        let b4 : UInt8
        let b5 : UInt8
        let b6 : UInt8
        let b7 : UInt8
    }
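
The listing later in this post indexes the signature as signature[i], which implies that EightBytes also has a subscript. The post does not show it, so the following is only a minimal sketch of one way it might be written, in current Swift syntax. The subscript body and the out-of-range behaviour are assumptions, not the original code.

```swift
struct EightBytes {
    let b0, b1, b2, b3, b4, b5, b6, b7: UInt8

    // Hypothetical subscript: maps an index in 0...7 to the matching field.
    subscript(index: Int) -> UInt8 {
        switch index {
        case 0: return b0
        case 1: return b1
        case 2: return b2
        case 3: return b3
        case 4: return b4
        case 5: return b5
        case 6: return b6
        case 7: return b7
        default: preconditionFailure("EightBytes index out of range: \(index)")
        }
    }
}

// The signature bytes given in the specification.
let signature = EightBytes(b0: 0xD0, b1: 0xCF, b2: 0x11, b3: 0xE0,
                           b4: 0xA1, b5: 0xB1, b6: 0x1A, b7: 0xE1)

// Collect the bytes through the subscript.
let values = (0 ..< 8).map { signature[$0] }
print(values)   // prints [208, 207, 17, 224, 161, 177, 26, 225]
```

A switch keeps the struct free of any array storage, so it stays exactly eight bytes of plain values.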

We represent the ‘Header CLSID’ using the struct CLSID which looks something like this

    struct CLSID {
        let first   : EightBytes
        let second  : EightBytes
    }

We represent the ‘Reserved’ field using the struct SixBytes which looks something like this

    struct SixBytes {
        let b0 : UInt8
        let b1 : UInt8
        let b2 : UInt8
        let b3 : UInt8
        let b4 : UInt8
        let b5 : UInt8
    }

The DIFAT field is not really 436 bytes but 109 32-bit integers which we can represent using the struct DIFAT which looks something like this

    struct DIFAT {
        let i0  : UInt32
        let i1  : UInt32
    }

At the moment it only represents the first two values but it can be ‘extended’ if necessary.

The result of using this seemingly random combination of rather odd structures is that the struct FlatFileHeader is indeed ‘flat’, which is to say that all its fields are value types. They are in fact all structs.

Bearing in mind that the compound file format is little endian and so is this computer, if we assume that the Swift compiler

  1. represents the values of the UInt<N> types by the exact number of bytes necessary when the value is contained in a struct,

  2. represents the fields in exactly the same order that they were defined and without padding,

  3. does the same recursively with the nested struct values, and

  4. ensures that the memory allocated for the struct at runtime is at least 4 byte aligned

then, not at all accidentally, the representation of the struct in memory would be identical to the representation of the header in the compound file, and vice-versa.
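
These assumptions can be checked rather than taken on trust. The following is a minimal sketch in current Swift using MemoryLayout, which is not what the post itself does. The helper offset(_:) is a hypothetical name introduced here, the structs are abbreviated copies of the ones above, and the results reflect what current compilers do in practice rather than a guarantee made by the language.

```swift
struct EightBytes { let b0, b1, b2, b3, b4, b5, b6, b7: UInt8 }
struct SixBytes   { let b0, b1, b2, b3, b4, b5: UInt8 }
struct CLSID      { let first, second: EightBytes }
struct DIFAT      { let i0, i1: UInt32 }   // two entries only, as in the post

struct FlatFileHeader {
    let signature: EightBytes
    let clsid: CLSID
    let minor, major, byteOrder, sectorShift, miniSectorShift: UInt16
    let reserved: SixBytes
    let nDirSectors, nFATSectors, firstDirSector, xactionSig: UInt32
    let miniStreamCutoffSize, firstMiniFATSector, nMiniFATSectors: UInt32
    let firstDIFATSector, nDIFATSectors: UInt32
    let difat: DIFAT
}

// Hypothetical helper: byte offset of a stored field, or -1 if unavailable.
func offset(_ keyPath: PartialKeyPath<FlatFileHeader>) -> Int {
    return MemoryLayout<FlatFileHeader>.offset(of: keyPath) ?? -1
}

// Assumptions 1 to 3: exact sizes, declaration order, no padding.
print("minor at", offset(\FlatFileHeader.minor))              // 24 = 8 + 16
print("reserved at", offset(\FlatFileHeader.reserved))        // 34
print("nDirSectors at", offset(\FlatFileHeader.nDirSectors))  // 40
print("difat at", offset(\FlatFileHeader.difat))              // 76
print("size", MemoryLayout<FlatFileHeader>.size)              // 84 with a two entry DIFAT

// Assumption 4: the alignment the struct requires.
print("alignment", MemoryLayout<FlatFileHeader>.alignment)    // 4

// With all 109 DIFAT entries the struct would span the full header.
let fullSize = offset(\FlatFileHeader.difat) + 109 * MemoryLayout<UInt32>.size
print("full header size", fullSize)                           // 512
```

The offsets agree with the specification only because every field happens to fall on a multiple of its own size, so the compiler has no reason to insert padding.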

It is the vice-versa case which is of interest, since it would imply that if we had an NSData object containing at least the first 512 bytes of a compound file then we could ‘read’ the header like this

    let flatHeader = UnsafePointer<FlatFileHeader>(data.bytes).memory

This is not necessarily the piece of insane optimism that it might at first appear.

Given the seamless interworking between Swift and Objective-C it would make a great deal of sense if at runtime a Swift struct meeting the right criteria was identical to the equivalent Objective-C struct.
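
The UnsafePointer initializer and the memory property used in the one-line read are Swift 1.x API and do not compile with current compilers. The following is a minimal sketch of the same idea in current Swift, assuming a little endian machine. A hand-built byte array stands in for the contents of the NSData object, and the values written into it are made up to match a version 3 header; none of this is the post's own code.

```swift
struct EightBytes { let b0, b1, b2, b3, b4, b5, b6, b7: UInt8 }
struct SixBytes   { let b0, b1, b2, b3, b4, b5: UInt8 }
struct CLSID      { let first, second: EightBytes }
struct DIFAT      { let i0, i1: UInt32 }   // two entries only, as in the post

struct FlatFileHeader {
    let signature: EightBytes
    let clsid: CLSID
    let minor, major, byteOrder, sectorShift, miniSectorShift: UInt16
    let reserved: SixBytes
    let nDirSectors, nFATSectors, firstDirSector, xactionSig: UInt32
    let miniStreamCutoffSize, firstMiniFATSector, nMiniFATSectors: UInt32
    let firstDIFATSector, nDIFATSectors: UInt32
    let difat: DIFAT
}

// Stand-in for the first 512 bytes of a file (an assumption for illustration).
var bytes = [UInt8](repeating: 0, count: 512)
bytes.replaceSubrange(0 ..< 8, with: [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1])
bytes[24] = 0x3E                        // minor version, low byte first
bytes[26] = 0x03                        // major version
bytes[28] = 0xFE; bytes[29] = 0xFF      // byte order mark 0xFFFE
bytes[30] = 9                           // sector shift
bytes[32] = 6                           // mini sector shift
bytes[44] = 1                           // number of FAT sectors

// Refuse to read past the end of the buffer.
precondition(bytes.count >= MemoryLayout<FlatFileHeader>.size)

// Reinterpret the start of the buffer as a FlatFileHeader value.
let header = bytes.withUnsafeBytes { buffer in
    buffer.load(as: FlatFileHeader.self)
}

print(header.major, header.minor, header.byteOrder)     // prints 3 62 65534
print(header.sectorShift, header.miniSectorShift)       // prints 9 6
print(header.nDirSectors, header.nFATSectors)           // prints 0 1
```

load(as:) expects the buffer to be suitably aligned for the type, which holds for array storage. For memory that may not be aligned, loadUnaligned(as:) in recent Swift versions is the safer choice.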

Running this

    let data = NSData(contentsOfFile:fileName)
    if data == nil {
        exit(1)     // could not read the file
    }
    let nBytes = data!.length
    if nBytes < CFBFFormat.HEADER_SIZE {
        exit(1)     // too short to contain a header
    }
    let flatHeader = UnsafePointer<FlatFileHeader>(data!.bytes).memory
    print("Signature:\t\t\t")
    for i in 0 ..< 8 {
        print("\(flatHeader.signature[i]) ")
    }
    println()
    println("Major:\t\t\t\t\(flatHeader.major)")
    println("Minor:\t\t\t\t\(flatHeader.minor)")
    println("Byte order:\t\t\t\(flatHeader.byteOrder)")
    println("Sector shift:\t\t\(flatHeader.sectorShift)")
    println("MiniSector shift:\t\(flatHeader.miniSectorShift)")
    println("N dir sectors:\t\t\(flatHeader.nDirSectors)")
    println("N FAT sectors:\t\t\(flatHeader.nFATSectors)")

prints this

    Signature:          208 207 17 224 161 177 26 225
    Major:              3
    Minor:              62
    Byte order:         65534
    Sector shift:       9
    MiniSector shift:   6
    N dir sectors:      0
    N FAT sectors:      1

The specification gives the signature bytes as

    0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1

so we appear to have ‘read’ the header successfully.

We can make some additional checks on the fields with predefined values.

Major version is 3 in which case the specification says the minor version should be 0x003E which it is.

Byte order should be 0xFFFE which it is.

The sector shift is correct, as is the minisector shift.

The number of directory sectors in a version 3 file is always 0, and it is.
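
The checks just listed can also be expressed in code. The following is a minimal sketch in current Swift; HeaderFields and failedChecks are hypothetical names introduced for illustration, the struct holds only the fields being checked, and the expected shift values 9 and 6 are the ones shown in the output above for a version 3 file.

```swift
// Hypothetical container for just the header fields that have predefined values.
struct HeaderFields {
    let signature: [UInt8]
    let minor, major, byteOrder, sectorShift, miniSectorShift: UInt16
    let nDirSectors: UInt32
}

let expectedSignature: [UInt8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1]

// Returns the name of every check that fails; an empty array means all passed.
func failedChecks(_ h: HeaderFields) -> [String] {
    var failures: [String] = []
    if h.signature != expectedSignature { failures.append("signature") }
    if h.byteOrder != 0xFFFE { failures.append("byte order") }
    if h.miniSectorShift != 6 { failures.append("mini sector shift") }
    if h.major == 3 {
        if h.minor != 0x003E { failures.append("minor version") }
        if h.sectorShift != 9 { failures.append("sector shift") }
        if h.nDirSectors != 0 { failures.append("number of directory sectors") }
    } else {
        // Only the version 3 rules discussed in the post are covered here.
        failures.append("major version")
    }
    return failures
}

// The values printed earlier in the post.
let fromThePost = HeaderFields(signature: expectedSignature,
                               minor: 62, major: 3, byteOrder: 65534,
                               sectorShift: 9, miniSectorShift: 6,
                               nDirSectors: 0)
print(failedChecks(fromThePost))   // prints []
```

Returning the list of failures rather than a single Bool makes it easier to see which field is wrong when a file is rejected.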

All done with nary a getUInt16 or a getUInt32 in sight.

Copyright (c) 2014 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.

