Grammar for dissector outlook v0.2

In a previous post, we defined security rules on the SSL protocol in order to block the Heartbleed attack. This post presents a v0.2 feature: the grammar used to specify the SSL protocol. Our grammar can parse binary-based as well as text-based protocols.

This post will focus on the dissection of the ClientHello handshake.

ClientHello Handshake Structure

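The ClientHello handshake is defined in RFC 6101. The structure below is annotated with field sizes, and its optional extensions block comes from the later TLS specifications (for example RFC 5246):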

struct {
    // Preceded by the handshake header:
    //     handshake type (1 byte)
    //     length (3 bytes)
    ProtocolVersion client_version;
        // version (2 bytes)
    Random random;
        // unixtime (4 bytes)
        // random data (28 bytes)
    SessionID session_id;
        // length (1 byte)
        // session id (length bytes, may be empty)
    CipherSuite cipher_suites;
        // length (2 bytes)
        // list of cipher suite ids, 2 bytes each (length bytes in total)
    CompressionMethod compression_methods;
        // length (1 byte)
        // list of compression method ids (length bytes in total)
    select (extensions_present) {   // optional field
    };
} ClientHello;

Creating the dissector

We start by creating a new grammar ssl:

ssl_dissector.grammar = haka.grammar.new('ssl', function ()
    (...)
end)

Introducing record, field and number elements

A packet is a record of fields, so the grammar provides the record and field keywords. A field has a name and a type. One of those types is number, which takes a number of bits as its argument and returns the value read. We can now write the first lines of the dissection:

local handshake = record{
    field('type', number(8)),
    field('length', number(24)),
	...
}

We can now access type and length fields.
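
To make the bit widths concrete, here is a minimal Python sketch, independent of Haka, that reads the same two fields from raw bytes. The function name parse_handshake_header and the sample bytes are assumptions made for illustration. They are not part of the Haka API.

```python
def parse_handshake_header(data: bytes) -> dict:
    """Read the 8-bit type and the 24-bit big-endian length."""
    if len(data) < 4:
        raise ValueError("handshake header needs at least 4 bytes")
    return {
        "type": data[0],                             # number(8)
        "length": int.from_bytes(data[1:4], "big"),  # number(24)
    }


# Hypothetical header: type 1 (ClientHello), body length 0x00002D
header = parse_handshake_header(bytes([0x01, 0x00, 0x00, 0x2D]))
print(header["type"], header["length"])
```

SSL fields are sent in network byte order, which is why the 24-bit length is decoded as big-endian.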

Branch element

At this point, SSL packet dissection becomes conditional. If type == 1, it is a ClientHello handshake. If type == 2, it is a ServerHello handshake, and so on (all types are defined in the RFC). The dissection therefore has to follow a path chosen from a previously parsed field (type). To that end, we use the branch element:

handshake = record{
    field('type', number(8)),
    field('length', number(24)),
    branch(
        {
            [1] = client_hello_handshake,
            [2] = server_hello_handshake,
            [11] = certificate_handshake,
            (...)
        },
        function(self) return self.type end
    )
}

The function passed as the last argument returns the type, which selects the branch to follow.
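
The same idea can be sketched in plain Python as a lookup table keyed on the parsed type. This is only an illustration of the dispatch pattern, not how Haka implements branch. The handler functions and the BRANCHES table are hypothetical names.

```python
def parse_client_hello(body: bytes) -> str:
    return "client_hello"      # placeholder for the real dissection


def parse_server_hello(body: bytes) -> str:
    return "server_hello"


def parse_certificate(body: bytes) -> str:
    return "certificate"


# Maps the handshake type to the sub-parser to follow.
BRANCHES = {
    1: parse_client_hello,
    2: parse_server_hello,
    11: parse_certificate,
}


def dissect_handshake(data: bytes) -> str:
    msg_type = data[0]
    length = int.from_bytes(data[1:4], "big")
    body = data[4:4 + length]
    handler = BRANCHES.get(msg_type)
    if handler is None:
        raise ValueError(f"unsupported handshake type {msg_type}")
    return handler(body)


print(dissect_handshake(bytes([0x01, 0x00, 0x00, 0x00])))
```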

Dissection of client_hello_handshake

When we parse the ClientHello handshake, the handshake type and length have already been read, so the ClientHello starts at the version field:

client_hello_handshake = record{
    field('version', number(16)),
    field('random', hello_random),
    field('sessionid', hello_sessionid),
    field('ciphersuites', hello_ciphersuites),
    field('compressionmethods', hello_compressionmethods)
}

The client_hello_handshake is made of five mandatory fields and a sixth, optional field (extensions). The first field is named version and has a fixed size of 16 bits, hence the number(16). All other fields are defined next. For the sake of clarity, we do not show how to parse the optional extensions list.

Building complex structure

The random field is a collection of two fixed-size fields. We define it as another record:

hello_random = record{
    field('unixtime', number(32)),
    field('random', bytes():count(28))
}

These fields are available as data.random.unixtime (a number) and data.random.random (raw data), respectively.
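
As a byte-level illustration of this nested record, the following Python sketch (independent of Haka) splits the 32 bytes into a timestamp and 28 opaque bytes. The function name parse_hello_random and the sample timestamp are assumptions.

```python
import struct


def parse_hello_random(data: bytes, offset: int = 0):
    """Return the parsed fields and the offset just past them."""
    (unixtime,) = struct.unpack_from(">I", data, offset)  # number(32)
    random_bytes = data[offset + 4:offset + 32]           # 28 raw bytes
    if len(random_bytes) != 28:
        raise ValueError("random data must be 28 bytes long")
    return {"unixtime": unixtime, "random": random_bytes}, offset + 32


# Hypothetical sample: an arbitrary timestamp followed by 28 bytes
sample = struct.pack(">I", 1400000000) + bytes(range(28))
fields, next_offset = parse_hello_random(sample)
print(fields["unixtime"], len(fields["random"]), next_offset)
```

Returning the next offset lets a caller chain this parser with the ones for the following fields.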

Variable length fields

The sessionid field has a variable length:

hello_sessionid = record{
    field('length', number(8)),
    field('sessionid', bytes()
        :count(function (self) return self.length end)
    )
}

We use a function to compute the count. It reads the length from the field of the same name parsed just before. Note that such a function can perform any kind of computation or data manipulation.
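
The length-prefixed pattern can be sketched in plain Python as follows. This is an illustration only, with hypothetical names and sample bytes, and it is not Haka code.

```python
def parse_session_id(data: bytes, offset: int = 0):
    """Read a 1-byte length, then that many bytes of session id."""
    length = data[offset]            # number(8)
    start = offset + 1
    end = start + length
    if end > len(data):
        raise ValueError("session id is truncated")
    return {"length": length, "sessionid": data[start:end]}, end


# Hypothetical sample: a 3-byte session id followed by unrelated data
session, next_offset = parse_session_id(bytes([3, 0xAA, 0xBB, 0xCC, 0xFF]))
print(session["sessionid"].hex(), next_offset)

# A zero length yields an empty session id
empty, _ = parse_session_id(bytes([0]))
```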

Array fields

The ciphersuites field is an array of values. The array is defined as:

hello_ciphersuites = record{
    field('length', number(16)),
    field('ciphersuites', array(ciphersuite)
        :count(function (self) return self.length/2 end)
    )
}

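The length is expressed in bytes and each cipher suite identifier takes 2 bytes, hence the count of length/2. Here, ciphersuite is a record of all available cipher suites.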
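
To show why the element count is the byte length divided by two, here is a small Python sketch, independent of Haka, that decodes such an array. The function name and the sample bytes are assumptions chosen for illustration.

```python
import struct


def parse_cipher_suites(data: bytes, offset: int = 0):
    """Read a 2-byte length in bytes, then length/2 16-bit identifiers."""
    (length,) = struct.unpack_from(">H", data, offset)  # number(16)
    if length % 2 != 0:
        raise ValueError("cipher suite list length must be even")
    count = length // 2               # each identifier takes 2 bytes
    start = offset + 2
    if start + length > len(data):
        raise ValueError("cipher suite list is truncated")
    suites = list(struct.unpack_from(f">{count}H", data, start))
    return suites, start + length


# Hypothetical sample: length 4, then identifiers 0x002F and 0x0035
suites, next_offset = parse_cipher_suites(bytes.fromhex("0004002f0035"))
print([hex(s) for s in suites], next_offset)
```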

The last field, compression methods, is defined exactly like the cipher suites: a length followed by an array of compression methods. It is not shown here for brevity.

Conclusion

This post is a short introduction to the dissection of a binary protocol. Note that writing the dissection is almost as easy as copying and pasting the initial structure.

We were able to write most of our dissectors with this grammar. If you want to see a binary dissector, see modules/protocol/ipv4/udp.lua in the develop branch. For a text-based protocol, you can see modules/protocol/http/http.lua. These developments have shown us that writing a dissector with this grammar is easier than writing it in C.

A full explanation of the Haka grammar will be provided in release 0.2, along with a tutorial. Stay tuned: the next blog post will explain how the state machine works.