Dumb and obvious thoughts about parsing binary protocols

6 thoughts
last posted May 14, 2014, 6:42 a.m.

2 earlier thoughts

0

Wrong assumption number one

Validating the start of data, and then reading forward for the fixed packet/data length

Why this is wrong

Corruption comes in all flavours, including too-short messages in the datastream. Validate with a window of n>=2 that reading your normal message length doesn't eat into the start of a next message.

Example

Assuming our first 4 bytes identify a message, and writing some python-esque pseudocode (where function or variable names are not written, consider them already implemented for the example case)

def validate(data):
    messageOffsets = findMessages(data)
    if messageOffsets[1] < messageLength:
        return badMessage
    else:
        """ validation here """
        if data[0:4] == '\x01\x02\x03\x04':
            return data[:messageLength]

3 later thoughts