When building the initial version of Eider, I took an approach similar to the NASDAQ's ITCH and OUCH protocols since I had no need for variable length data or null data. This article is a look into some problems and approaches to dealing with variable length data and null data - both from existing in memory solutions such as Simple Binary Encoding (SBE) and Cap'n Proto, and those used in database engines.
Sometimes in distributed systems we need to move complex data around - data which is made with variable length fields, repeating groups, or even repeating groups of repeating groups containing variable length data. Two common ways to deal with these are to use pointers or encode sequentially.
When working with pointers, the buffer data will contain specific offsets of data which are known (by fixed offset, or some other indicator) to be pointers to actual data. This offers some flexibility over sequentially written data, but brings its own complexities.
Below is a logical representation of an encoding with pointers. Offsets containing letters are pointers to data offsets, which are numbered.
If the code needed to read the first value of the third repeating group (offset 8 in the diagram), then decoder has to:
- first read the value in offset A to get the starting offset for the repeating group.
- then move to the third offset (offset D in the diagram), and then has to read the pointer to get the offset for the first repeating group.
- it then jumps to offset G, which then points to the start of the repeating group.
- it can then finally read the value in offset 8, which is the first value of the third repeating group.
Things go horribly wrong with the encoder though if offset G's pointer is set (via bug, network issue or malicious actor) to point to offset D again. If that were to happen, then the codec would have to read offset B, which points to G again - and the codec would be stuck in an infinite loop.
Decoders such as Cap'n Proto are required to deal with this - and typically solve it by ensuring stack depth doesn't exceed a certain limit and that pointers can't go backwards. Sequential encoders such as SBE do this without pointers, and are not subject to this problem. Where SBE is problematic though is with updates to variable field length fields.
Data corruption becomes a risk when not using pointers and allowing updates to variable length fields. For example, here's a simple example of how a variable length Field 1 can silently corrupt Field 2 by being updated with extra data:
SBE suffers from this problem. One option to solve this is to create a copy of the buffer, shifting the data around to make space. This is (relatively) computationally expensive - and can create garbage, and something that should be avoided in high performance systems. Pointers can solve this problem, although at the expense of dead space in the buffer.
When using pointers, and allowing updates to variable length data, there is a risk of dead space. In the example below, letters are used for offsets containing pointers, and the data is held in numeric offsets. Initially, Field 1 is written, with the offset in block A pointing to the variable length data in offsets 1, 2 and 3. Field 2 is then written, taking the space in the buffer immediately after Field 1. Field 2's pointer is in offset B, and data in offsets 5 and 6. When Field 1 is updated with additional content, the codec must find space. Since there isn't sufficient space in the existing buffer allocated, it has to consume new space after Field 2. The pointer in offset A is updated to point to the data in offsets 7 to 12. Offsets 1, 2 and 3 no longer have anything pointing to them, and are now dead space in the buffer.
Solving the dead space in the buffer requires a defragmentation process, which can be too expensive in high performance systems but is common in database systems.
When sending data on the network, there is no concept of something being 'null'. For example, if we were to decide to encode a Java nullable Long type with a null value as a
0x00000000, then how would we encode the value
There are four approaches I've seen to this:
- Have a dedicated bitset with a value for each field. If the field is null, set it to 1; if the field is not null, set it to 0. This then needs to be decoded first, and becomes a piece of state within the flyweight (which implies it is no longer a flyweight). This approach is seen in some database engines.
- Use a generated or user defined magic value to represent null. This approach is seen in Simple Binary Encoding.
- When using pointers, set the pointer to
0if the field is null. This approach is seen in Cap'n Proto.
- When using pointers, have the pointer first point to a value which defines if the field is null. This typically requires a fixed length field - for example a long could be encoded with a single byte to hold the null/not null flag, followed by 8 bytes for the long. If the field is null, the decoder does not read the data and simply returns null.
- Published 18 April 2021