Protocol Generator

Protocol Generator & Serializer

Abstract

In this document, I describe a conceptual protocol implementation used to connect a server and a client. Because the protocol prototype is defined in a C++-style grammar, anyone with C experience can define a protocol easily, and the definition stays readable.

Lex/Yacc is used to tokenize the definition text and to generate the related loader.

Once the definition string is parsed, the generator also emits helper functions that serialize structured memory contents to a binary stream and back.

By adding a code format factory, a serializer could be generated for any target language.

Supported Data Definition Type

Definition name           Data size
int8                      signed 8 bits
uint8                     unsigned 8 bits
int16                     signed 16 bits
uint16                    unsigned 16 bits
int32                     signed 32 bits
uint32                    unsigned 32 bits
int64                     signed 64 bits
uint64                    unsigned 64 bits
float                     32-bit floating point
bool                      Boolean
double                    64-bit floating point
string                    string (dynamic size)
User-defined custom type  sizeof(user type)

Overall Serializer Design

Basically, the serializer works as follows.

Suppose a structure is defined using the types above. Each variable is stored in memory sequentially. A numeric type occupies exactly its own size, but a string has a dynamic size, so its length must be written before the actual string bytes.
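The layout just described can be sketched roughly as follows. This is only an illustration, not the generator's actual output: the LoginRequest structure, the helper names, the 4-byte length prefix and the little-endian byte order are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical message type; the field names are illustrative only.
struct LoginRequest {
    std::int32_t userId;
    std::string  name;
};

// Append a 32-bit value in little-endian order (byte order is an assumption).
void writeU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a 32-bit little-endian value at `pos` and advance `pos`.
// at() throws std::out_of_range if the input is truncated.
std::uint32_t readU32(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(in.at(pos + i)) << (8 * i);
    pos += 4;
    return v;
}

std::vector<std::uint8_t> serialize(const LoginRequest& m) {
    std::vector<std::uint8_t> out;
    writeU32(out, static_cast<std::uint32_t>(m.userId));       // fixed-size numeric field
    writeU32(out, static_cast<std::uint32_t>(m.name.size()));  // string length first...
    out.insert(out.end(), m.name.begin(), m.name.end());       // ...then the string bytes
    return out;
}

LoginRequest deserialize(const std::vector<std::uint8_t>& in) {
    std::size_t pos = 0;
    LoginRequest m;
    m.userId = static_cast<std::int32_t>(readU32(in, pos));
    const std::uint32_t len = readU32(in, pos);
    if (len > in.size() - pos) throw std::out_of_range("truncated string");
    const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
    m.name.assign(first, first + static_cast<std::ptrdiff_t>(len));
    return m;
}
```

With these assumptions, a LoginRequest holding userId 42 and the name "bob" takes 11 bytes: 4 for the number, 4 for the string length, and 3 for the characters.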

Overall Protocol Format

Like many other open protocols, each message consists of several fields.

Total message length: the overall length of the message.
Message ID: distinguishes each protocol from the others.
Byte stream: the serialized memory stream.

You can widen the fields used for the length and the message ID if you need to.

When a data stream is received, the total message length is checked first. If all the bytes have been collected, the message ID is used to pick the proper dispatcher. If not, the partial message is held until the next part of the stream arrives.
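The receive logic can be sketched as below. This is a minimal sketch under several assumptions that the text does not fix: a 4-byte length followed by a 4-byte message ID, little-endian byte order, and a total length that counts the header itself. The FrameReader class and its method names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Assumed header: 4-byte total length + 4-byte message ID.
constexpr std::size_t kHeaderSize = 8;

class FrameReader {
public:
    using Body = std::vector<std::uint8_t>;
    using Handler = std::function<void(const Body&)>;

    // Register the dispatcher for one message ID.
    void on(std::uint32_t messageId, Handler handler) {
        handlers_[messageId] = std::move(handler);
    }

    // Feed newly received bytes. Every complete message is dispatched;
    // leftover bytes stay buffered for the next call.
    // Returns the number of messages handed to a dispatcher.
    std::size_t feed(const std::uint8_t* data, std::size_t size) {
        buffer_.insert(buffer_.end(), data, data + size);
        std::size_t dispatched = 0;
        while (buffer_.size() >= kHeaderSize) {
            const std::size_t total = readU32(0);
            if (total < kHeaderSize) { buffer_.clear(); break; }  // malformed header: drop
            if (buffer_.size() < total) break;                    // incomplete: wait for more
            const std::uint32_t id = readU32(4);
            const auto first = buffer_.begin();
            const auto last = first + static_cast<std::ptrdiff_t>(total);
            Body body(first + static_cast<std::ptrdiff_t>(kHeaderSize), last);
            buffer_.erase(first, last);
            const auto it = handlers_.find(id);
            if (it != handlers_.end()) {  // messages with unknown IDs are skipped
                it->second(body);
                ++dispatched;
            }
        }
        return dispatched;
    }

private:
    // Read a little-endian 32-bit value from the buffer (byte order is an assumption).
    std::uint32_t readU32(std::size_t pos) const {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(buffer_[pos + i]) << (8 * i);
        return v;
    }

    std::vector<std::uint8_t> buffer_;
    std::unordered_map<std::uint32_t, Handler> handlers_;
};
```

Calling feed() with only part of a message dispatches nothing; once the remaining bytes arrive in a later call, the handler registered for that message ID receives the body.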

Functional expansion

The concept introduced so far covers only fixed structures, in which every variable is included and serialized, but it could be extended to support dynamic inclusion of fields. This matters when deploying to mobile environments: a client patch is expensive (app approval time, additional packaging cost, etc.), so it is worth avoiding a client patch that is needed only because a protocol field changed.
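One possible way to get dynamic inclusion is a presence bitmask written before the fields. The text does not say which mechanism is intended, so this is a swapped-in technique for illustration only; the PlayerUpdate structure, the bit assignments and the little-endian byte order are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical message with two optional fields; names are illustrative only.
struct PlayerUpdate {
    bool hasLevel = false;
    std::int32_t level = 0;
    bool hasScore = false;
    std::int32_t score = 0;
};

// One bit per optional field (assumed assignment).
constexpr std::uint8_t kLevelBit = 1u << 0;
constexpr std::uint8_t kScoreBit = 1u << 1;

// Append a 32-bit value in little-endian order (byte order is an assumption).
void writeI32(std::vector<std::uint8_t>& out, std::int32_t value) {
    const std::uint32_t v = static_cast<std::uint32_t>(value);
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a 32-bit little-endian value; at() throws std::out_of_range on truncated input.
std::int32_t readI32(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(in.at(pos + i)) << (8 * i);
    pos += 4;
    return static_cast<std::int32_t>(v);
}

std::vector<std::uint8_t> serialize(const PlayerUpdate& m) {
    std::vector<std::uint8_t> out;
    std::uint8_t mask = 0;
    if (m.hasLevel) mask |= kLevelBit;
    if (m.hasScore) mask |= kScoreBit;
    out.push_back(mask);                     // which fields follow
    if (m.hasLevel) writeI32(out, m.level);  // only present fields are written
    if (m.hasScore) writeI32(out, m.score);
    return out;
}

PlayerUpdate deserialize(const std::vector<std::uint8_t>& in) {
    PlayerUpdate m;
    std::size_t pos = 0;
    const std::uint8_t mask = in.at(pos++);
    if (mask & kLevelBit) { m.hasLevel = true; m.level = readI32(in, pos); }
    if (mask & kScoreBit) { m.hasScore = true; m.score = readI32(in, pos); }
    return m;
}
```

A message that carries only the score is then 5 bytes long (1 mask byte plus 4 value bytes), and the receiver learns from the mask that the level was left out.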

Also, with some extra processing, a RESTful API URL could be generated.

Furthermore, a message could be verified so that a hijacked (partially modified) stream is detected.

Message ID

The message ID is used as a GUID that distinguishes each protocol definition. To generate a unique hash number or string, you could use a well-known hash function such as CRC-32, or a custom function, as long as it avoids collisions (duplicates) among your definitions.

Whichever hash function is used, one can randomize the hashed message string by customizing the structure description.
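As an illustration, a message ID could be derived by running CRC-32 over the definition text. The sketch below uses the standard reflected CRC-32 (polynomial 0xEDB88320, the variant used by zlib and PNG); the function names and the idea of hashing the whole definition string are assumptions made for the example.

```cpp
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
// final XOR 0xFFFFFFFF). Slow but short; a table-driven version is
// the usual choice when speed matters.
std::uint32_t crc32Of(const std::string& text) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : text) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: use the hash of the definition text as the message ID.
std::uint32_t messageIdFor(const std::string& definition) {
    return crc32Of(definition);
}
```

CRC-32 by itself does not rule out collisions between unrelated definitions, so a generator using this approach would still need to check the generated IDs for duplicates.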

String protection

A numeric variable is fine as it is, but sending a string without additional protection is somewhat risky: its bytes map directly to ASCII characters, so anyone who inspects the binary stream can read it as human-readable text.

So, additional protection should be considered.

List or Map type

Suppose you want to support a list or map type. A list or map contains a certain number of elements, so the total count must be written first; the decoder then knows exactly how many items to read from the memory stream.

If you define a List variable of int32 type, you could write it as follows:

List<int32> someList;

Let's add some values, e.g. {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.

The variable 'someList' now has 10 items, so it is serialized as the item count followed by the 10 values.
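A rough sketch of that encoding is shown below. The 4-byte count, the little-endian byte order and the function names are assumptions; the text only states that the count comes first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 32-bit value in little-endian order (byte order is an assumption).
void writeU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a 32-bit little-endian value; at() throws std::out_of_range on truncated input.
std::uint32_t readU32(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(in.at(pos + i)) << (8 * i);
    pos += 4;
    return v;
}

// Write the element count first, then each element in order.
std::vector<std::uint8_t> serializeList(const std::vector<std::int32_t>& items) {
    std::vector<std::uint8_t> out;
    writeU32(out, static_cast<std::uint32_t>(items.size()));
    for (std::int32_t item : items) writeU32(out, static_cast<std::uint32_t>(item));
    return out;
}

// Read the count, then exactly that many elements.
std::vector<std::int32_t> deserializeList(const std::vector<std::uint8_t>& in) {
    std::size_t pos = 0;
    const std::uint32_t count = readU32(in, pos);
    std::vector<std::int32_t> items;
    for (std::uint32_t i = 0; i < count; ++i) {
        items.push_back(static_cast<std::int32_t>(readU32(in, pos)));
    }
    return items;
}
```

Under these assumptions, the 10-item list above takes 44 bytes: 4 for the count and 4 for each of the 10 values. A map could follow the same pattern, with the count followed by key/value pairs.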

Detailed Implementation

Coming Soon

Still being written… the following will be added soon:

compressed packaged stream
memory map of custom data type
