Protocol Generator & Serializer
Abstract
In this document, I describe a conceptual protocol implementation used to connect a server and a client. Because the protocol prototype is defined in a C++-style grammar, anyone with C experience can define a protocol easily, and the definitions stay readable.
Lex/Yacc is used to tokenize the definition text and to generate the related loader.
Once the definition string is parsed, helper functions are also generated for serializing structured memory contents to a binary stream and vice versa.
By adding a code format factory, a serializer could be generated for any language.
Supported Data Definition Type
Definition name | Data Size |
int8 | signed 8 bits |
uint8 | unsigned 8 bits |
int16 | signed 16 bits |
uint16 | unsigned 16 bits |
int32 | signed 32 bits |
uint32 | unsigned 32 bits |
int64 | signed 64 bits |
uint64 | unsigned 64 bits |
float | 32 bits float |
bool | Boolean |
double | 64 bits float |
string | string |
User defined custom type | sizeof(user type) |
Overall Serializer Design
The serializer basically works as follows.
Suppose a structure is defined with numeric and string members. Each variable is stored in memory sequentially. A numeric type occupies exactly its fixed size, but a string has a dynamic size, so its length must be written before the actual string bytes.
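The sequential layout described above can be sketched in C++ as follows. This is only a minimal illustration, not the generated code itself: the `LoginRequest` structure, its field names, the little-endian byte order, and the 2-byte string length prefix are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Hypothetical message; the fields are illustrative only.
struct LoginRequest {
    std::uint32_t userId;
    std::int16_t  level;
    std::string   name;
};

// Append an unsigned integer of `bytes` width, least significant byte first
// (little-endian is an assumption of this sketch).
inline void writeUInt(Buffer& out, std::uint64_t value, std::size_t bytes) {
    for (std::size_t i = 0; i < bytes; ++i)
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

// Read an unsigned integer of `bytes` width and advance the cursor.
inline std::uint64_t readUInt(const Buffer& in, std::size_t& pos, std::size_t bytes) {
    assert(pos + bytes <= in.size());
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < bytes; ++i)
        value |= static_cast<std::uint64_t>(in[pos++]) << (8 * i);
    return value;
}

// A string is written as its length (assumed 2 bytes) followed by its bytes.
inline void writeString(Buffer& out, const std::string& s) {
    assert(s.size() <= 0xFFFF);
    writeUInt(out, s.size(), 2);
    out.insert(out.end(), s.begin(), s.end());
}

inline std::string readString(const Buffer& in, std::size_t& pos) {
    const std::size_t len = static_cast<std::size_t>(readUInt(in, pos, 2));
    assert(pos + len <= in.size());
    std::string s(reinterpret_cast<const char*>(in.data()) + pos, len);
    pos += len;
    return s;
}

// Fields are written one after another, in declaration order.
inline Buffer serialize(const LoginRequest& m) {
    Buffer out;
    writeUInt(out, m.userId, 4);
    writeUInt(out, static_cast<std::uint16_t>(m.level), 2);
    writeString(out, m.name);
    return out;
}

// Fields are read back in the same order they were written.
inline LoginRequest deserialize(const Buffer& in) {
    std::size_t pos = 0;
    LoginRequest m;
    m.userId = static_cast<std::uint32_t>(readUInt(in, pos, 4));
    m.level  = static_cast<std::int16_t>(static_cast<std::uint16_t>(readUInt(in, pos, 2)));
    m.name   = readString(in, pos);
    return m;
}
```

With these assumptions, a message holding a three-character name serializes to 4 + 2 + 2 + 3 = 11 bytes, and `deserialize(serialize(m))` returns the original field values.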
Overall Protocol Format
Like many other open protocols, each message consists of several fields.
Total message length: it denotes overall message length.
Message ID: it is used for distinguishing each protocol.
Bytes Stream: serialized memory stream.
The number of bytes used for the length and the message ID can be widened if needed.
Once a data stream is received, the total message length is checked. If all of the bytes have been collected, the message ID is used to pick the proper dispatcher. If not, the data is kept pending until the next stream arrives.
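The receive logic described above could look roughly like the sketch below. The field widths (4 bytes each for length and message ID), the little-endian byte order, the choice that the total length includes the header, and the names `frame` and `FrameReader` are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Assumed header: 4-byte total length + 4-byte message ID.
constexpr std::size_t kHeaderSize = 8;

inline void putU32(Bytes& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

inline std::uint32_t getU32(const Bytes& in, std::size_t pos) {
    assert(pos + 4 <= in.size());
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
    return v;
}

// Build one message: total length, message ID, then the serialized bytes.
inline Bytes frame(std::uint32_t messageId, const Bytes& payload) {
    Bytes out;
    putU32(out, static_cast<std::uint32_t>(kHeaderSize + payload.size()));
    putU32(out, messageId);
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

class FrameReader {
public:
    using Handler = std::function<void(const Bytes&)>;

    // Register the dispatcher for one message ID.
    void on(std::uint32_t messageId, Handler h) { handlers_[messageId] = std::move(h); }

    // Feed whatever arrived; complete messages are dispatched and an
    // incomplete tail stays pending until the next call.
    void feed(const std::uint8_t* data, std::size_t size) {
        pending_.insert(pending_.end(), data, data + size);
        std::size_t pos = 0;
        while (pending_.size() - pos >= kHeaderSize) {
            const std::uint32_t total = getU32(pending_, pos);
            if (total < kHeaderSize) { pending_.clear(); return; }  // malformed length: drop
            if (pending_.size() - pos < total) break;               // wait for more bytes
            const std::uint32_t id = getU32(pending_, pos + 4);
            const auto first = pending_.begin() + static_cast<std::ptrdiff_t>(pos + kHeaderSize);
            const auto last  = pending_.begin() + static_cast<std::ptrdiff_t>(pos + total);
            const Bytes payload(first, last);
            const auto it = handlers_.find(id);
            if (it != handlers_.end()) it->second(payload);
            pos += total;
        }
        pending_.erase(pending_.begin(), pending_.begin() + static_cast<std::ptrdiff_t>(pos));
    }

    std::size_t pendingBytes() const { return pending_.size(); }

private:
    Bytes pending_;
    std::unordered_map<std::uint32_t, Handler> handlers_;
};
```

Feeding the first few bytes of a framed message leaves them pending without calling any handler; feeding the rest triggers the handler registered for that message ID exactly once.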
Functional expansion
The concept introduced so far covers only fixed structures (every variable is included and serialized), but it could be extended to support dynamic inclusion of fields. This matters when deploying to mobile environments: a client patch is expensive (app approval time, additional packaging cost, etc.), so it is better to avoid patching the client just because a protocol field changed.
With some extra processing, RESTful API URLs could also be generated.
Furthermore, a message could be verified in order to detect a hijacked (partially modified) stream.
Message ID
The message ID is used as a GUID for distinguishing each protocol definition. To generate a unique hash number or string, you could use a well-known hash function such as CRC-32, or a custom function, as long as it guarantees that no two definitions collide (duplicate IDs).
Whichever hash function is used, the resulting ID can be varied by customizing the structure description that is fed to it.
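As an illustration of deriving a message ID from a definition string, the sketch below uses the standard reflected CRC-32 (polynomial 0xEDB88320). Since CRC-32 by itself cannot rule out collisions, a small registry rejects a definition whose ID is already taken by a different one. The names `crc32Of` and `MessageIdRegistry`, and the sample definition strings, are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Bitwise CRC-32 (reflected, polynomial 0xEDB88320, initial value and
// final XOR 0xFFFFFFFF), computed over the bytes of the definition text.
inline std::uint32_t crc32Of(const std::string& text) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : text) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Hypothetical generator-side registry: detects two different definitions
// that would end up with the same message ID.
class MessageIdRegistry {
public:
    // Stores the ID in idOut. Returns false if the ID is already used by
    // a different definition string.
    bool add(const std::string& definition, std::uint32_t& idOut) {
        idOut = crc32Of(definition);
        const auto result = ids_.emplace(idOut, definition);
        return result.second || result.first->second == definition;
    }

private:
    std::unordered_map<std::uint32_t, std::string> ids_;
};
```

If `add` reports a collision, one option in line with the text above is to alter the structure description (for example by appending a salt string) and hash it again.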
String protection
Numeric variables can be sent as they are, but sending strings without additional protection is somewhat risky: the bytes map directly to ASCII characters, so anyone inspecting the binary stream could read the strings as plain text.
Additional protection for strings should therefore be considered.
List or Map type
Suppose you want to support a list or map type. A list or map contains some number of elements, so the total count must be written first; the decoder can then read exactly that many items from the memory stream.
If you define a list variable with int32 elements, you could write it as below.
List<int32> someList;
Let’s add some values, e.g. {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.
The variable ‘someList’ now has 10 items and is serialized as its item count followed by the ten values.
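A minimal C++ sketch of this count-prefixed layout for `List<int32>` follows. The 4-byte width of the count, the little-endian byte order, and the function names are assumptions made for the example, not part of the definition above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using ListBytes = std::vector<std::uint8_t>;

inline void appendU32(ListBytes& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

inline std::uint32_t fetchU32(const ListBytes& in, std::size_t& pos) {
    assert(pos + 4 <= in.size());
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(in[pos++]) << (8 * i);
    return v;
}

// Write the item count first (assumed 4 bytes), then every item in order.
inline ListBytes serializeList(const std::vector<std::int32_t>& items) {
    ListBytes out;
    appendU32(out, static_cast<std::uint32_t>(items.size()));
    for (std::int32_t item : items)
        appendU32(out, static_cast<std::uint32_t>(item));
    return out;
}

// Read the count, then exactly that many items.
inline std::vector<std::int32_t> deserializeList(const ListBytes& in) {
    std::size_t pos = 0;
    const std::uint32_t count = fetchU32(in, pos);
    assert((in.size() - pos) / 4 >= count);
    std::vector<std::int32_t> items;
    items.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i)
        items.push_back(static_cast<std::int32_t>(fetchU32(in, pos)));
    return items;
}
```

Under these assumptions, the ten-item `someList` occupies 4 bytes for the count plus 10 × 4 bytes for the values, 44 bytes in total. A map could follow the same pattern, writing the count and then each key followed by its value.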
Detailed Implementation
Coming Soon
Still being written; the following will be added soon:
Compressed packaged stream
Memory map of custom data type