Write serialize and deserialize functions for an array of strings

Problem

Write two functions to serialize and deserialize an array of strings. The strings can contain any Unicode character.
Do not worry about string overflow.

input = ['abdcd', '4agasd-dsfafdas', 'hi there I love you']

output = serialize(input)

deserialize(output) = ['abdcd', '4agasd-dsfafdas', 'hi there I love you']



Basically, you need to decide how to encode the serialized message so that you can deserialize it later.

For simplicity, I decided to encode the messages as below:

   [meta_data_length]>[meta_data with ',' delimiter][concatenated strings]

For simplicity, I use a special delimiter '>' after the metadata length. However, we can avoid the delimiter entirely if we give the first length field a fixed size, such as 64 bits. Again, for simplicity, we will assume '>' is not used in the data.
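The fixed-size alternative could look roughly like the sketch below. It rests on assumptions that are not part of the original solution: it works on UTF-8 bytes rather than str, it stores the metadata length in an 8-byte big-endian field via struct, and the names serialize_fixed and deserialize_fixed are hypothetical.

```python
import struct

HEADER_SIZE = 8  # 64-bit unsigned integer holding the metadata length


def serialize_fixed(strings):
    # Assumption: lengths are counted in UTF-8 bytes, not characters.
    encoded = [s.encode("utf-8") for s in strings]
    meta = ",".join(str(len(b)) for b in encoded).encode("ascii")
    header = struct.pack(">Q", len(meta))  # big-endian, 8 bytes
    return header + meta + b"".join(encoded)


def deserialize_fixed(payload):
    (meta_len,) = struct.unpack(">Q", payload[:HEADER_SIZE])
    meta = payload[HEADER_SIZE:HEADER_SIZE + meta_len].decode("ascii")
    body = payload[HEADER_SIZE + meta_len:]
    lengths = [int(n) for n in meta.split(",")] if meta else []
    result, start = [], 0
    for length in lengths:
        end = start + length
        result.append(body[start:end].decode("utf-8"))
        start = end
    return result
```

Because the header always occupies exactly 8 bytes, no delimiter is needed to find where the metadata begins, so the strings can contain '>' without any escaping.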

To make this assumption hold even when the input contains '>', we use html.escape() and html.unescape() to encode and decode '>' in the data. (UPDATE: 2022-06-13) The original code had a bug: it could not deserialize the data correctly when a string contained the delimiter '>'.
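A quick illustration of the escaping step, using only the standard html module. The sample string is made up for this example. Note that html.escape() also rewrites '&', '<' and quotes, so the escaped string is longer than the raw one, and the lengths stored in the metadata have to be those of the escaped strings.

```python
import html

raw = "a>b & c"                     # made-up sample containing the delimiter
escaped = html.escape(raw)          # 'a&gt;b &amp; c'
restored = html.unescape(escaped)   # back to the raw string

print(escaped)
print(">" in escaped)               # the delimiter no longer appears
print(restored == raw)
```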

Once the serialization format is defined, we can write the two methods according to it.

Here is the working Python code.
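As a minimal sketch of the format described above (not necessarily the exact original listing), the two functions could be written as follows. It assumes that the metadata holds the lengths of the escaped strings and that the header ends at the first '>'.

```python
import html


def serialize(strings):
    # Escape first so that no raw '>' remains in the data.
    escaped = [html.escape(s) for s in strings]
    # Metadata: lengths of the escaped strings, separated by ','.
    meta = ",".join(str(len(s)) for s in escaped)
    return f"{len(meta)}>{meta}{''.join(escaped)}"


def deserialize(payload):
    # The first '>' separates meta_data_length from everything else.
    header, rest = payload.split(">", 1)
    meta_len = int(header)
    meta, body = rest[:meta_len], rest[meta_len:]
    lengths = [int(n) for n in meta.split(",")] if meta else []
    result, start = [], 0
    for length in lengths:
        end = start + length  # end is start + length, not length alone
        result.append(html.unescape(body[start:end]))
        start = end
    return result


data = ['abdcd', '4agasd-dsfafdas', 'hi there I love you']
assert deserialize(serialize(data)) == data
```

For example, serialize(['ab', 'c>d']) produces '3>2,6abc&gt;d': the metadata '2,6' is 3 characters long, and the second string is stored in its escaped, 6-character form.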


Practice statistics:

15:00: to write up the code

8:00: to fix a logical error. I had to debug the code by executing it. The end index for reading the data was calculated incorrectly: it should be s+l instead of l itself.
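The slicing mistake is easy to reproduce in isolation. In the sketch below, s is the start offset and l is the length of the current string, as in the note above. The two function names and the sample data are hypothetical, chosen only to contrast the wrong end index with the right one.

```python
def split_buggy(body, lengths):
    parts, s = [], 0
    for l in lengths:
        parts.append(body[s:l])      # wrong: uses the length as the end index
        s += l
    return parts


def split_fixed(body, lengths):
    parts, s = [], 0
    for l in lengths:
        parts.append(body[s:s + l])  # right: end index is start + length
        s += l
    return parts


print(split_buggy("abcdefg", [2, 3, 2]))  # ['ab', 'c', '']
print(split_fixed("abcdefg", [2, 3, 2]))  # ['ab', 'cde', 'fg']
```

The first string comes out right in both versions because s is 0 there, which is why the bug only shows up from the second string onward.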

UPDATE (2022-06-13): Solved the problem again. I had to spend time figuring out how to avoid the deserialization failure when a data string contains the delimiter that separates meta_data_length from the rest of the message.
After trying several things, I decided to escape the delimiter with html.escape().
