Deserialize Raw Data

From SICDB Doc

Revision as of 16:55, 3 November 2022

Why is this data serialized?

The intermediate data, where minute values are not aggregated, is roughly 240 GB, and it is expected to grow by about 15% with the 2023 update. By aggregating and serializing these values, the database can be compressed to under 10 GB. However, this means the data has to be deserialized before use. While our software can export the minute values with one click, working with the raw data directly requires a bit of coding.

Encoding

The raw data field is a stream of 60 little-endian IEEE 754 single-precision floats (4 bytes each), so it has exactly 240 bytes. The first 4 bytes represent the first minute of the hour, the next 4 bytes the second minute, and so on. Note that four zero bytes (0x00000000) are defined to be NULL (no value).
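To make the layout concrete, the following sketch builds a field in this format from 60 per-minute values. The helper name `encode_raw_field` and the constants are assumptions made for illustration (for example, to create test data); they are not part of SICDB or its tooling.

```python
import struct

MINUTES_PER_HOUR = 60
NULL_BYTES = b"\x00\x00\x00\x00"  # an all-zero slot means "no value"

def encode_raw_field(values):
    """Pack 60 per-minute values (float or None) into a 240-byte field.

    Hypothetical helper for building test data; not part of SICDB.
    """
    if len(values) != MINUTES_PER_HOUR:
        raise ValueError("expected exactly 60 values, one per minute")
    chunks = []
    for v in values:
        # '<f' = little-endian IEEE 754 single precision (4 bytes)
        chunks.append(NULL_BYTES if v is None else struct.pack("<f", v))
    return b"".join(chunks)

sample = [None] * MINUTES_PER_HOUR
sample[0] = 36.5   # first minute of the hour -> bytes 0..3
sample[59] = 37.0  # last minute of the hour  -> bytes 236..239
field = encode_raw_field(sample)
```

One consequence of this encoding is that a float value of exactly 0.0 is also stored as four zero bytes, so it cannot be told apart from NULL.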

Python example

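 import struct
 def GetRawValues(data):
     # data: raw bytes of one field (60 little-endian 4-byte floats)
     ret = []
     for i in range(len(data) // 4):
         #if (data[i*4]==0 and data[i*4+1]==0 and data[i*4+2]==0 and data[i*4+3]==0): continue # uncomment this if you do not want to include null values
         # unpack one little-endian float from the current 4-byte slot
         ret.append(struct.unpack('<f', data[i*4:i*4+4])[0])
     return ret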
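If null values are skipped in the function above, the position in the returned list no longer equals the minute of the hour. A minimal sketch that keeps the minute index alongside each value could look like this; the name `iter_minute_values` is hypothetical and the sample bytes are made up for the example.

```python
import struct

def iter_minute_values(data):
    """Yield (minute, value) pairs for every non-null slot in a raw field."""
    if len(data) % 4 != 0:
        raise ValueError("raw data length must be a multiple of 4 bytes")
    # iter_unpack walks the buffer in 4-byte steps and yields 1-tuples
    for minute, (value,) in enumerate(struct.iter_unpack("<f", data)):
        if data[minute * 4:minute * 4 + 4] == b"\x00\x00\x00\x00":
            continue  # null slot, no value for this minute
        yield minute, value

# Made-up field in which only minutes 0 and 2 carry a value
blob = struct.pack("<f", 1.5) + bytes(4) + struct.pack("<f", -2.0) + bytes(57 * 4)
pairs = list(iter_minute_values(blob))
```

Here `pairs` contains one tuple per non-null minute, so the time offset within the hour survives the filtering.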

C# example

       public static float[] GetRawValues(byte[] data)
       {
           byte[] buf = new byte[4];
           float[] ret = new float[data.Length / 4];
           for (int i = 0; i + 3 < data.Length; i += 4) // stop before a trailing partial slot
           {
               buf[0] = data[i];
               buf[1] = data[i+1];
               buf[2] = data[i+2];
               buf[3] = data[i+3];
               // if (buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 0) continue; // uncomment this to skip null values; skipped entries keep the default 0f because ret has a fixed size
               ret[i / 4] = BitConverter.ToSingle(buf, 0); // note: if you are on a big-endian machine you need to reverse buf first
           }
           return ret;
       }

Build up table