The Sims FreePlay’s SBIN File Format

One of the data formats used by the Sims FreePlay Android game is the file with the *.sb extension.

If you open one in a hex editor, you will find that it starts with the SBIN signature, followed by further blocks of data, each starting with its own signature, such as STRU, FIEL, ENUM, OHDR, DATA, etc.

Apparently the 4-byte signature marks the start of a block of data, and closer examination reveals that every block begins with three common items, which I will call the signature, the block length and the checksum.

Let’s take the STRU block below as an illustration. It is taken from one of the *.sb files, called locales.sb:

sb01

In the picture above, the block starts with 3 (three) data items of 4 bytes each. The first one (red box) is the signature, here 0x53545255, which is STRU. The second one (blue box) is the size in bytes of the data inside this block, here 0x0C. The third (green box) is some kind of checksum of the data block.
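The three header items can be decoded with a few lines of Python. This is only a minimal sketch: the names `BlockHeader` and `read_block_header` are made up for illustration, and it assumes the length and checksum are stored little-endian, which the article does not state.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 12  # signature + length + checksum, 4 bytes each


class BlockHeader(NamedTuple):
    signature: str  # e.g. "STRU"
    length: int     # size of the data portion in bytes
    checksum: int   # checksum of the data portion (algorithm unknown)


def read_block_header(buf: bytes, pos: int = 0) -> BlockHeader:
    """Decode the 12-byte header that starts at `pos`."""
    # Assumption: ASCII signature, little-endian integers.
    # Swap "<" for ">" if the file proves otherwise.
    raw_sig, length, checksum = struct.unpack_from("<4sII", buf, pos)
    return BlockHeader(raw_sig.decode("ascii"), length, checksum)


# Hypothetical bytes: only the signature and the 0x0C length mirror the
# article; the checksum and the data are placeholders.
sample = b"STRU" + struct.pack("<II", 0x0C, 0x12345678) + bytes(0x0C)
header = read_block_header(sample)
print(header.signature, hex(header.length), hex(header.checksum))
```

The data portion of the block then occupies the `length` bytes that follow the header.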

If you change even one byte of the data portion, the stored checksum no longer matches, and it is up to the game developer what happens in that case. For example, the game may only show blank data for some text information.

By reading the block signatures, you can get some idea of what data the file contains. It should contain structures, fields, field names, and the actual data.
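To list the signatures without a hex editor, one could walk the file block by block. The sketch below assumes that blocks follow each other directly (12-byte header, then `length` bytes of data, with no padding) and that integers are little-endian. The offsets quoted in this article are consistent with that layout, but it is still an assumption. The helper names and the demo block sizes are hypothetical.

```python
import struct
from typing import Iterator, Tuple

HEADER_SIZE = 12  # signature + length + checksum


def iter_blocks(buf: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (signature, data_start, data_length) for each block in `buf`."""
    pos = 0
    while pos + HEADER_SIZE <= len(buf):
        raw_sig, length, _checksum = struct.unpack_from("<4sII", buf, pos)
        data_start = pos + HEADER_SIZE
        yield raw_sig.decode("ascii", errors="replace"), data_start, length
        # Assumption: the next block starts right after this block's data.
        pos = data_start + length


def make_block(sig: bytes, data: bytes) -> bytes:
    """Build a demo block with a dummy checksum of zero."""
    return sig + struct.pack("<II", len(data), 0) + data


# Made-up demo file; the block sizes are placeholders, not real game data.
demo = (make_block(b"SBIN", bytes(8))
        + make_block(b"STRU", bytes(0x0C))
        + make_block(b"FIEL", bytes(0x58)))
blocks = list(iter_blocks(demo))
for signature, data_start, length in blocks:
    print(signature, hex(data_start), hex(length))
```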

So, how do these blocks describe the structures, fields, header names and data?

Let’s examine the data portion of the STRU block:

sb02

Each STRU record is 6 (six) bytes long, so the example above contains 2 (two) structures. Each record apparently holds 3 (three) data items of 2 (two) bytes each.

Take the first structure record (red box): the second item is used to determine the location of its fields, and the third item is the total number of fields in the structure.

So, for the first structure (red box), the field location is 0x0000. This is multiplied by 0x8, the size of one field data block, to get the position of the first field's data. The third data item is 0x0002, so the first structure has 2 fields.

The field information of the second structure (blue box) is located at offset 0x0002 times 0x8, which is 0x10, and it has 0x0009, or 9, fields.
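The STRU arithmetic above can be sketched as follows. The function name `parse_stru` is made up, the integers are assumed to be little-endian, and the first 2-byte item of each record is left uninterpreted because its meaning is not covered here. The sample bytes only reproduce the two values discussed for each structure.

```python
import struct

STRU_RECORD_SIZE = 6   # three 2-byte items per structure
FIELD_RECORD_SIZE = 8  # size of one field record in the FIEL block


def parse_stru(data: bytes):
    """Decode every 6-byte record of the STRU data portion."""
    structures = []
    for pos in range(0, len(data) - STRU_RECORD_SIZE + 1, STRU_RECORD_SIZE):
        # Assumption: little-endian 16-bit values.
        first_item, field_index, field_count = struct.unpack_from("<3H", data, pos)
        structures.append({
            "first_item": first_item,  # meaning not determined in the article
            "field_offset": field_index * FIELD_RECORD_SIZE,
            "field_count": field_count,
        })
    return structures


# Placeholder data: the first item of each record is set to 0 because the
# article does not quote its value.
stru_data = struct.pack("<3H", 0, 0x0000, 0x0002) + struct.pack("<3H", 0, 0x0002, 0x0009)
structures = parse_stru(stru_data)
for s in structures:
    print(hex(s["field_offset"]), s["field_count"])
```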

To determine the absolute location of the field information, add this offset to the start of the data portion of the FIEL block, which comes just after the 4-byte checksum. This is illustrated by the picture below:

sb03

In the sample above, the field data starts at offset 0x38, and each field record is 8 (eight) bytes long, as denoted by the red box. This is the first field of the first structure.

To find the first field of the second structure, add the 0x10 calculated above to offset 0x38, and we arrive at 0x48:

sb04

As denoted by the blue box, this is the first field of the second structure. The field information contains a 2-byte field sequence followed by another 2 bytes that denote the field type. The field type determines how a reader obtains the data for each field.
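A short sketch of locating and decoding a field record, under the same little-endian assumption. The helper names are hypothetical, and the field type in the demo buffer is a placeholder since only the sequence number 0x0002 of the first field is quoted in this article.

```python
import struct

FIELD_RECORD_SIZE = 8  # each FIEL record is 8 bytes


def field_record_pos(fiel_data_start: int, first_field_index: int, n: int = 0) -> int:
    """Absolute position of the n-th field of a structure."""
    return fiel_data_start + (first_field_index + n) * FIELD_RECORD_SIZE


def read_field(buf: bytes, pos: int):
    """Return (name_sequence, field_type) from the first 4 bytes of a record."""
    # Assumption: little-endian 16-bit values.
    name_seq, field_type = struct.unpack_from("<HH", buf, pos)
    return name_seq, field_type


# Demo buffer: zero padding up to 0x38, then one made-up field record with
# sequence 0x0002 and a placeholder type of 0x0D.
demo = bytes(0x38) + struct.pack("<HHHH", 0x0002, 0x0D, 0, 0)
first_pos = field_record_pos(0x38, 0)
second_struct_pos = field_record_pos(0x38, 2)
print(hex(first_pos), hex(second_struct_pos), read_field(demo, first_pos))
```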

Take the first field of the first structure: its field sequence number is 0x0002. This sequence number is an index into the data portion of the CHDR block, which is used to determine the field name.

Below is the picture of the CHDR block:

sb05

Here 0x43484452 (CHDR) is the signature, the block is 0x198 bytes long, and 0x4A35B08C is the checksum of the data block. After the checksum come the CHDR detail records, each 8 (eight) bytes long.

In the picture above, the red box is sequence 0 (zero), the blue box is sequence 1 (one), and so on. You can see that sequence 0 contains no information. The first real entry resides in the second record (blue box), which is sequence 1.

Each CHDR detail record consists of 4 bytes that give the offset of the field name inside the CDAT block, and 4 bytes that give the field name length.

Take the offset of sequence 1, which is 0x00000001. This value is added to the start of the data portion of CDAT to get the name for that sequence.

Below is the picture of the CDAT block:

sb06

In the sample above, 0x43444154 (CDAT) is the signature, followed by the 4-byte length (0x00000195) and the 4-byte checksum (0x4FD442C6), so we arrive at offset 0x384 as the start of the field name data. Adding the 0x01 obtained from the CHDR record gives offset 0x385, and taking 0x08 bytes as the length, we get the field name (red box), which is “Language”.
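The CHDR/CDAT lookup can be sketched like this. `lookup_name` is a made-up helper working on the data portions of the two blocks. It assumes little-endian integers and UTF-8 (or ASCII) text, and the demo bytes are hand-built to mimic sequence 0 and sequence 1 rather than copied from the real file. The NUL separators in the demo are also a guess.

```python
import struct

CHDR_RECORD_SIZE = 8  # 4-byte offset + 4-byte length


def lookup_name(chdr_data: bytes, cdat_data: bytes, seq: int) -> str:
    """Resolve a sequence number to its string via CHDR and CDAT."""
    # Assumption: both values are little-endian 32-bit integers.
    offset, length = struct.unpack_from("<II", chdr_data, seq * CHDR_RECORD_SIZE)
    # Assumption: names are stored as UTF-8 (or plain ASCII) text.
    return cdat_data[offset:offset + length].decode("utf-8")


# Hypothetical demo data: sequence 0 is empty, sequence 1 points at
# offset 1 with length 8, as described in the text.
cdat_data = b"\x00Language\x00"
chdr_data = struct.pack("<II", 0, 0) + struct.pack("<II", 1, 8)

print(repr(lookup_name(chdr_data, cdat_data, 0)))
print(repr(lookup_name(chdr_data, cdat_data, 1)))
```

With the real file, the absolute position of a name is the start of the CDAT data (0x384 in the sample) plus the offset read from CHDR.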

So, what are the names of field sequences 0x02 and 0x03? The answer is “languageid” and “localeid”. I will leave deriving this as an exercise for the reader 🙂

The bytes inside the OHDR and DATA blocks are used together to locate the detail records for each table. Below is part of the OHDR block:

sb07

The value 0x4F484452 is the OHDR signature, the block length is 0x0C, and just after the checksum there are 4 bytes (0x00000001) which denote the header type. This value determines the offset from the start of the DATA block's data to the first table information block: for type 0x00 the offset is 2 bytes, for 0x01 it is 4 bytes, and for 0x02 it is 8 bytes.

Below is the picture of the DATA block:

sb08

As usual, the value 0x44415441 is the DATA signature. The data portion starts at offset 0xB4, and because the OHDR header type is 0x01, 0x4 bytes are added to get the location of the first table information block, 0xB8, as denoted by the blue box above.
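The header type rule is small enough to express as a lookup table. The names below are made up for illustration, and only the three header types listed above are covered.

```python
# Bytes to skip from the start of the DATA block's data, per OHDR header type.
OHDR_TYPE_TO_SKIP = {0x00: 2, 0x01: 4, 0x02: 8}


def first_table_info_pos(data_start: int, header_type: int) -> int:
    """Position of the first table information block inside DATA."""
    try:
        skip = OHDR_TYPE_TO_SKIP[header_type]
    except KeyError:
        # Other header types are not described in the article.
        raise ValueError(f"unknown OHDR header type: {header_type:#x}")
    return data_start + skip


pos = first_table_info_pos(0xB4, 0x01)
print(hex(pos))
```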

The table information block has a similar structure to the field information block, so the bytes at offset 0xB8 to 0xB9 (0x000B) are the table name sequence number, which resolves to the “locales” table.

The 4 bytes at offset 0xBC to 0xBF (0x0000000C) are the offset of the table id for this table. This value is added to the start of the DATA block's data, which is 0xB4, to get 0xC0.

Below are the bytes at address 0xC0:

sb09

The blue box is the table sequence number, which will be used to locate the first data block, or record, of the related table. In this example that is the first record of the locales table.

The red box denotes the next table information block, which has 0x0E as its name sequence number, the “languages” table. Again, the offset value 0x00000018 is added to 0xB4 to get 0xCC, which holds 0x00000002, the table sequence number of the “languages” table.
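Reading a table information block and following its offset to the table sequence number might look like the sketch below. `read_table_info` is a hypothetical helper, the values are assumed to be little-endian, and the demo buffer is rebuilt by hand from the offsets quoted above. The two bytes between the name sequence and the offset are not described in this article and are set to zero.

```python
import struct


def read_table_info(buf: bytes, info_pos: int, data_start: int):
    """Return (name_sequence, id_offset, table_id) for one table info block."""
    # Assumption: little-endian values; bytes info_pos+2..+3 are skipped
    # because their meaning is not covered here.
    name_seq = struct.unpack_from("<H", buf, info_pos)[0]
    id_offset = struct.unpack_from("<I", buf, info_pos + 4)[0]
    table_id = struct.unpack_from("<I", buf, data_start + id_offset)[0]
    return name_seq, id_offset, table_id


# Hand-built demo buffer following the addresses in the text.
demo = bytearray(0xD0)
struct.pack_into("<HHI", demo, 0xB8, 0x0B, 0, 0x0C)  # "locales" info block
struct.pack_into("<I", demo, 0xC0, 1)                # its table sequence number
struct.pack_into("<HHI", demo, 0xC4, 0x0E, 0, 0x18)  # "languages" info block
struct.pack_into("<I", demo, 0xCC, 2)                # its table sequence number

locales = read_table_info(demo, 0xB8, 0xB4)
languages = read_table_info(demo, 0xC4, 0xB4)
print(locales, languages)
```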

The table sequence number is then multiplied by 4 to get the location of the table's record detail entry. So for id 0x01 we get 0x04. This value is added to the start of the OHDR data to obtain information about the detail records of the table in question:

sb10

In the picture above, 0x4F484452 is OHDR, and you have already seen that the 4 bytes starting at offset 0x9C are the start of the header information block. So the calculated 0x4 is added to offset 0x9C, and we arrive at 0xA0 (blue box).

The value there is 0xE2. To get the location of the first record data, this value is divided by 0x8, discarding the remainder, so we get 0x1C. The value 0x1C is added to the start of the DATA block's data (0xB4), so we get 0xD0:
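The calculation from table sequence number to record block address can be sketched as follows. The function name is made up, the OHDR entry is assumed to be a little-endian 32-bit value, and the division is an integer division because 0xE2 is not a multiple of 8.

```python
import struct


def record_block_pos(buf: bytes, ohdr_data_start: int, data_start: int, table_id: int) -> int:
    """Locate the record block of a table from its OHDR entry."""
    entry_pos = ohdr_data_start + table_id * 4
    # Assumption: each OHDR entry is a little-endian 32-bit value.
    raw = struct.unpack_from("<I", buf, entry_pos)[0]
    # Integer division: 0xE2 // 8 == 0x1C, the remainder is dropped.
    return data_start + raw // 8


# Demo buffer with only the two OHDR values quoted in the text.
demo = bytearray(0xB4)
struct.pack_into("<I", demo, 0x9C, 0x00000001)  # header type
struct.pack_into("<I", demo, 0xA0, 0xE2)        # entry for table id 1

pos = record_block_pos(demo, 0x9C, 0xB4, 1)
print(hex(pos))
```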

sb11

Each block of data records has a 0x8-byte prefix (red box) that contains information about the block, such as the total number of records, here 0x0000000B, at offset 0x4 thru 0x7 of the prefix. The actual data starts at offset 0xD8.
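A sketch of reading that prefix, assuming the record count is a little-endian 32-bit value. The first four bytes of the prefix are not interpreted here, and the helper name is hypothetical.

```python
import struct

RECORD_PREFIX_SIZE = 8


def read_record_block(buf: bytes, block_pos: int):
    """Return (record_count, first_record_pos) for a record block."""
    # Assumption: the count at prefix offset 4..7 is little-endian.
    record_count = struct.unpack_from("<I", buf, block_pos + 4)[0]
    return record_count, block_pos + RECORD_PREFIX_SIZE


# Demo buffer: only the record count from the text is filled in.
demo = bytearray(0xD8)
struct.pack_into("<I", demo, 0xD0 + 4, 0x0000000B)

count, first_record = read_record_block(demo, 0xD0)
print(count, hex(first_record))
```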

To get the length of each record, we consult the offset information of the last field of the table:

sb12

As you can see from the picture above, the offset information is located at 0x8C thru 0x8D, which gives 0x0010. To this we add the data size of the last field, which can be 0x2, 0x4 or 0x8 bytes depending on the field type located at offset 0x8A thru 0x8B (0x000C). In this case the size is 0x02, so the actual record length is 0x12.

Here is the data size of the last field for each field type:

0x03,0x04,0x0C,0x0D,0x14,0x15 -> last field size = 0x2

0x05,0x06,0x0A,0x0F,0x11,0x12,0x13,0x16 -> last field size = 0x4

0x07,0x08,0x0B -> last field size = 0x8
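The size list and the record length rule translate directly into a lookup table. The names are made up, and field types outside the list are rejected because their sizes are not known.

```python
# Field type -> data size in bytes, taken from the list above.
SIZE_TO_TYPES = {
    0x2: (0x03, 0x04, 0x0C, 0x0D, 0x14, 0x15),
    0x4: (0x05, 0x06, 0x0A, 0x0F, 0x11, 0x12, 0x13, 0x16),
    0x8: (0x07, 0x08, 0x0B),
}
FIELD_TYPE_SIZE = {t: size for size, types in SIZE_TO_TYPES.items() for t in types}


def record_length(last_field_offset: int, last_field_type: int) -> int:
    """Record length = offset of the last field + size of the last field."""
    if last_field_type not in FIELD_TYPE_SIZE:
        raise ValueError(f"size of field type {last_field_type:#x} is unknown")
    return last_field_offset + FIELD_TYPE_SIZE[last_field_type]


length = record_length(0x0010, 0x000C)
print(hex(length))
```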

A string field (type 0x0D) inside a detail record is resolved the same way as a field name: it holds a sequence number. For example, let’s view the first detail record of the “locales” table:

sb13

The first field value, 0x000F, belongs to localeid. It is treated as a sequence number and looked up through the CHDR block to get the string data from the CDAT block. In this case, 0x000F is “en”.
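Putting the last steps together, records can be sliced out of the record block and a string field resolved through CHDR and CDAT. This is a sketch under several assumptions: little-endian values, a 2-byte string field at offset 0 of the record, UTF-8 text, and hand-built demo tables in which only sequence 0x0F is populated. The helper names and the record count of 2 are made up for the demo.

```python
import struct


def iter_records(buf: bytes, first_record_pos: int, record_count: int, record_len: int):
    """Yield each fixed-length record as a bytes slice."""
    for i in range(record_count):
        start = first_record_pos + i * record_len
        yield buf[start:start + record_len]


def read_string_field(record: bytes, field_offset: int, chdr_data: bytes, cdat_data: bytes) -> str:
    """Read a 2-byte sequence number and resolve it via CHDR/CDAT."""
    seq = struct.unpack_from("<H", record, field_offset)[0]
    offset, length = struct.unpack_from("<II", chdr_data, seq * 8)
    return cdat_data[offset:offset + length].decode("utf-8")


# Hypothetical tables: only CHDR entry 0x0F is filled in, pointing at "en".
cdat_data = b"\x00en\x00"
chdr_data = bytearray(16 * 8)
struct.pack_into("<II", chdr_data, 0x0F * 8, 1, 2)

# Two made-up records of 0x12 bytes starting at 0xD8; the first begins with 0x000F.
demo = bytes(0xD8) + struct.pack("<H", 0x000F) + bytes(0x10) + bytes(0x12)
records = list(iter_records(demo, 0xD8, 2, 0x12))
first_value = read_string_field(records[0], 0, chdr_data, cdat_data)
print(first_value)
```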

With all of this information revealed, it is possible to traverse the detail records of this format manually, or to write a program to do the task 🙂
