Westwood SHP Format (TD)

From ModdingWiki
Format type: Tileset
Hardware: VGA
Max tile count: 65535, though internal addressing limits the file size to 16 MiB.
Palette: Shared. Technically possible to have it Internal but this is unused.
Tile names? No
Minimum tile size (pixels): 1×1
Maximum tile size (pixels): 65535×65535
Plane count: 1
Plane arrangement: Linear
Transparent pixels? Palette-based
Hitmap pixels? No
Metadata? None
Supports sub-tilesets? No
Compressed tiles? Yes
Hidden data? Yes
Games

The sprite format used in Command & Conquer (also known as "Tiberian Dawn"), Red Alert and Sole Survivor is a collection of compressed 8-bit frames that all have the same dimensions. It can use two different compression algorithms, LCW and XOR Delta, and normally picks whichever gives the smaller result for each frame, though the use of XOR Delta is optional. The format technically supports an internal colour palette and an X and Y offset for the frames, but these options are ignored by the games that use it.

File format

Header

Offset | Data type | Name | Description
0x00 | UINT16LE | Frames | Number of frames in the file.
0x02 | UINT16LE | XPos | X-offset. Should be ignored.
0x04 | UINT16LE | YPos | Y-offset. Should be ignored.
0x06 | UINT16LE | Width | Width of the frames.
0x08 | UINT16LE | Height | Height of the frames.
0x0A | UINT16LE | DeltaSize | Largest buffer size needed to decompress the frames.
0x0C | BYTE[2] | Flags | Extra options. Bit 1 of this technically enables an embedded colour palette, but it is unused in the games.

The XPos and YPos are often set in the headers of original game files as a side effect of Westwood's conversion process, but they are not applied by the games, and should be ignored. An embedded colour palette will most likely also be ignored by the games, since SHP files are typically small sprites drawn on a scene that already has a palette set.
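As a minimal sketch of reading the 14-byte header laid out above, the following Python uses the standard `struct` module. The names `ShpHeader` and `parse_shp_header` are illustrative only and do not come from any existing tool; the Flags field is returned as raw bytes and deliberately left uninterpreted.

```python
import struct
from typing import NamedTuple

# Six little-endian UINT16 fields followed by the two Flags bytes.
HEADER_STRUCT = struct.Struct("<6H2s")


class ShpHeader(NamedTuple):
    frames: int
    x_pos: int       # set by Westwood's converter, ignored by the games
    y_pos: int       # set by Westwood's converter, ignored by the games
    width: int
    height: int
    delta_size: int
    flags: bytes     # raw BYTE[2], not interpreted here


def parse_shp_header(data: bytes) -> ShpHeader:
    """Read the fixed-size header from the start of an SHP file."""
    if len(data) < HEADER_STRUCT.size:
        raise ValueError("data is too short to contain an SHP header")
    return ShpHeader(*HEADER_STRUCT.unpack_from(data, 0))
```

A caller would typically use only `frames`, `width`, `height` and `delta_size`, since the offsets are not applied by the games.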

Frames info table

After the header comes an array of size Frames + 2, with 8-byte entries that each have the following structure:

Offset | Data type | Name | Description
0x00 | UINT24LE | DataOffset | A three-byte integer value giving the offset to the frame's compressed data. Since both the LCW and XOR Delta compressions end their data with specific end markers, no end offset is needed.
0x03 | BYTE | DataFormat | The compression format in which the data at DataOffset is stored. See below.
0x04 | UINT24LE | ReferenceOffset | Contains the referenced DataOffset or frame number in case XOR chaining is used. See below.
0x07 | BYTE | ReferenceFormat | Reference format in case XOR chaining is used. See below.
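The entry layout can be read as sketched below, assuming the table starts directly after the 14-byte header. `FrameEntry` and `parse_frame_table` are hypothetical names chosen for this example. Python's `struct` module has no three-byte integer type, so the UINT24LE fields are read with `int.from_bytes`.

```python
from typing import List, NamedTuple

HEADER_SIZE = 14  # size of the file header described earlier
ENTRY_SIZE = 8    # size of one frames info table entry


class FrameEntry(NamedTuple):
    data_offset: int
    data_format: int
    reference_offset: int
    reference_format: int


def parse_frame_table(data: bytes, frame_count: int) -> List[FrameEntry]:
    """Read all Frames + 2 entries of the frames info table."""
    entries = []
    for i in range(frame_count + 2):
        start = HEADER_SIZE + i * ENTRY_SIZE
        raw = data[start:start + ENTRY_SIZE]
        if len(raw) < ENTRY_SIZE:
            raise ValueError("file ends inside the frames info table")
        entries.append(FrameEntry(
            int.from_bytes(raw[0:3], "little"),  # DataOffset (UINT24LE)
            raw[3],                              # DataFormat
            int.from_bytes(raw[4:7], "little"),  # ReferenceOffset (UINT24LE)
            raw[7],                              # ReferenceFormat
        ))
    return entries
```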

If the palette flag is enabled, this table is followed by a 768-byte array containing a 256-colour 6-bit RGB VGA palette. After that follows the actual data referenced at the DataOffset addresses in the table. Unlike in WSA format, the palette is taken into account in the offsets, so no adjustments are needed on them.
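For files that do carry a palette, the 768-byte block can be read as sketched below. This is an illustration under assumptions: the caller decides whether the palette flag is set, the function name `read_palette` is hypothetical, and the expansion from 6-bit to 8-bit values uses `(v << 2) | (v >> 4)`, which is one common convention and not something the format itself prescribes (a plain `v << 2` is also widely used).

```python
from typing import List, Tuple

HEADER_SIZE = 14
ENTRY_SIZE = 8
PALETTE_SIZE = 768  # 256 colours, 3 bytes each


def read_palette(data: bytes, frame_count: int) -> List[Tuple[int, int, int]]:
    """Read the embedded palette that follows the frames info table.

    Only call this when the palette flag is set in the header.
    """
    start = HEADER_SIZE + (frame_count + 2) * ENTRY_SIZE
    raw = data[start:start + PALETTE_SIZE]
    if len(raw) < PALETTE_SIZE:
        raise ValueError("file ends inside the palette")
    colours = []
    for i in range(0, PALETTE_SIZE, 3):
        # Assumption: mask to 6 bits, then expand so that 63 maps to 255.
        r, g, b = ((v & 0x3F) << 2 | (v & 0x3F) >> 4 for v in raw[i:i + 3])
        colours.append((r, g, b))
    return colours
```

Because the DataOffset values already account for the palette, nothing else in the parsing changes when it is present.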

There are three possible ways in which a frame can be stored. This is how the table entries look for each of these ways:

LCW Frame

The most straightforward frame type is an LCW-compressed frame.

  • DataOffset: points to the LCW data to uncompress to get the frame graphics.
  • DataFormat: set to 0x80, indicating LCW.
  • ReferenceOffset: irrelevant, and left empty.
  • ReferenceFormat: irrelevant, and left empty.

XOR Base Frame

When a frame differs only slightly from a previously saved LCW frame, XOR Delta compression is used to store the instructions for transforming that LCW frame's data into the new frame.

  • DataOffset: points to the XOR data to apply.
  • DataFormat: set to 0x40, indicating XOR Base.
  • ReferenceOffset: Contains the DataOffset of the referenced LCW frame.
  • ReferenceFormat: set to 0x80 (LCW). It is unknown if (and unlikely that) any of the other formats are supported as reference.

XOR Chain Frame

The third case, chained XOR, is a bit peculiar: it is an XOR with the immediately preceding XOR frame. It can only chain from either an XOR Base frame, or another XOR Chain frame.

  • DataOffset: points to the XOR data to apply.
  • DataFormat: set to 0x20, indicating XOR Chain.
  • ReferenceOffset: refers to the frame number of the XOR Base frame at the start of the chain.
  • ReferenceFormat: set to 0x48, indicating XOR Chain Reference.
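The three cases above determine which data blocks must be processed to rebuild a given frame. The sketch below works that out from the table entries without doing any decompression itself. The function name `decode_plan` is hypothetical, entries are plain `(DataOffset, DataFormat, ReferenceOffset, ReferenceFormat)` tuples, and the code assumes that every frame between the XOR Base frame and the requested XOR Chain frame belongs to the same chain.

```python
from typing import List, Sequence, Tuple

FORMAT_LCW = 0x80
FORMAT_XOR_BASE = 0x40
FORMAT_XOR_CHAIN = 0x20

Entry = Tuple[int, int, int, int]


def decode_plan(entries: Sequence[Entry], index: int) -> Tuple[int, List[int]]:
    """Return (offset of LCW data, offsets of XOR data to apply in order)."""
    data_offset, data_format, ref_offset, _ref_format = entries[index]
    if data_format == FORMAT_LCW:
        return data_offset, []
    if data_format == FORMAT_XOR_BASE:
        # ReferenceOffset is the DataOffset of the referenced LCW frame.
        return ref_offset, [data_offset]
    if data_format == FORMAT_XOR_CHAIN:
        # ReferenceOffset is the frame number of the chain's XOR Base frame.
        base_index = ref_offset
        base = entries[base_index]
        if base[1] != FORMAT_XOR_BASE:
            raise ValueError("chain does not start at an XOR Base frame")
        for i in range(base_index + 1, index + 1):
            if entries[i][1] != FORMAT_XOR_CHAIN:
                raise ValueError("chain is interrupted at frame %d" % i)
        xor_offsets = [entries[i][0] for i in range(base_index, index + 1)]
        return base[2], xor_offsets
    raise ValueError("unknown DataFormat 0x%02X" % data_format)
```

The length of the returned list is the number of XOR passes needed for that frame, which is why long chains are more expensive to draw.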

Two final entries

As noted above, the table has two more entries than the number of frames. The first of these two extra entries normally serves as the end point; its DataOffset contains the file length, and all its other values are set to 0. The final entry is normally completely zeroed out.

However, some games contain SHP files where the last entry contains the file size. In that case, the entry before that contains the information for a 'loop frame'. Loop frames are normally not used in SHP format; they were conceived for the WSA format, which, being purely based on XOR Delta compression, needed an extra XOR data entry as a smooth way to transform the final frame into the first one without needing to clear the graphics buffer. If such a loop frame exists in the SHP file, it should be ignored, and the file size should simply be taken from the last entry. It will most likely contain a duplicate of the first frame.
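One way to handle both layouts when reading is sketched below. It assumes that a zeroed-out final entry can be recognised by its DataOffset being 0; the function name `find_file_size` is hypothetical, and entries are `(DataOffset, DataFormat, ReferenceOffset, ReferenceFormat)` tuples.

```python
from typing import Sequence, Tuple

Entry = Tuple[int, int, int, int]


def find_file_size(entries: Sequence[Entry], frame_count: int) -> Tuple[int, bool]:
    """Return (file size from the table, whether a loop frame entry exists)."""
    first_extra = entries[frame_count]
    last_extra = entries[frame_count + 1]
    if last_extra[0] != 0:
        # The last entry holds the file size, so the entry before it
        # describes a loop frame, which should be ignored.
        return last_extra[0], True
    # Normal layout: the first extra entry is the end point.
    return first_extra[0], False
```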

Implementation notes

The basic implementation is simple: compress the first frame as LCW. For every following frame, attempt LCW, XOR Delta against a previous LCW frame, and (if the previous frame is itself an XOR Delta frame) XOR chaining, then store whichever result has the smallest compressed size.
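The selection step can be sketched as follows. The compressed candidates are assumed to have been produced elsewhere by LCW and XOR Delta encoders, which are outside the scope of this example; `choose_encoding` is a hypothetical name, and preferring LCW on a tie is an arbitrary choice made here, not documented behaviour of the original tools.

```python
from typing import Optional, Tuple

FORMAT_LCW = 0x80
FORMAT_XOR_BASE = 0x40
FORMAT_XOR_CHAIN = 0x20


def choose_encoding(
    lcw: bytes,
    xor_base: Optional[bytes] = None,
    xor_chain: Optional[bytes] = None,
) -> Tuple[int, bytes]:
    """Pick the smallest candidate and return (DataFormat, data).

    Pass None for xor_base on the first frame, and None for xor_chain
    whenever the previous frame was not stored as an XOR frame.
    """
    candidates = [(FORMAT_LCW, lcw)]
    if xor_base is not None:
        candidates.append((FORMAT_XOR_BASE, xor_base))
    if xor_chain is not None:
        candidates.append((FORMAT_XOR_CHAIN, xor_chain))
    # min() returns the first smallest item, so earlier candidates win ties.
    return min(candidates, key=lambda c: len(c[1]))
```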

The existing files seem to work with a system of LCW "key frames"; XOR Base frames never refer to LCW frames before the previously-saved LCW frame. Note that after an XOR chain, it is perfectly possible to have another XOR Base frame referring to this last key frame, and multiple XOR chains can occur after a single key frame.

There also seems to be a rule applied to limit the length of XOR chains. Since the games do not sequentially preload SHP files, but instead interpret the data at the moment it is drawn to the screen, long chains can slow down the drawing process, since it needs to read, interpret and alter all chained frames in sequence. For example, the SAM Site SHP file in C&C1, when saved without chain limiting, would result in chains longer than 50 frames.

To limit this, the strategy used appears to be to allow chaining only if the cumulative size of all chained frames after the original XOR frame does not exceed the size of that original XOR frame.
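Read literally, that rule amounts to a simple size check before each chained frame is accepted. The sketch below is one interpretation of it: the function name `may_extend_chain` is hypothetical, and counting the candidate frame itself towards the cumulative size is an assumption, since the observed files only hint at the exact rule.

```python
from typing import Sequence


def may_extend_chain(
    base_size: int,
    chained_sizes: Sequence[int],
    candidate_size: int,
) -> bool:
    """Decide whether one more XOR Chain frame may be appended.

    base_size      -- compressed size of the XOR Base frame starting the chain
    chained_sizes  -- compressed sizes of the XOR Chain frames stored so far
    candidate_size -- compressed size of the frame that would be chained next
    """
    return sum(chained_sizes) + candidate_size <= base_size
```

When the check fails, a saver following this strategy would fall back to the smaller of the LCW and XOR Base candidates for that frame.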

This is just the saving strategy used in the original files, though; technically, the XOR Base frames can refer to any previous LCW frame in the file, and there is no length limit to XOR chains. The strategy seems to be a compromise between compression time, saved size and decompression time.

Tools

The following tools are able to work with files in this format.

Name | Platform | View images in this format? | Convert/export to another file/format? | Import from another file/format? | Access hidden data? | Edit metadata? | Notes
Engie File Converter | Windows | Yes | Yes | Yes | No | N/A | Uses the original algorithms and storage principles used by Westwood Studios, with further optimisations to avoid saving data of duplicate frames.
XCC Mixer | Windows | Yes | Yes | Yes | No | N/A | The de facto standard modding tool in the C&C community, but like most older tools, its LCW compression is not very good, and it does not use XOR Delta when saving this type.
Mix Manager | DOS | Yes | Yes | Yes | No | N/A | The most commonly used modding suite back in the DOS days.
RAMIX | DOS | Yes | Yes | Yes | No | N/A | The spiritual successor of Mix Manager, created when people started digging into Command & Conquer Red Alert.
Red Horizon Utilities | Java (command line) | No | Yes | Yes | No | N/A | Original site is defunct. Backups of the tools can be found here.