Westwood RLE

From ModdingWiki
Jump to navigation Jump to search
Westwood RLE
Format typeCompression algorithm
TypeStream
I/O unit size1-byte
Games

Westwood Studios' RLE compression, also known as "Format 3" because it is usually indicated in file headers as compression method "3", and called "HORIZONTAL" in Westwood's source code[1], is a classic code-based RLE structure that uses codes where the highest bit indicates a Repeat command. For copy commands, the amount to copy is simply the Code value. For repeat commands, the amount to repeat is calculated with 0x100-Code.

Westwood RLE deviates from the classic one-byte copy/repeat code structure by assigning a special function to the normally-useless "copy 0 bytes" command (code 0x00). When this is encountered, the system instead reads the two following bytes as 16-bit integer, and performs a Repeat command using that value and the byte following that extended code.

This extended format suffers from endianness problems, though: on Amiga files, the 16-bit integer is little-endian, while PC files have it as big-endian, meaning files can't be decompressed without knowing their target system. In reality, however, this compression is only used in the CPS format, and the Amiga version of that format differs enough from the PC version to make this issue moot.

In earlier formats, like Westwood CMP, the compression could be applied vertically, because in some cases this resulted in a smaller compressed size. The fact the source code of later games refers to the standard RLE compression as "HORIZONTAL" is a leftover of this.

Example compression / decompression code

C#

This is written as subclass of the C# RLE basis on the main RLE article, overriding the read and write functions, and the maximum repeat and copy values required for the compression process. It integrates the word swap logic for supporting Amiga files.

/// <summary>
/// Westwood RLE implementation:
/// highest code bit set = followed by a single byte to repeat. Amount is (0x100-Code)
/// highest code bit not set = followed by range of non-repeating bytes. Amount to copy and skip is the code value.
/// Code is 0 = read 2 more bytes to get an Int16 amount, and perform a repeat command on the byte following that.
/// </summary>
public class WestwoodRle : RleImplementation<WestwoodRle>
{
    public override UInt32 MaxRepeatValue { get { return UInt16.MaxValue; } }
    public override UInt32 MaxCopyValue { get { return 0x7F; } }

    protected Boolean m_SwapWords;

    public WestwoodRle()
    {
        this.m_SwapWords = false;
    }

    public WestwoodRle(Boolean swapWords)
    {
        this.m_SwapWords = swapWords;
    }

    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode</param>
    /// <param name="startOffset">Start offset in buffer</param>
    /// <param name="endOffset">End offset in buffer</param>
    /// <param name="decompressedSize">The expected size of the decompressed data.</param>
    /// <param name="swapWords">Swaps the bytes of the long-repetition Int16 values, decoding them as little-endian.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return null.</param>
    /// <returns>A byte array of the given output size, filled with the decompressed data.</returns>
    public static Byte[] RleDecode(Byte[] buffer, UInt32? startOffset, UInt32? endOffset, Int32 decompressedSize, Boolean swapWords, Boolean abortOnError)
    {
        WestwoodRle rle = new WestwoodRle(swapWords);
        return rle.RleDecodeData(buffer, startOffset, endOffset, decompressedSize, abortOnError);
    }

    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode</param>
    /// <param name="startOffset">Start offset in buffer</param>
    /// <param name="endOffset">End offset in buffer</param>
    /// <param name="bufferOut">Output array. Determines the maximum that can be decoded.</param>
    /// <param name="swapWords">Swaps the bytes of the long-repetition Int16 values, decoding them as little-endian.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return -1.</param>
    /// <returns>The amount of written bytes in bufferOut</returns>
    public static Int32 RleDecode(Byte[] buffer, UInt32? startOffset, UInt32? endOffset, Byte[] bufferOut, Boolean swapWords, Boolean abortOnError)
    {
        WestwoodRle rle = new WestwoodRle(swapWords);
        return rle.RleDecodeData(buffer, startOffset, endOffset, bufferOut, abortOnError);
    }

    /// <summary>
    /// Applies Run-Length Encoding (RLE) to the given data.
    /// </summary>
    /// <param name="buffer">Input buffer.</param>
    /// <param name="swapWords">Swaps the bytes of the long-repetition Int16 values, encoding them as little-endian.</param>
    /// <returns>The run-length encoded data.</returns>
    public static Byte[] RleEncode(Byte[] buffer, Boolean swapWords)
    {
        WestwoodRle rle = new WestwoodRle(swapWords);
        return rle.RleEncodeData(buffer);
    }

    /// <summary>
    /// Reads a code, determines the repeat / skip command and the amount of bytes to repeat/skip,
    /// and advances the read pointer to the location behind the read code.
    /// </summary>
    /// <param name="buffer">Input buffer.</param>
    /// <param name="inPtr">Input pointer.</param>
    /// <param name="bufferEnd">Exclusive end of buffer; first position that can no longer be read from.</param>
    /// <param name="isRepeat">Returns true for repeat code, false for copy code.</param>
    /// <param name="amount">Returns the amount to copy or repeat.</param>
    /// <returns>True if the read succeeded, false if it failed.</returns>
    protected override Boolean GetCode(Byte[] buffer, ref UInt32 inPtr, ref UInt32 bufferEnd, out Boolean isRepeat, out UInt32 amount)
    {
        if (inPtr >= bufferEnd)
        {
            isRepeat = false;
            amount = 0;
            return false;
        }
        Byte code = buffer[inPtr++];
        isRepeat = ((code & 0x80) != 0 || code == 0);
        if (!isRepeat)
            amount = code;
        else if (code != 0)
            amount = (UInt32)(0x100 - code);
        else
        {
            // Westwood extension for 16-bit repeat values.
            if (inPtr + 2 >= bufferEnd)
            {
                amount = 0;
                return false;
            }
            amount = (UInt32)(m_SwapWords ? buffer[inPtr++] + (buffer[inPtr++] << 8) : (buffer[inPtr++] << 8) + buffer[inPtr++]);
        }
        return true;
    }

    /// <summary>
    /// Writes the copy/skip code to be put before the actual byte(s) to repeat/skip,
    /// and advances the write pointer to the location behind the written code.
    /// </summary>
    /// <param name="bufferOut">Output buffer to write to.</param>
    /// <param name="outPtr">Pointer for the output buffer.</param>
    /// <param name="bufferEnd">Exclusive end of buffer; first position that can no longer be written to.</param>
    /// <param name="forRepeat">True if this is a repeat code, false if this is a copy code.</param>
    /// <param name="amount">Amount to write into the repeat or copy code.</param>
    /// <returns>True if the write succeeded, false if it failed.</returns>
    protected override Boolean WriteCode(Byte[] bufferOut, ref UInt32 outPtr, UInt32 bufferEnd, Boolean forRepeat, UInt32 amount)
    {
        if (outPtr >= bufferEnd)
            return false;
        if (forRepeat)
        {
            if (amount < 0x80)
            {
                if (bufferEnd <= outPtr)
                    return false;
                bufferOut[outPtr++] = (Byte)((0x100 - amount) | 0x80);
            }
            else
            {
                if (bufferOut.Length <= outPtr + 2)
                    return false;
                Byte lenHi = (Byte)((amount >> 8) & 0xFF);
                Byte lenLo = (Byte)(amount & 0xFF);
                bufferOut[outPtr++] = 0;
                bufferOut[outPtr++] = m_SwapWords ? lenLo : lenHi;
                bufferOut[outPtr++] = m_SwapWords ? lenHi : lenLo;
            }
        }
        else
        {
            bufferOut[outPtr++] = (Byte)(amount);
        }
        return true;
    }
}

C

Notice that is actually uses Signed bytes for the command, the other code didn't and I think that was the main issue.

int cps_rle(unsigned char *src, unsigned char *dest, int len, bool swapWord)  // Source data, Dest Data, and source size.
{
 
  s8 com;   // s8 is a signed byte
  s16 count;// signed word or short.
  u8 col, debug;
  int DP, SP, i;
  DP=0;
  SP=0;
 
 
  while( SP< len)
  {
    com = src[SP];
    SP++;
    if (com > 0)  // Copy command
    {
        for(i=0;i<com; i++)
        {
            dest[DP] = src[SP];
            DP++; SP++;
        }
 
    }else
    {
        if(com < 0 ) //Command 2 < fill 2
        {
            count = -com;  // invert the count to make it a positive number.
 
        }else // Other wise, Command 1 Fill 1
        {
            // This fixes beholde1.cps for the amiga EOB1
            if( swapWord == true)
               count = (src[SP+1] * 256 )+ src[SP];
            else
               count = (src[SP] * 256 )+ src[SP+1];
 
            SP +=2;
 
        }
 
        col = src[SP];  // Grab the colour to fill with.
        SP++;
 
        for(i=0;i<count; i++)
        {
            dest[DP] = col;
            DP++;
        }
    }
 
  }
  return DP;  // This should be 64000.
}

References

  1. IFF.H in the Command & Conquer source code: HORIZONTAL, // Run length encoding (RLE).