RLE Compression

From ModdingWiki
(Redirected from Run length encoding)
Jump to navigation Jump to search
RLE Compression
Format typeCompression algorithm
TypeStream
I/O unit size1-byte
GamesN/A

Run-Length Encoding (RLE) is a form of lossless compression, and one of the simplest ways of compressing and storing data. It is used in various forms by games on nearly every platform. The essence of this compression method is to compress data by saving the "run-length" of the encountered values, which boils down to storing the data as a collection of length/value pairs. To take a simple case, the string 'aaaaaaaabbb' could be compressed to '8a3b' (8 bytes of a, 3 bytes of b). The trick with RLE is to make sure the data does not end up becoming larger after processing. For example, using pure length/value pairs, 'abracadabra' would become '1a1b1r1a1c1a1d1a1b1r1a'. Variations were designed to get around this problem.

Most systems also tell the program beforehand the size of the decompressed data, stored as a word or dword at the start of the data block. This is usually done so that the correct amount of memory can be allocated before the data is processed. In modern programs this is generally considered a security risk (it is trivial to write past the end of the allocated buffer), but this was not considered an issue at the time the games were written. Smart implementations can instead abort when going beyond that size, and use it as consistency check to ensure the data is not corrupted.

Different systems

There are three main systems of RLE, divided by how the length and values are specified.

Length/Value

In this system, all data is stored in pairs of values, with one being the value to repeat, and the other the amount of times to repeat that value. The order in which these two are stored varies in different implementations. So using the simple system above, the string '3a4b1c' would be read as 'Length 3 of a, length 4 of b, length 1 of c', and become 'aaabbbbc' when decompressed. This system is the simplest, but, as shown in the "abracadabra" example above, it has the disadvantage that it greatly expands blocks of non-repeating data by adding a repeat value of '1' to each one.

Flag

In the second system a 'flag' is used. This is a special value that tells a program to do something whenever it is encountered. So the string '$4abcd$5e', with '$' as the flag value indicating that a length/value pair is coming up next, could be decompressed to 'aaaabcdeeeee'. Flags may indicate a number of things (Usually a length/value pair) and can be words, bytes or bits (E.g all values over 128 are length bytes, all under value bytes). This is more complex and is less efficient space-wise than the first system, but has the advantage that the data can usually be read from multiple points and an error may not corrupt all following data (since every flag 'resets' the decompression).

A flag-based system should contain a way of storing the literal version of its flag value. Typically, for '$' as flag, this will simply be stored as '$1$', since the value to repeat will never be interpreted as flag. However, some systems may use some form of escaping to prevent such bytes from expanding the data too much. Since the flag itself should be the only value to ever be repeated only once, and repeat values of 0 or 1 have no real purpose, either '$0' or '$1' could indicate such an escaped '$' literal, reducing the stored value from three values to just two. With such a system in place, the data 'aaaa$bcdeeeee' could be compressed to '$4a$0bcd$5e'. However, such escaping seems to be rarely used in actual compression schemes. To solve the inefficiency of having an unused 0-value, though, the read repeat-value is sometimes incremented by 1. If no such escaping or incrementing is done, and a flag followed by 0 simply makes it write the specified value 0 times, this could be used to embed hidden data into the files.

Other variations exist; in algorithms that only collapse the transparent area around a sprite, the transparency value itself can serve both as flag and as the sole possible value to repeat, allowing flag blocks of only two values; the flag and the repeat length. That kind of data can usually be fed straight into a blitter mechanism for drawing it into a screen buffer.

Code

The most commonly used type of RLE is the Code type. A 'Code', also called 'Token', is a value which contains two things: a Command, and a Value. The Command indicates if the following data is a single piece to be repeated or a range to be copied as it is. The Value indicates how many times to repeat the data, or how much data to copy, before reading a new Code behind the processed data. The typical implementation of this system is to use one single byte as Code, where the highest bit is 0 or 1 depending on the Command, and the Value is made from the remaining 7 bits, allowing a Value range of 0 to 127.

Sometimes, the normally-useless codes that would indicate repeating or copying 0 bytes are given special additional functions, like reading additional bytes to get a larger repeat/copy value, or are simply compensated by incrementing the 7-bit Value that is read. This increment is usually 1, but sometimes 2 for the Repeat command, since a repeat of 1 is identical to a copy of 1. As with flag-based RLE, if no such compensation systems are in place, and a command to repeat a value 0 times is accepted, this could be used to embed hidden data into the files.

This type cannot correct itself like the Flag type, but has the advantage that there are no restrictions on the type of data that can be compressed, and no escaping is ever needed, since the decompression will never try to interpret non-code bytes as codes. Its main advantage over Length/Value is that it only needs a single code to process a whole range of uncompressed data.

Compression units

RLE compression can occur at several levels; bits, bytes or words. The used type often depends on what kind of data is being compressed. For example, most tilesets work on the byte level (1 pixel = 1 byte) and so will be compressed at the byte level, while a lot of level formats are made up of byte pairs for tiles, and would compress better at the word level. In most cases, the same compression algorithm is used for multiple file types, and thus byte level is preferred.

Types of RLE

  • Supplemental data in Duke Nukem II map files uses a Code-type RLE which compresses the data per block of four 2-bit values.
  • The Executioners RLE Format is an image format using transparency-collapsing Code-based RLE, which does not have values for its Repeat commands, since it skips bytes instead of repeating them.
  • The Incredible Machine Image Format uses an extended code-based RLE with four different commands, which compresses 4-bit pixel data, but which works with full bytes. Its transparency handling is independent from the stored 4-bit data, meaning it can technically store data containing both transparent area and the full range of 16 colours.
  • Jazz Jackrabbit RLE compression: a Code-type RLE with a specific stop command.
  • Keen 1-3 RLE compression also used by Dangerous Dave and Catacomb: a Code-type RLE which adds 1 to the Value.
  • The image format of King Arthur's K.O.R.T. uses a very simple Length/Value RLE with the run-length stored behind the value.
  • Monster Bash RLE used in .DAT file: a Flag-type RLE.
  • id Software RLEW compression: a Flag-type RLE which reads its data in Word values (two bytes).
  • Treasure Mountain Panning Image Format uses a flag-based RLE for its image compression, which uses a double flag to escape a single occurrence of the flag value itself.
  • The Visage Format used by Mythos Software games uses two types of RLE; a Flag-based one, and a transparency-collapsing one which is a variation of Code-based RLE that alternates between Repeat and Copy commands instead of specifying commands in the data.
  • Westwood LCW is an advanced code-type RLE compression with three extra commands to copy previously-decompressed blocks.
  • Westwood RLE: a Code-type RLE where a Copy-command with a value of 0 bytes serves as a second type of Repeat-command, which reads an additional 16-bit value as amount to repeat.
  • Westwood RLE-Zero is a transparency-collapsing Flag based RLE with the value 0 as both the flag and the only possible value to be repeated.
  • Westwood XOR Delta is technically not a compression format but a way to store the differences between two consecutive graphics frames. The way its data is stored is very similar to code-type RLE, but with a much larger set of commands.

Source code

C#

The following code will decompress/compress raw data using the Code format, with the highest bit in the code byte indicating the command, and the remaining bits indicating the amount to copy or repeat. The class is made so the MaxRepeatValue and MaxCopyValue constants and the GetCode and WriteCode functions can be overridden in subclasses, to easily implement variations on the standard behaviour. An RleCompressionHighBitRepeat and an RleCompressionHighBitCopy class are added as basic implementations of the abstract RleImplementation<T> class.

This type of RLE compression, with the highest bit set for the Repeat code, is the one used in all Sierra/Dynamix games.

This code was written by Nyerguds for the Engie File Converter, and is released under the WTF Public License. It achieves better compression than most other methods, because it only switches from a Copy command to a Repeat command if more than two repeating bytes are found (or if the maximum copy amount is reached). This avoids adding extra Copy command bytes after replacing two repeating bytes by a two-byte Repeat command.

#region Implementation

/// <summary>
/// Basic implementation of Run-Length Encoding with the highest bit set for the Repeat code.
/// The used run length is always (code &amp; 0x7F).
/// </summary>
public class RleCompressionHighBitRepeat : RleImplementation<RleCompressionHighBitRepeat> { }

/// <summary>
/// Basic implementation of Run-Length Encoding with the highest bit set for the Copy code.
/// The used run length is always (code &amp; 0x7F).
/// </summary>
/// <remarks>This uses the original GetCode/WriteCode functions but simply flips their "Repeat" boolean.</remarks>
public class RleCompressionHighBitCopy : RleImplementation<RleCompressionHighBitCopy>
{
    protected override Boolean GetCode(Byte[] buffer, ref UInt32 inPtr, ref UInt32 bufferEnd, out Boolean isRepeat, out UInt32 amount)
    {
        Boolean success = base.GetCode(buffer, ref inPtr, ref bufferEnd, out isRepeat, out amount);
        isRepeat = !isRepeat;
        return success;
    }

    protected override Boolean WriteCode(Byte[] bufferOut, ref UInt32 outPtr, UInt32 bufferEnd, Boolean forRepeat, UInt32 amount)
    {
        return base.WriteCode(bufferOut, ref outPtr, bufferEnd, !forRepeat, amount);
    }
}
#endregion

#region Main RLE class

/// <summary>
/// Basic code-based (or token-based) run-length encoding algorithm.
/// Written by Maarten Meuris, aka Nyerguds.
/// This class allows easy overriding of the code to read and write codes, to
/// allow flexibility in subclassing the system for different RLE implementations.
/// </summary>
/// <typeparam name="T">
/// The implementing class. This trick allows access to the internal type and its constructor from static functions
/// in the superclass, giving the subclasses access to static functions that still use the specific subclass behaviour.
/// </typeparam>
public abstract class RleImplementation<T> where T : RleImplementation<T>, new()
{
    #region overridables to tweak in subclasses
    /// <summary>Maximum amount of repeating bytes that can be stored in one code.</summary>
    protected virtual UInt32 MaxRepeatValue { get { return 0x7F; } }
    /// <summary>Maximum amount of copied bytes that can be stored in one code.</summary>
    protected virtual UInt32 MaxCopyValue { get { return 0x7F; } }

    /// <summary>Worst case output buffer size for compressed content, calculated from input data size.</summary>
    protected virtual UInt32 CompressionWorstCase(UInt32 inputSize)
    {
        // Worst-case for this function is probably alternating blocks of 1 non-repeating and 3 repeating,
        // which would expand 4 bytes to 6; 3/2, or 150%. Just to be safe, the buffer is set to 7/4, or 175%.
        // This technically depends on the WriteCode implementation, but in general this should be okay.
        return inputSize * 7 / 4;
    }

    /// <summary>
    /// Reads a code, determines the repeat / copy command and the amount of bytes to repeat / copy,
    /// and advances the read pointer to the location behind the read code.
    /// </summary>
    /// <param name="buffer">Input buffer.</param>
    /// <param name="inPtr">Input pointer.</param>
    /// <param name="bufferEnd">Exclusive end of buffer; first position that can no longer be read from. Writable because some compression types might want to force the process to end depending on the code.</param>
    /// <param name="isRepeat">Returns true for repeat code, false for copy code.</param>
    /// <param name="amount">Returns the amount to copy or repeat.</param>
    /// <returns>True if the read succeeded, false if it failed.</returns>
    protected virtual Boolean GetCode(Byte[] buffer, ref UInt32 inPtr, ref UInt32 bufferEnd, out Boolean isRepeat, out UInt32 amount)
    {
        if (inPtr >= bufferEnd)
        {
            isRepeat = false;
            amount = 0;
            return false;
        }
        Byte code = buffer[inPtr++];
        isRepeat = (code & 0x80) != 0;
        amount = (UInt32)(code & 0x7f);
        return true;
    }

    /// <summary>
    /// Writes the repeat / copy code to be put before the actual byte(s) to repeat / copy,
    /// and advances the write pointer to the location behind the written code.
    /// </summary>
    /// <param name="bufferOut">Output buffer to write to.</param>
    /// <param name="outPtr">Pointer for the output buffer.</param>
    /// <param name="bufferEnd">Exclusive end of buffer; first position that can no longer be written to.</param>
    /// <param name="forRepeat">True if this is a repeat code, false if this is a copy code.</param>
    /// <param name="amount">Amount to write into the repeat or copy code.</param>
    /// <returns>True if the write succeeded, false if it failed.</returns>
    protected virtual Boolean WriteCode(Byte[] bufferOut, ref UInt32 outPtr, UInt32 bufferEnd, Boolean forRepeat, UInt32 amount)
    {
        if (outPtr >= bufferEnd)
            return false;
        if (forRepeat)
            bufferOut[outPtr++] = (Byte)(amount | 0x80);
        else
            bufferOut[outPtr++] = (Byte)(amount);
        return true;
    }
    #endregion

    #region static functions
    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return null.</param>
    /// <returns>A byte array of the given output size, filled with the decompressed data.</returns>
    public static Byte[] RleDecode(Byte[] buffer, Boolean abortOnError)
    {
        T rle = new T();
        Byte[] bufferOut = null;
        rle.RleDecodeData(buffer, null, null, ref bufferOut, abortOnError);
        return bufferOut;
    }

    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode.</param>
    /// <param name="startOffset">Start offset in buffer.</param>
    /// <param name="endOffset">End offset in buffer.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return null.</param>
    /// <returns>A byte array of the given output size, filled with the decompressed data.</returns>
    public static Byte[] RleDecode(Byte[] buffer, UInt32? startOffset, UInt32? endOffset, Boolean abortOnError)
    {
        T rle = new T();
        Byte[] bufferOut = null;
        rle.RleDecodeData(buffer, startOffset, endOffset, ref bufferOut, abortOnError);
        return bufferOut;
    }

    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode.</param>
    /// <param name="startOffset">Start offset in buffer.</param>
    /// <param name="endOffset">End offset in buffer.</param>
    /// <param name="decompressedSize">The expected size of the decompressed data.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return null.</param>
    /// <returns>A byte array of the given output size, filled with the decompressed data.</returns>
    public static Byte[] RleDecode(Byte[] buffer, UInt32? startOffset, UInt32? endOffset, Int32 decompressedSize, Boolean abortOnError)
    {
        T rle = new T();
        return rle.RleDecodeData(buffer, startOffset, endOffset, decompressedSize, abortOnError);
    }

    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode.</param>
    /// <param name="startOffset">Start offset in buffer.</param>
    /// <param name="endOffset">End offset in buffer.</param>
    /// <param name="bufferOut">Output array. Determines the maximum that can be decoded. If the given object is null it will be filled automatically.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return null.</param>
    /// <returns>The amount of written bytes in bufferOut.</returns>
    public static Int32 RleDecode(Byte[] buffer, UInt32? startOffset, UInt32? endOffset, ref Byte[] bufferOut, Boolean abortOnError)
    {
        T rle = new T();
        return rle.RleDecodeData(buffer, startOffset, endOffset, ref bufferOut, abortOnError);
    }

    /// <summary>
    /// Applies Run-Length Encoding (RLE) to the given data.
    /// </summary>
    /// <param name="buffer">Input buffer.</param>
    /// <returns>The run-length encoded data.</returns>
    public static Byte[] RleEncode(Byte[] buffer)
    {
        T rle = new T();
        return rle.RleEncodeData(buffer);
    }
    #endregion

    #region public functions
    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode.</param>
    /// <param name="startOffset">Inclusive start offset in buffer. Defaults to 0.</param>
    /// <param name="endOffset">Exclusive end offset in buffer. Defaults to the buffer length.</param>
    /// <param name="decompressedSize">The expected size of the decompressed data.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return null.</param>
    /// <returns>A byte array of the given output size, filled with the decompressed data, or null if abortOnError is enabled and an empty command was found.</returns>
    public Byte[] RleDecodeData(Byte[] buffer, UInt32? startOffset, UInt32? endOffset, Int32 decompressedSize, Boolean abortOnError)
    {
        if (buffer == null)
            throw new ArgumentNullException("buffer");
        Byte[] outputBuffer = new Byte[decompressedSize];
        Int32 result = this.RleDecodeData(buffer, startOffset, endOffset, ref outputBuffer, abortOnError);
        if (result == -1)
            return null;
        return outputBuffer;
    }

    /// <summary>
    /// Decodes RLE-encoded data.
    /// </summary>
    /// <param name="buffer">Buffer to decode.</param>
    /// <param name="startOffset">Inclusive start offset in buffer. Defaults to 0.</param>
    /// <param name="endOffset">Exclusive end offset in buffer. Defaults to the buffer length.</param>
    /// <param name="bufferOut">Output array. Determines the maximum that can be decoded.</param>
    /// <param name="abortOnError">If true, any found command with amount "0" in it will cause the process to abort and return -1.</param>
    /// <returns>The amount of written bytes in bufferOut.</returns>
    public Int32 RleDecodeData(Byte[] buffer, UInt32? startOffset, UInt32? endOffset, ref Byte[] bufferOut, Boolean abortOnError)
    {
        if (buffer == null)
            throw new ArgumentNullException("buffer");
        UInt32 inPtr = startOffset ?? 0;
        UInt32 inPtrEnd = endOffset.HasValue ? Math.Min(endOffset.Value, (UInt32)buffer.Length) : (UInt32)buffer.Length;

        UInt32 outPtr = 0;
        Boolean autoExpand = bufferOut == null;
        UInt32 bufLenOrig = inPtrEnd - inPtr;
        if (autoExpand)
            bufferOut = new Byte[bufLenOrig * 4];
        UInt32 maxOutLen = autoExpand? UInt32.MaxValue : (UInt32)bufferOut.Length;
        Boolean error = false;

        while (inPtr < inPtrEnd && outPtr < maxOutLen)
        {
            // get next code
            UInt32 run;
            Boolean repeat;
            if (!this.GetCode(buffer, ref inPtr, ref inPtrEnd, out repeat, out run) || (run == 0 && abortOnError))
            {
                error = true;
                break;
            }
            //End ptr after run
            UInt32 runEnd = Math.Min(outPtr + run, maxOutLen);
            if (autoExpand && runEnd > bufferOut.Length)
                bufferOut = ExpandBuffer(bufferOut, Math.Max(bufLenOrig, runEnd));
            // Repeat run
            if (repeat)
            {
                if (inPtr >= inPtrEnd)
                    break;
                Int32 repeatVal = buffer[inPtr++];
                for (; outPtr < runEnd; ++outPtr)
                    bufferOut[outPtr] = (Byte)repeatVal;
                if (outPtr == maxOutLen)
                    break;
            }
            // Raw copy
            else
            {
                Boolean abort = false;
                for (; outPtr < runEnd; ++outPtr)
                {
                    if (inPtr >= inPtrEnd)
                    {
                        abort = true;
                        break;
                    }
                    Int32 data = buffer[inPtr++];
                    bufferOut[outPtr] = (Byte)data;
                }
                if (abort)
                    break;
                if (outPtr == maxOutLen)
                    break;
            }
        }
        if (error)
            return -1;
        if (autoExpand)
        {
            Byte[] newBuf = new Byte[outPtr];
            Array.Copy(bufferOut, 0, newBuf, 0, outPtr);
            bufferOut = newBuf;
        }
        return (Int32)outPtr;
    }

    /// <summary>
    /// Applies Run-Length Encoding (RLE) to the given data.
    /// </summary>
    /// <remarks>
    /// This function achieves better compression than most other methods, because it only switches from a Copy command
    /// to a Repeat command if more than two repeating bytes are found (or if the maximum copy amount is reached).
    /// This avoids adding extra Copy command bytes after replacing two repeating bytes by a two-byte Repeat command.
    /// Written by Maarten Meuris, aka Nyerguds.
    /// </remarks>
    /// <param name="buffer">Input buffer.</param>
    /// <returns>The run-length encoded data.</returns>
    public Byte[] RleEncodeData(Byte[] buffer)
    {
        if (buffer == null)
            throw new ArgumentNullException("buffer");
        return this.RleEncodeData(buffer, 0, (UInt32)buffer.Length);
    }

    /// <summary>
    /// Applies Run-Length Encoding (RLE) to the given data.
    /// </summary>
    /// <remarks>
    /// This function achieves better compression than most other methods, because it only switches from a Copy command
    /// to a Repeat command if more than two repeating bytes are found (or if the maximum copy amount is reached).
    /// This avoids adding extra Copy command bytes after replacing two repeating bytes by a two-byte Repeat command.
    /// Written by Maarten Meuris, aka Nyerguds.
    /// </remarks>
    /// <param name="buffer">Input buffer.</param>
    /// <param name="dataStart">Start of the data inside the buffer.</param>
    /// <param name="dataEnd">End of the data inside the buffer.</param>
    /// <returns>The run-length encoded data.</returns>
    public Byte[] RleEncodeData(Byte[] buffer, UInt32 dataStart, UInt32 dataEnd)
    {
        if (buffer == null)
            throw new ArgumentNullException("buffer");
        UInt32 dataLen = (UInt32)buffer.Length;
        UInt32 inPtr = Math.Min(dataLen, dataStart);
        UInt32 outPtr = 0;
        // Retrieve these in advance to avoid extra calls to getters.
        // These are made customizable because some implementations support larger codes. Technically
        // neither run-length 0 nor 1 are useful for repeat codes (0 should not exist, 1 is identical to copy),
        // so the values are often decremented to allow storing one or two more bytes.
        // Some implementations also use these values as indicators for reading a larger value to repeat or copy.
        UInt32 maxRepeat = this.MaxRepeatValue;
        UInt32 maxCopy = this.MaxCopyValue;
        UInt32 len = Math.Min(dataLen, dataEnd);
        // This code does not do sanity checks, since some file formats can't disable their compression.
        UInt32 bufLen = this.CompressionWorstCase(len);
        Byte[] bufferOut = new Byte[bufLen];
        UInt32 detectedRepeat = 0;
        while (inPtr < len)
        {
            // Handle 2 cases: repeat was already detected, or a new repeat detect needs to be done.
            if (detectedRepeat >= 2 || (detectedRepeat = RepeatingAhead(buffer, len, inPtr, 2)) == 2)
            {
                // Found more than 2 bytes. Worth compressing. Apply run-length encoding.
                UInt32 start = inPtr;
                UInt32 end = Math.Min(inPtr + maxRepeat, len);
                Byte cur = buffer[inPtr];
                // Already checked these in the RepeatingAhead function.
                inPtr += detectedRepeat;
                // Increase inptr to the last repeated.
                for (; inPtr < end && buffer[inPtr] == cur; ++inPtr) { }
                // WriteCode is split off into a function to allow overriding it in specific implementations.
                // After the code is written, test if there is still a byte free to add the value to repeat. Shouldn't happen, but better be sure.
                if (!this.WriteCode(bufferOut, ref outPtr, bufLen, true, (inPtr - start)) || outPtr + 1 >= bufLen)
                    break;
                // Add value to repeat
                bufferOut[outPtr++] = cur;
                // Reset for next run
                detectedRepeat = 0;
            }
            else
            {
                Boolean abort = false;
                // if detectedRepeat is not greater than 1 after writing a code,
                // that means the maximum copy length was reached. Keep repeating
                // until the copy is aborted for a repeat.
                while (detectedRepeat == 1 && inPtr < len)
                {
                    UInt32 start = inPtr;
                    // Normal non-repeat detection logic.
                    UInt32 end = Math.Min(inPtr + maxCopy, len);
                    UInt32 maxend = inPtr + maxCopy;
                    inPtr += detectedRepeat;
                    while (inPtr < end)
                    {
                        // detected bytes to compress after this one: abort.
                        detectedRepeat = RepeatingAhead(buffer, len, inPtr, 3);
                        // Only switch to Repeat when finding three repeated bytes: if the data
                        // behind a repeat of two is non-repeating, it adds an extra Copy command.
                        if (detectedRepeat == 3)
                            break;
                        // Optimise: apply a 1-byte or 2-byte skip to ptr right away.
                        inPtr += detectedRepeat;
                        // A detected repeat of two could make it go beyond the maximum accepted number of
                        // stored bytes per code. This fixes that. These repeating bytes are always saved as
                        // Repeat code, since a new command needs to be added after ending this one anyway.
                        // If you'd use the copy max amount instead, the 2-repeat would be cut in two Copy
                        // commands, wasting one byte if another repeating range would start after it.
                        if (inPtr > maxend)
                        {
                            inPtr -= detectedRepeat;
                            break;
                        }
                    }
                    UInt32 amount = inPtr - start;
                    if (amount == 0)
                    {
                        abort = true;
                        break;
                    }
                    // Need to reset this if the copy commands aborts for full size, so a last-detected
                    // repeat value of 2 at the end of a copy range isn't propagated to a new repeat command.
                    if (amount == maxCopy)
                        detectedRepeat = 0;
                    // WriteCode is split off into a function to allow overriding it in specific implementations.
                    // After the code is written, test if there is still space to add the values to copy. Shouldn't happen, but better be sure.
                    abort = !this.WriteCode(bufferOut, ref outPtr, bufLen, false, amount) || outPtr + amount >= bufLen;
                    if (abort)
                        break;
                    // Add values to copy
                    for (UInt32 i = start; i < inPtr; ++i)
                        bufferOut[outPtr++] = buffer[i];
                }
                if (abort)
                    break;
            }
        }
        Byte[] finalOut = new Byte[outPtr];
        Array.Copy(bufferOut, 0, finalOut, 0, outPtr);
        return finalOut;
    }
    #endregion

    #region internal tools

    private static Byte[] ExpandBuffer(Byte[] bufferOut, UInt32 expandSize)
    {
        Byte[] newBuf = new Byte[bufferOut.Length + expandSize];
        Array.Copy(bufferOut, 0, newBuf, 0, bufferOut.Length);
        return newBuf;
    }

    /// <summary>
    /// Checks if there are enough repeating bytes ahead.
    /// </summary>
    /// <param name="buffer">Input buffer.</param>
    /// <param name="max">The maximum offset to read inside the buffer.</param>
    /// <param name="ptr">The current read offset inside the buffer.</param>
    /// <param name="minAmount">Minimum amount of repeating bytes to search for.</param>
    /// <returns>The amount of detected repeating bytes.</returns>
    protected static UInt32 RepeatingAhead(Byte[] buffer, UInt32 max, UInt32 ptr, UInt32 minAmount)
    {
        Byte cur = buffer[ptr];
        for (UInt32 i = 1; i < minAmount; ++i)
            if (ptr + i >= max || buffer[ptr + i] != cur)
                return i;
        return minAmount;
    }
    #endregion
}

#endregion