MegaTech VOL Format

Format type: Archive
Max files: Unlimited
File Allocation Table (FAT): Beginning
Filenames? No
Metadata? None
Supports compression? No
Supports encryption? No
Supports subdirectories? No
Hidden data? No
Games

The MegaTech VOL Format is an archive used by Cobra Mission to store all the game data. The following .VOL files are included with the game:

File format

There is no header. The format is simply a list of offsets, repeated until the file data begins:

Data type Name Description
UINT32LE offset Offset of this file, relative to start of archive file

Usually the above structure is repeated until the list is 256 bytes long (with unused offsets set to zero), but it can be shorter (the list in pcm6.vol is 64 bytes long).

The last entry in the list is the file size, followed by zero offsets as needed to pad the list to the required length.
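
As a minimal sketch of the layout described above, the offset list could be read as follows. The function names are hypothetical, and the code assumes that the first offset marks the end of the list (since the first file's data begins right after it) and that the first zero entry starts the padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value at byte position pos.
static uint32_t read_u32le(const std::vector<uint8_t>& d, size_t pos)
{
    return static_cast<uint32_t>(d[pos]) |
           (static_cast<uint32_t>(d[pos + 1]) << 8) |
           (static_cast<uint32_t>(d[pos + 2]) << 16) |
           (static_cast<uint32_t>(d[pos + 3]) << 24);
}

// Returns the non-zero entries of the offset list. The last element is the
// archive size, so file i occupies the bytes [offsets[i], offsets[i + 1]).
// Assumption: the first offset also marks the end of the offset list.
std::vector<uint32_t> read_vol_offsets(const std::vector<uint8_t>& vol)
{
    std::vector<uint32_t> offsets;
    if (vol.size() < 4) return offsets;
    size_t table_end = read_u32le(vol, 0);
    if (table_end > vol.size()) table_end = vol.size();
    for (size_t pos = 0; pos + 4 <= table_end; pos += 4) {
        uint32_t off = read_u32le(vol, pos);
        if (off == 0) break;  // zero padding after the file size entry
        offsets.push_back(off);
    }
    return offsets;
}
```

With this, the number of files in the archive is one less than the number of returned entries.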

! Will the game accept files that lack this padding?

Instances

Graphics VOL files

  • CUT1.VOL
  • CUT2.VOL
  • CUT3.VOL
  • CUTA.VOL
  • ENM.VOL
  • ENMA.VOL
  • MAP.VOL
  • OPENING.VOL
  • PIC1.VOL
  • PIC2.VOL
  • PIC3.VOL
  • PICA.VOL

The listed VOL files contain the graphics for the game.

  • OPENING.VOL contains the graphics for the intro sequence
  • MAP.VOL contains the graphics for the inventory map items.

! Below are assumptions on the other files

  • CUT1 - CUTA : Cutscenes
  • ENM - ENMA : Enemy graphics
  • PIC1 - PICA : Inventory items (magazines and photos)

These VOL files contain graphics chunks with the following structure

GC Entry

  • 00 - 01 : UINT16LE signature 47 43 "GC"
  • 02 - 03 : UINT16LE Version 02.00
  • 04 - 05 : UINT16LE Flag that indicates whether a palette is present.
  • 06 - 07 : UINT16LE Points to the location of the sub-chunk table.
  • 08 - 0B : UINT32LE Number of sub-chunks in the entry. Very often only 1, but sometimes more.
  • 0C - 0D : UINT16LE Size of the chunk.
  • 0E : UINT8 Padding (always 0)
  • 0F : UINT8 Checksum; add up all the bytes from 00 to 0E and place the 2's complement of that sum here. To check, add all bytes from 00 to 0F and the result of that should be 0.
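
The checksum rule for byte 0F can be sketched as below. The function names are hypothetical; the arithmetic follows the description above (8-bit sum of bytes 00 to 0E, then two's complement).

```cpp
#include <cstdint>

// Computes the value that belongs in byte 0x0F: the two's complement of the
// 8-bit sum of header bytes 0x00..0x0E.
uint8_t gc_header_checksum(const uint8_t header[16])
{
    uint8_t sum = 0;
    for (int i = 0; i < 15; ++i)
        sum = static_cast<uint8_t>(sum + header[i]);
    return static_cast<uint8_t>(0 - sum);
}

// Verifies a header: the 8-bit sum of all 16 bytes must be zero.
bool gc_header_valid(const uint8_t header[16])
{
    uint8_t sum = 0;
    for (int i = 0; i < 16; ++i)
        sum = static_cast<uint8_t>(sum + header[i]);
    return sum == 0;
}
```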

If the palette bit is set in the header, a palette will follow.

  • 10 - 2F : The colors are stored as an array[16] of UINT16LE. See the MCG.VOL section for example code on how to read the palette.

After the palette there's a subchunk table with N+1 values, where N is the number of subchunks. Each chunk runs from offset[X] to offset[X+1]. For a single chunk file, this has two values.

  • (10 or 30) - xx : Offset pointers in the chunk stored as UINT32LE. The amount is indicated by the number of sub-chunks, plus one.
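
A rough sketch of turning the sub-chunk table into ranges is shown here. The helper names are hypothetical. The code takes the table position from header bytes 06 - 07 and returns the offset values as stored, without assuming what they are relative to.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

static uint32_t gc_u16(const std::vector<uint8_t>& d, size_t p)
{
    return static_cast<uint32_t>(d[p]) | (static_cast<uint32_t>(d[p + 1]) << 8);
}

static uint32_t gc_u32(const std::vector<uint8_t>& d, size_t p)
{
    return gc_u16(d, p) | (gc_u16(d, p + 2) << 16);
}

// Returns one (start, end) pair per sub-chunk, read from the N+1 offsets
// in the sub-chunk table. Returns an empty list if the entry is too short.
std::vector<std::pair<uint32_t, uint32_t>>
gc_subchunk_ranges(const std::vector<uint8_t>& gc)
{
    std::vector<std::pair<uint32_t, uint32_t>> ranges;
    if (gc.size() < 16) return ranges;
    size_t table = gc_u16(gc, 0x06);      // location of the sub-chunk table
    uint32_t count = gc_u32(gc, 0x08);    // number of sub-chunks (N)
    if (table > gc.size()) return ranges;
    if ((gc.size() - table) / 4 < static_cast<uint64_t>(count) + 1) return ranges;
    for (uint32_t i = 0; i < count; ++i) {
        size_t p = table + 4 * static_cast<size_t>(i);
        ranges.emplace_back(gc_u32(gc, p), gc_u32(gc, p + 4));
    }
    return ranges;
}
```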
GC data chunks

The GC data chunks have their own 10-byte header. Preliminary documentation below (not validated yet!):

  • 00 : UINT8 Always 0xA4
  • 01 : UINT8 Probably a checksum.
  • 02 : UINT8 X offset to display at
  • 03 : UINT8 Y offset to display at
  • 04 : UINT8 Width of contained image
  • 05 : UINT8 Height of contained image
  • 06 - 07 : UINT16LE Subchunk size.
  • 08 - 09 : UINT16LE Always 0?

To follow along, there is a full working implementation at https://github.com/dascandy/cobra .

The data is stored compressed in a style that looks like LZ77 (implicit reference table building) mixed with LZ78 (explicit table building) and Huffman coding. It is a very messy format; it is very easy to get a detail in the implementation wrong.

For the decoder we will need two intermediate buffers for decoding, one called the "bit buffer" of 16 bits, and one called the "nibble buffer" of two 4-bit nibbles. We also need a third 1024-byte buffer to hold up to 256 4-byte samples, to be referenced.

To start, the decoder loads the bit buffer with 16 bits. Every time the last bit is removed from the bit buffer, it loads the next two bytes in the input stream as a 16-bit big endian number into the bit buffer. Bits are removed from the high end one at a time, and are removed until a Huffman code is matched.
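
The bit buffer behaviour described above might look like this in code. This is only a sketch: the class name is hypothetical, and reads past the end of the input are assumed to yield zero bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit reader: a 16-bit buffer that is refilled eagerly, i.e.
// as soon as its last bit has been removed.
class BitReader {
public:
    explicit BitReader(const std::vector<uint8_t>& in, size_t pos = 0)
        : in_(in), pos_(pos)
    {
        refill();  // the decoder starts with a full bit buffer
    }

    // Removes and returns the highest bit of the buffer.
    int next_bit()
    {
        int bit = (buffer_ >> 15) & 1;
        buffer_ = static_cast<uint16_t>(buffer_ << 1);
        if (--bits_left_ == 0) refill();
        return bit;
    }

    // Input position after the bytes consumed so far.
    size_t position() const { return pos_; }

private:
    void refill()
    {
        // Assumption: missing bytes at the end of the input read as zero.
        uint8_t hi = pos_ < in_.size() ? in_[pos_] : 0;
        uint8_t lo = pos_ + 1 < in_.size() ? in_[pos_ + 1] : 0;
        buffer_ = static_cast<uint16_t>((hi << 8) | lo);  // big endian
        pos_ += 2;
        bits_left_ = 16;
    }

    const std::vector<uint8_t>& in_;
    size_t pos_;
    uint16_t buffer_ = 0;
    int bits_left_ = 0;
};
```

Because the refill is eager, the two bytes following the current 16 bits are consumed before any other read from the input takes place, which matters when bit reads are interleaved with byte reads.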

The game uses the following Huffman table:

Huffman table
  • 00 : Copy from back
  • 01 : Copy with Skip Table
  • 10 : Skip single output
  • 110 : Copy and store
  • 1110 : Copy with Move Table
  • 1111 : Copy from backing store
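
Matching the codes in the table above can be sketched as a small decision tree. The enum and function names are hypothetical; next_bit stands for whatever supplies the next bit from the bit buffer.

```cpp
#include <functional>

enum class Command {
    CopyFromBack,          // 00
    CopyWithSkipTable,     // 01
    SkipSingleOutput,      // 10
    CopyAndStore,          // 110
    CopyWithMoveTable,     // 1110
    CopyFromBackingStore   // 1111
};

// Pulls bits one at a time until one of the six codes is matched.
Command read_command(const std::function<int()>& next_bit)
{
    if (next_bit() == 0)
        return next_bit() == 0 ? Command::CopyFromBack
                               : Command::CopyWithSkipTable;
    if (next_bit() == 0) return Command::SkipSingleOutput;
    if (next_bit() == 0) return Command::CopyAndStore;
    return next_bit() == 0 ? Command::CopyWithMoveTable
                           : Command::CopyFromBackingStore;
}
```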

The one simple command in this table is "Skip single output": it moves the output pointer 4 bytes ahead without changing the output bytes.

Nibble buffer

The nibble buffer is loaded only when a value is requested and the buffer is empty. This differs from the bit buffer, which is loaded eagerly. After loading a byte, the low nibble is returned first, and the next call returns the high nibble.
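
A lazy nibble reader along these lines could look as follows. The class name is hypothetical, the input position is shared with the caller by reference so the lazy loading is visible, and reads past the end of the input are assumed to yield zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical nibble reader: a byte is fetched only when a nibble is
// requested and none is pending.
class NibbleReader {
public:
    NibbleReader(const std::vector<uint8_t>& in, size_t& pos)
        : in_(in), pos_(pos) {}

    uint8_t next_nibble()
    {
        if (!high_pending_) {
            // Assumption: reading past the end of the input yields zero.
            current_ = pos_ < in_.size() ? in_[pos_] : 0;
            ++pos_;
            high_pending_ = true;
            return static_cast<uint8_t>(current_ & 0x0F);  // low nibble first
        }
        high_pending_ = false;
        return static_cast<uint8_t>(current_ >> 4);        // then high nibble
    }

private:
    const std::vector<uint8_t>& in_;
    size_t& pos_;
    uint8_t current_ = 0;
    bool high_pending_ = false;
};
```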

Backing store

The backing store holds up to 256 entries of 4 bytes each. The table starts empty, and entries are placed from the start with an automatically increasing storage ID - the first one goes in slot 0, the second in slot 1 and so on. They are copied from the backing store a full entry at a time.

The "Copy and store" command takes 4 bytes from the input stream and copies them into the next entry in the backing store, and also copies them to the output buffer.

The "Copy from backing store" command takes a single byte from the input stream, and copies that entry from the backing store into the output buffer.
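
The backing store and its two commands can be sketched like this. The names are hypothetical, and what the game does once all 256 slots are in use is not documented here; this sketch simply ignores further stores.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Entry = std::array<uint8_t, 4>;

// Hypothetical backing store with 256 slots of 4 bytes each.
class BackingStore {
public:
    // Used by "Copy and store": keep the entry in the next free slot.
    // Assumption: stores beyond slot 255 are ignored.
    void store(const Entry& e)
    {
        if (count_ < slots_.size()) slots_[count_++] = e;
    }

    // Used by "Copy from backing store": the input byte selects the slot.
    const Entry& fetch(uint8_t id) const { return slots_[id]; }

    size_t size() const { return count_; }

private:
    std::array<Entry, 256> slots_{};
    size_t count_ = 0;
};
```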

Copy from back

Most of the game's graphics refer to output previously generated. A command can refer back to the previous line written or to earlier output in the current line. The copy from back command takes a nibble from the nibble buffer to decide which operation to perform. The nibble's value falls into one of three regions, all of which use the same offset lookup table: { -1, -2, -4, -8, 1, 0 }.

  • Below 4: It copies a single entry from the output buffer to the output buffer. It gets the offset from the offset lookup table.
  • Between 4 and 9: It loads a nibble to find out how many entries to copy. It copies the loaded nibble count plus two, and it indexes the offset table as in the previous case, using the original value minus 4.
  • 10 and above: It sets the number of entries to copy to the remaining entries in the line. It also uses the offset table to find where to copy from.

Offsets of 1 and 0 are treated specially; they're more of a marker value. If the offset is 1, it copies from the adjacent pixel in the previous line instead. If the offset is 0, it skips ahead without writing new output.

Copy Skip

This command loads an additional nibble of data V to determine what exactly to do.

If V is 0, it reads yet another nibble, and outputs a sequence of 00 and FF bytes depending on the bits set in the nibble - from low bit to high bit, 0xFF for a 1 bit and 0x00 for a 0 bit.

If V is 15, it copies the corresponding entry from the previous line on the screen.

For other values of V, it treats it as a bitfield starting from the right. For each set bit, it copies a byte from the input to the output, and for each clear bit it skips ahead a byte in the output.
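
The V = 0 case and the bitfield case can be sketched as two small helpers. The function names are hypothetical, and the sketch assumes that the lowest bit corresponds to the first byte of the 4-byte output entry.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// V == 0: expand the following nibble into four bytes, low bit first,
// 0xFF for a set bit and 0x00 for a clear bit.
std::array<uint8_t, 4> expand_mask_nibble(uint8_t nibble)
{
    std::array<uint8_t, 4> out{};
    for (int bit = 0; bit < 4; ++bit)
        out[bit] = ((nibble >> bit) & 1) ? 0xFF : 0x00;
    return out;
}

// Other values of V: for each set bit take the next input byte, for each
// clear bit leave the existing output byte untouched.
// Returns the number of input bytes consumed.
size_t copy_skip_bitfield(uint8_t v, const uint8_t* input,
                          std::array<uint8_t, 4>& entry)
{
    size_t used = 0;
    for (int bit = 0; bit < 4; ++bit)
        if ((v >> bit) & 1) entry[bit] = input[used++];
    return used;
}
```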

Copy Move

This command loads an additional nibble of data V to determine what exactly to do.

If V is 0, it reads a whole byte of input X. It uses the bottom 6 bits to determine how many entries to copy, increased by 18. The top two bits (shifted right by 6) are used as an index into the offset table mentioned above to find the offset from where to copy.

If V is 15, it reads a whole byte of input X. It uses the bottom 6 bits to determine how many entries to target, increased by 18. The top two bits decide whether values are copied: if they are not zero, the command only skips ahead; if they are zero, it copies that many values from the previous line into this line.

For other values of V, it treats it as a bitfield starting from the right. For each set bit, it copies a byte from the input to the output, and for each clear bit it copies the equivalent byte from the previous output word.
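
Splitting the extra byte X used by the V = 0 and V = 15 cases is simple bit arithmetic. The struct and function names below are hypothetical.

```cpp
#include <cstdint>

struct MoveParams {
    int count;     // number of entries: bottom 6 bits plus 18
    int selector;  // top two bits, shifted right by 6
};

// For V == 0 the selector indexes the offset table; for V == 15 a non-zero
// selector means "skip only" and zero means "copy from the previous line".
MoveParams decode_move_byte(uint8_t x)
{
    return MoveParams{ (x & 0x3F) + 18, x >> 6 };
}
```

The count therefore always lies between 18 and 81.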

Converting the output data from planar

Each "entry" of 4 bytes covers 8 pixels, where each of the bytes contains a single bit for every pixel. As an example, take the input bytes 00 01 03 07; the last row below shows the resulting 8 pixel values:

  • 0000 0111 (byte 4)
  • 0000 0011 (byte 3)
  • 0000 0001 (byte 2)
  • 0000 0000 (byte 1)
  • ----------
  • 0000 08CE
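
The conversion in the example above can be sketched as follows. The function name is hypothetical, and the code assumes that the leftmost pixel is held in the high bit of each byte, which matches the way the example is written out.

```cpp
#include <array>
#include <cstdint>

// Converts one 4-byte planar entry into 8 pixel values (0..15).
// Byte 1 supplies bit 0 of every pixel, byte 4 supplies bit 3.
// Assumption: the leftmost pixel is stored in bit 7 of each byte.
std::array<uint8_t, 8> planar_to_pixels(const std::array<uint8_t, 4>& entry)
{
    std::array<uint8_t, 8> pixels{};
    for (int x = 0; x < 8; ++x) {
        int bit = 7 - x;
        uint8_t value = 0;
        for (int plane = 0; plane < 4; ++plane)
            value |= static_cast<uint8_t>(((entry[plane] >> bit) & 1) << plane);
        pixels[x] = value;
    }
    return pixels;
}
```

For the input 00 01 03 07 this yields the pixel values 0 0 0 0 0 8 C E, as in the example.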

Decoding Example

EMI.VOL

EMI.VOL contains the music for the game. It has 41 chunks of data. The first chunk is exactly 4096 bytes and contains the instruments.

Instrument Bank

There are 128 instruments in the chunk and each instrument is 32 bytes.

  • Bytes 0 - 7 of each instrument look like FM parameters. ! Figure out which parameter does what
  • Bytes 8 - 9 contain low numbers from 0 - 3. ! Figure out if these are modulator and carrier wave parameters
  • Bytes 10 - 11 contain numbers between 0 and 16. ! Figure out what these parameters mean
  • Bytes 12 - 15 are all zeroes.
  • Byte 16 increments from 0 - 31, after that it decrements from 255 to 160. Changing this seems to change the instrument type. It looks like the first 32 instruments are percussion.
  • Byte 17 contains a number from 0 - 255. ! Figure out what this byte does.
  • Bytes 18 - 23 contain zeroes.
  • Bytes 24 - 31 contain a lot of data, but changing them to all zeroes seemingly does not change the music in-game.

The 40 remaining chunks all have the EM header.

EM Entry

  • 00 - 03 : signature 45 4D 00 06
  • 04 - 05 : Unknown, most of the time 00 00, but values such as 00 03 are also present. ! Version?
  • 06 - 07 : UINT16LE, Start index of title, most of the time 1C 00
  • 08 - 09 : UINT16LE, Offset in the EM chunk where the actual data starts
  • 0A - 0B : UINT16LE, Always the same as the previous UINT16LE. (Pointing to data offset)
  • 0C - 0D : UINT16LE, The size of the EM chunk
  • 0E - 0F : Unknown, could be initial tempo or volume or something similar. ! Figure out what this is

After this 6 UINT16LE follow, that indicate the offset in the chunk where data is found for each channel.

6 channels are supported and if there is no data for the channel, the value is 0.

  • 10 - 11 : UINT16LE, Index of track 1 in chunk, usually the same as the UINT16LE from 08-09 and 0A-0B.
  • 12 - 13 : UINT16LE, Index of track 2 in chunk
  • 14 - 15 : UINT16LE, Index of track 3 in chunk
  • 16 - 17 : UINT16LE, Index of track 4 in chunk
  • 18 - 19 : UINT16LE, Index of track 5 in chunk
  • 1A - 1B : UINT16LE, Index of track 6 in chunk
  • 1C : Start of title, if there is no title, the value at this location is FF, otherwise it's FE.

If there is a title, a zero-terminated string follows. It contains Japanese characters and, at the end, the date when it was converted. Finally, it ends with FF.
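
Reading the documented EM header fields could look like the sketch below. The struct and function names are hypothetical, the unknown fields are skipped, and the title marker is looked up at the position given by bytes 06 - 07.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct EmHeader {
    bool valid = false;
    uint16_t title_start = 0;   // 06 - 07
    uint16_t data_start = 0;    // 08 - 09
    uint16_t chunk_size = 0;    // 0C - 0D
    std::array<uint16_t, 6> track_offsets{};  // 10 - 1B, 0 = no data
    bool has_title = false;     // marker byte is FE
};

static uint16_t em_u16(const std::vector<uint8_t>& d, size_t p)
{
    return static_cast<uint16_t>(d[p] | (d[p + 1] << 8));
}

EmHeader parse_em_header(const std::vector<uint8_t>& em)
{
    EmHeader h;
    if (em.size() < 0x1D) return h;
    if (em[0] != 0x45 || em[1] != 0x4D || em[2] != 0x00 || em[3] != 0x06)
        return h;  // signature mismatch
    h.title_start = em_u16(em, 0x06);
    h.data_start = em_u16(em, 0x08);
    h.chunk_size = em_u16(em, 0x0C);
    for (size_t i = 0; i < 6; ++i)
        h.track_offsets[i] = em_u16(em, 0x10 + 2 * i);
    if (h.title_start < em.size())
        h.has_title = em[h.title_start] == 0xFE;
    h.valid = true;
    return h;
}
```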

Pattern data

! Highly experimental reverse-engineering, https://github.com/BlackStar-EoP/cobra-mission-writer/tree/master/emidump is an attempt to convert these to S3M

So far, it seems the pattern data has

  • 2 byte control commands
  • 1 byte control commands
  • 1 byte speed commands
  • note commands

! Check if there are more types, such as 3 byte commands

2 byte control commands
  • 0xE4: Unknown
  • 0xE7: Unknown
  • 0xE8: Unknown
  • 0xFB: Unknown
  • 0xFD: Select instrument, byte after this indicates the index of the instrument from the instrument bank.
  • 0xF7: Unknown
  • 0xFA: Unknown
  • 0xFC: Unknown
1 byte control commands
  • 0xCD: Unknown
  • 0xCF: Unknown
  • 0xDC: Unknown
  • 0xDD: Unknown
  • 0xCE: Unknown
  • 0xF8: ! Loop command?
1 byte speed commands

These commands look like a delay command of some sort. ! It seems these commands can do more than delay a single note: they appear to delay for a couple of notes and then return to another speed.

  • 0x81: row delay = 2
  • 0x82: row delay = 4
  • 0x83: row delay = 8
  • 0x84: row delay = 16
  • 0x85: row delay = 32
  • 0x86: row delay = 64
  • 0x89: row delay = 12
  • 0x90: row delay = 6
  • 0x8B: row delay = 48
  • 0x8E: row delay = 3
1 byte note commands

The note commands range from 0x00 to 0x7F, where 0x00 is a C, 0x01 is a C-# and so on. It seems that 0x00 is used as a dummy note, it does not get played, but increases the delay.
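
The speed table and the note numbering can be captured in two lookup helpers. This is a sketch with hypothetical function names; the octave number (note value divided by 12) is an assumption based on the chromatic numbering, and the speed values are taken as listed above.

```cpp
#include <cstdint>
#include <string>

// Returns the row delay for a speed command, or 0 if the byte is not one
// of the speed commands listed above.
int speed_row_delay(uint8_t cmd)
{
    switch (cmd) {
    case 0x81: return 2;
    case 0x82: return 4;
    case 0x83: return 8;
    case 0x84: return 16;
    case 0x85: return 32;
    case 0x86: return 64;
    case 0x89: return 12;
    case 0x90: return 6;
    case 0x8B: return 48;
    case 0x8E: return 3;
    default:   return 0;
    }
}

// Names a note command (0x00 - 0x7F): 0x00 is C, 0x01 is C#, and so on.
// Assumption: every 12 values form one octave, numbered from 0.
std::string note_name(uint8_t note)
{
    static const char* const names[12] = {
        "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
    };
    return std::string(names[note % 12]) + std::to_string(note / 12);
}
```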

MCG.VOL

MCG.VOL contains the tilesets for the game. It contains 14 chunks of tile data.

Chunk nr Tileset
0 Character sprites
1 Weapon sprites
2 CASTLE tileset
3 AREA1 tileset
4 AREA2 tileset
5 AREA3 tileset
6 AREA4 tileset
7 AREA5 tileset
8 HOUSE tileset
9 BAR tileset
10 CORP tileset
11 BRICK tileset
12 LAB tileset
13 SEASIDE tileset

The header for this file is 64 bytes and contains the offsets for each tileset. Each tileset consists of a number of tiles. Every tile is 32x32 pixels with one byte per pixel, so each tile takes 1024 bytes.

The palette for the tilesets is found inside each entry inside MED.VOL. The colors are stored as an array[16] of UINT16LE. For example, white is stored as the bytes FF 0F, which are read as the little-endian value 0x0FFF; the nibbles of this value are 0GRB.

byte r = (value >> 4) & 0x0F;
byte g = value >> 8;
byte b = value & 0x0F;

This will result in values from 0 to 15. To get the actual color component, shift the value left by 4 and add the value shifted right by 2.

Reading all palette entries would look something like this:

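#include <cstdint>

typedef uint8_t byte;
struct Color { byte r, g, b; };

// Converts one 0GRB palette word (4 bits per component) to 8-bit components.
Color parse_color(uint16_t col)
{
	byte r = (col >> 4) & 0x0F;
	byte g = (col >> 8) & 0x0F;
	byte b = col & 0x0F;
	auto expand = [](byte c) { return static_cast<byte>((c << 4) + (c >> 2)); };
	return Color{ expand(r), expand(g), expand(b) };
}

// Reads the 16 palette entries stored at bytes 32 - 63 of an MD entry.
void parse_palette(const byte* md_entry, Color palette[16])
{
	int pal = 0;
	for (int i = 32; i < 64; i += 2)
	{
		// Each color is a little-endian 16-bit value.
		uint16_t color = static_cast<uint16_t>((md_entry[i + 1] << 8) | md_entry[i]);
		palette[pal++] = parse_color(color);
	}
}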

MED.VOL

MED.VOL contains all the levels for the game. It contains 54 chunks of map data.

MD Entry

Each MD entry inside MED.VOL consists of 5 blocks: the header (96 bytes), a block of unknown data (variable size), the tile data for all the floors (dependent on the map size and number of floors), the triggers for all floors (the same size as the tile data, but with most entries 0) and a footer of 256 bytes.

MD Header
  • 0 - 6 : signature 4D 44 03 30 02 02 14
  • 7 - 9 : Unknown, most of the time 77 60 00, but values 9F 60 00, FF 60 00 and 8B 60 00 are also found in a few maps
  • 0A - 0B : UINT16LE, the offset inside this MD chunk where the actual map data starts
  • 0C - 0F : UINT32LE, the size of this MD chunk
  • 10 - 11 : UINT16LE, the width (in tiles) of this map
  • 12 - 13 : UINT16LE, the height (in tiles) of this map
  • 14 - 15 : UINT16LE, the number of floors in this map
  • 16 - 1F : Character array, refers to the used tileset (values can be CASTLE, AREA1, AREA2 etc. See MCG.VOL), rest zeroes
  • 20 - 3F : Palette data, 16 colors, each color stored as UINT16LE, colors are stored as 4 bit nibbles.
  • 40 - 5F : Character array, contains 2 names for the map or, for HOUSE types, the area that is referred to: a long and a short version separated by a pipe character.
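
A sketch of reading the documented MD header fields follows. The struct and function names are hypothetical, the unknown bytes 7 - 9 and the palette are skipped, and both character arrays are assumed to be zero-padded.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct MdHeader {
    bool valid = false;
    uint16_t data_offset = 0;  // 0A - 0B
    uint32_t chunk_size = 0;   // 0C - 0F
    uint16_t width = 0, height = 0, floors = 0;
    std::string tileset;       // 16 - 1F
    std::string names;         // 40 - 5F, "long|short"
};

static uint16_t md_u16(const std::vector<uint8_t>& d, size_t p)
{
    return static_cast<uint16_t>(d[p] | (d[p + 1] << 8));
}

// Reads characters from [begin, end), stopping at the first zero byte.
static std::string md_text(const std::vector<uint8_t>& d, size_t begin, size_t end)
{
    std::string s;
    for (size_t i = begin; i < end && d[i] != 0; ++i)
        s.push_back(static_cast<char>(d[i]));
    return s;
}

MdHeader parse_md_header(const std::vector<uint8_t>& md)
{
    static const uint8_t signature[7] = {0x4D, 0x44, 0x03, 0x30, 0x02, 0x02, 0x14};
    MdHeader h;
    if (md.size() < 0x60) return h;
    for (size_t i = 0; i < 7; ++i)
        if (md[i] != signature[i]) return h;
    h.data_offset = md_u16(md, 0x0A);
    h.chunk_size = md_u16(md, 0x0C) |
                   (static_cast<uint32_t>(md_u16(md, 0x0E)) << 16);
    h.width = md_u16(md, 0x10);
    h.height = md_u16(md, 0x12);
    h.floors = md_u16(md, 0x14);
    h.tileset = md_text(md, 0x16, 0x20);
    h.names = md_text(md, 0x40, 0x60);
    h.valid = true;
    return h;
}
```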
Unknown data block

Unknown data block, usually ranging from 0060 to 0420. The data inside looks the same for all the MD chunks. It looks like a lot of 16 bit integers.

Tile data block

The size of the block is width * height * number of floors. Each byte corresponds to the tile number in the used tileset. The floors are stored sequentially.
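
Locating a single tile inside the block might then be done as below. The function name is hypothetical, and the row-by-row order within a floor is an assumption; only the sequential storage of floors is stated above.

```cpp
#include <cstddef>

// Index of tile (x, y) on the given floor, relative to the start of the
// tile data block. Assumption: each floor is stored row by row.
size_t tile_index(size_t width, size_t height, size_t x, size_t y, size_t floor)
{
    return floor * width * height + y * width + x;
}
```

Since the triggers block uses the same layout, the same index would apply there.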

Triggers block

This block has the same layout as the tile data block, but most bytes are 00; the bytes that are not are triggers such as item pickups, stairs and doors. It is unknown what the numbers mean exactly, but exits are usually 0x20, item pickups are in the 0x40 range and character sprites are usually 0xC0 and higher.

Footer

At the end of each MD chunk there are 256 bytes. It is unknown what the data represents, but it looks like it is related to the trigger block.

PCM1.VOL - PCM6.VOL

These files contain PCM data in the form of the Creative Voice Format.

Credits

This file format was reverse-engineered by BlackStar. If you find this information helpful in a project you're working on, please give credit where credit is due. (A link back to this wiki would be nice too!)