AudioT Format

From ModdingWiki
Format type: Archive
Max files: Unlimited
File Allocation Table (FAT): External
Filenames?: No
Metadata?: None
Supports compression?: Yes
Supports encryption?: No
Supports subdirectories?: No
Hidden data?: No

The AudioT Format is an archive file generated by a tool called Muse and used by many id/Apogee games to store Adlib music, Adlib sound effects and PC Speaker sound effects in a single file. The format was obviously designed to store digitized sounds as well, but apparently this was not implemented in Muse, so no game ever used it. The basic structure of this format is very similar to the EGAGraph Format and the GameMaps Format. The main audio file can be compressed (AUDIO.xxx) or uncompressed (AUDIOT.xxx). It requires an associated AUDIOHED.xxx which stores the offsets of each file, and an AUDIODCT.xxx if the file is compressed.

File formats

Audio Head (AUDIOHED.xxx)

This header file stores the offsets (relative to the start of the AUDIO/AUDIOT file) for each sub file. The format is trivial, in that the header file is simply an array of UINT32LE offsets.

The last offset in the file is an offset to the end of the last 'used' chunk (in most cases the size of the main AUDIO/AUDIOT file). This was probably done so that the size of each chunk could be calculated with the formula below. Thus, the number of offsets (the size of the AUDIOHED in dwords) minus one gives the number of chunks (slots) included in the main file. This number varies, but is usually about 200. Games refer to the chunks by index, not by name.

The size of a chunk can be calculated like this:

size[i] = offset[i+1] - offset[i]

Empty chunks (size[i] = 0) should be skipped when extracting chunks.
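As a minimal sketch of the calculation above, the following C functions work on an in-memory copy of AUDIOHED.xxx. The function names (read_u32le, audiohed_chunk_count, audiohed_chunk_size) are made up for this example and do not come from any id source. The size is returned as a signed value because, as described below, tags can make it negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t read_u32le(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Number of chunks (slots) described by a header of hed_size bytes:
   the number of offsets minus one. */
size_t audiohed_chunk_count(size_t hed_size)
{
	size_t offsets = hed_size / 4;
	return offsets ? offsets - 1 : 0;
}

/* size[i] = offset[i+1] - offset[i] */
int64_t audiohed_chunk_size(const unsigned char *hed, size_t hed_size, size_t i)
{
	assert(hed != NULL);
	assert(i < audiohed_chunk_count(hed_size));
	return (int64_t)read_u32le(hed + 4 * (i + 1)) -
	       (int64_t)read_u32le(hed + 4 * i);
}
```

An extractor would loop over all indices below audiohed_chunk_count() and skip every chunk whose size is 0.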

Because of the !ID! tags that can be found in some AudioT files, some chunks may have a size of 4 or 8 bytes, or even a negative size. This is because the final offset in the AUDIOHED.xxx file points to the end of the last 'used' chunk. So if there are any tags at the end of the file, the final offset will point to the beginning of these tags, while the penultimate offset may point to the end of these tags. In any of these cases, the actual data in the AudioT file should be checked for the presence of a tag at the given offset.
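The tag check suggested above could look roughly like this. It is only a sketch: audiot_has_id_tag is a hypothetical helper name, and the AudioT file is assumed to be loaded into memory already.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the four bytes at `offset` in the AudioT data are "!ID!",
   0 otherwise (including when the offset is too close to the end). */
int audiot_has_id_tag(const unsigned char *audiot, size_t audiot_size,
                      size_t offset)
{
	assert(audiot != NULL);
	if (offset > audiot_size || audiot_size - offset < 4)
		return 0;
	return memcmp(audiot + offset, "!ID!", 4) == 0;
}
```

A tool that finds a chunk of 4 or 8 bytes, or a negative size, can call this with the chunk's start offset before deciding how to treat the slot.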

In most (not all) games, the AUDIOHED is included in the executable file. See Extracting data files from executables for information on how to locate and extract them.

Dictionary (AUDIODCT.xxx)

If the main audio file is named AUDIO.xxx, then the file is compressed. The compression is the same Huffman Compression scheme used elsewhere in the Commander Keen series. When reading compressed files out of the AUDIO file, the first four bytes located at the AUDIOHED offset are a UINT32LE specifying the file's decompressed size (which is required for the decompression algorithm). The compressed data then follows.

Note that the AUDIODCT file is not often obvious, usually being embedded in the main exe file. Executables usually contain two or more dictionaries, but the AUDIODCT is usually the first one. (A simple check of whether it decompresses the data sensibly works.)

The 'ID Engine' (a.k.a. the Wolfenstein 3-D source code) uses a single #define to indicate a compressed AudioT file as well as linked header and dictionary (see ID_CA.C in the Wolfenstein 3-D source and search for AUDIOHEADERLINKED). This means that any game using the compressed AUDIO.xxx file must have AUDIOHED.xxx and AUDIODCT.xxx linked in the executable. This also applies to Version 1.5 of Keen 6, which comes with an external AUDIOHED.CK6 that a) contains the wrong offsets and b) is not used by the game. On the other hand, every game using the uncompressed AUDIOT.xxx file must provide an external AUDIOHED.xxx.

While it would have been trivial to modify the ID Engine to allow any combination of compression and linked/external files, it seems that no game ever did. This means that no game ever had an external AUDIODCT.xxx file to begin with. Both IGRAB and TED5 were able to create .OBJ files to be linked into the executable, and the #define mentioned above is called AUDIOHEADERLINKED, so it is safe to assume that Muse (or whichever tool was used to compress the AudioT file) wrote only .OBJ files for the header and dictionary of the compressed AUDIO.xxx file. This could also explain why the AUDIOHED.CK6 from Keen 6 v1.5 contains the wrong offsets: the offsets in any AUDIOHED.xxx file only apply to the uncompressed AUDIOT.xxx file.

Main file (AUDIOT.xxx, AUDIO.xxx)

This file is simply an array of data files. Each file starts at the offset specified in the AUDIOHED file, and as there are no filenames each file is referred to by its index/slot number. Slots containing dummy values in the header are treated as if they don't exist in the AUDIO/AUDIOT file.

If compression is in use then the AUDIO file itself isn't compressed, rather each file within the AUDIO file is individually compressed. A file can be read by opening the AUDIO file, seeking to the offset specified in the AUDIOHED file, reading a UINT32LE for the decompressed file size, and then decompressing the data from that point onwards using standard Huffman decompression techniques.
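The reading procedure can be sketched as follows. The dictionary layout is not described on this page, so the sketch assumes the layout handled by CAL_HuffExpand in ID_CA.C of the Wolfenstein 3-D source: 255 nodes of two 16-bit values each, node 254 as the root, values below 256 as literal bytes, values of 256 and above as (node index + 256), and bits consumed starting with the lowest bit of each byte. The names HuffNode and audio_expand_chunk are made up for this example, and the dictionary is expected to be converted to host byte order already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HUFF_NODES 255	/* assumed dictionary size, see text */
#define HUFF_ROOT  254	/* assumed root node, see text */

typedef struct { uint16_t bit0, bit1; } HuffNode;

uint32_t read_u32le(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Expands one chunk of the compressed AUDIO file. `chunk` points at the
   AUDIOHED offset, so it starts with the UINT32LE decompressed size.
   Returns the number of bytes written to dst, or 0 on malformed input. */
size_t audio_expand_chunk(const unsigned char *chunk, size_t chunk_size,
                          const HuffNode *dict,
                          unsigned char *dst, size_t dst_cap)
{
	assert(chunk != NULL && dict != NULL && dst != NULL);
	if (chunk_size < 4)
		return 0;
	uint32_t expanded = read_u32le(chunk);
	if (expanded > dst_cap)
		return 0;

	const unsigned char *src = chunk + 4;
	const unsigned char *end = chunk + chunk_size;
	size_t written = 0;
	unsigned node = HUFF_ROOT;

	while (written < expanded && src < end) {
		unsigned char byte = *src++;
		for (int bit = 0; bit < 8 && written < expanded; bit++) {
			uint16_t v = ((byte >> bit) & 1) ? dict[node].bit1
			                                 : dict[node].bit0;
			if (v < 256) {		/* literal byte, restart at root */
				dst[written++] = (unsigned char)v;
				node = HUFF_ROOT;
			} else {		/* descend to another node */
				node = v - 256u;
				if (node >= HUFF_NODES)
					return 0;
			}
		}
	}
	return written == expanded ? written : 0;
}
```

Note that a return value of 0 is also what a chunk with a decompressed size of 0 produces, so callers that need to tell the two apart should check the size field first.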

If compression is not in use (AUDIOT files) then a file can be read by simply seeking to the offset specified in the AUDIOHED file and reading until the offset of the following file is reached. Since the last offset in the AUDIOHED file is normally the size of the AUDIOT file itself, no extra handling should be necessary to correctly read all of the last file.
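For the uncompressed case, a bounds-checked lookup might be sketched like this, with both files held in memory. audiot_get_chunk is a hypothetical name; empty slots and offsets that run backwards or past the end of the file are reported as "no chunk" rather than being read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

uint32_t read_u32le(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns a pointer to chunk i inside the AUDIOT data and stores its size
   in *size_out, or returns NULL (size 0) if the slot is empty or invalid. */
const unsigned char *audiot_get_chunk(const unsigned char *hed, size_t hed_size,
                                      const unsigned char *audiot,
                                      size_t audiot_size,
                                      size_t i, size_t *size_out)
{
	assert(hed != NULL && audiot != NULL && size_out != NULL);
	*size_out = 0;
	if (hed_size / 4 < 2 || i >= hed_size / 4 - 1)
		return NULL;		/* no such slot */

	uint32_t start = read_u32le(hed + 4 * i);
	uint32_t end = read_u32le(hed + 4 * (i + 1));
	if (end <= start || end > audiot_size)
		return NULL;		/* empty, negative or out of range */

	*size_out = end - start;
	return audiot + start;
}
```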


Chunks

The AUDIOT format is a container format, meaning that any files the game can handle can be stored in it. Traditionally there have been four types of 'sub files' or chunks. These are (in order): PC Speaker sound effects, AdLib sound effects, digitized sound effects and Adlib music chunks (usually in IMF Format).

At least one case is known where the music is in MIDI format (Noah's Ark 3D), and it is possible that Sound Blaster 'digital' sound effects have been stored like this as well at some point. There has often been confusion about which parts of an extracted file are actual data and which belong to the AUDIOT container. For example, the IMF file format was widely believed to exist in two types until it was discovered that 'type 1' IMFs were simply 'type 0' IMFs that had been incorrectly extracted from the AUDIOT file along with their chunk length and tags.

It is notable that a sound file will contain exactly as many PC sounds as adlib and digitized sounds (and an arbitrary number of IMF files.) This is because most games have a PC/Adlib option and thus must match every PC sound with an adlib equivalent. The number of sounds is hard-coded. Muse generates a C header file to be used by the game. Here is an example from the Wolf3D source:

/////////////////////////////////////////////////
//
// MUSE Header for .WL6
// Created Tue Jul 14 15:04:53 1992
//
/////////////////////////////////////////////////

#define NUMSOUNDS		87
#define NUMSNDCHUNKS		288

//
// Sound names & indexes
//
typedef enum {
	HITWALLSND,              // 0
	SELECTWPNSND,            // 1
	/*...*/
	LASTSOUND
     } soundnames;

//
// Base offsets
//
#define STARTPCSOUNDS		0
#define STARTADLIBSOUNDS	87
#define STARTDIGISOUNDS	174
#define STARTMUSIC		261

//
// Music names & indexes
//
typedef enum {
	CORNER_MUS,              // 0
	DUNGEON_MUS,             // 1
	/*...*/
	LASTMUSIC
     } musicnames;

/////////////////////////////////////////////////
//
// Thanks for playing with MUSE!
//
/////////////////////////////////////////////////
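To illustrate how the base offsets from such a header turn a sound or music number into a chunk index, here is a small sketch using the Wolf3D values shown above. The SoundDevice enum and the two function names are invented for this example; the game itself simply adds the base offset where it needs it.

```c
#include <assert.h>

/* Values copied from the Wolf3D MUSE header above */
#define NUMSOUNDS		87
#define STARTPCSOUNDS		0
#define STARTADLIBSOUNDS	87
#define STARTDIGISOUNDS		174
#define STARTMUSIC		261

/* Hypothetical selector for the three sound sections */
typedef enum { SOUND_PC, SOUND_ADLIB, SOUND_DIGI } SoundDevice;

/* Chunk index of sound number `sound` (a value from soundnames) */
int sound_chunk_index(SoundDevice dev, int sound)
{
	assert(sound >= 0 && sound < NUMSOUNDS);
	switch (dev) {
	case SOUND_PC:    return STARTPCSOUNDS + sound;
	case SOUND_ADLIB: return STARTADLIBSOUNDS + sound;
	default:          return STARTDIGISOUNDS + sound;
	}
}

/* Chunk index of music track `song` (a value from musicnames) */
int music_chunk_index(int song)
{
	assert(song >= 0);
	return STARTMUSIC + song;
}
```

With these values, SELECTWPNSND (1) is chunk 1 as a PC sound and chunk 88 as an Adlib sound, and DUNGEON_MUS (1) is chunk 262.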

In addition, Muse writes a special four byte tag (the string "!ID!") at the end of each section of chunk types. The "!ID!" tags themselves don't seem to be part of any chunk and are probably put in for debugging purposes. The same tag can be found in the GAMEMAPS.xxx and xGAGRAPH.xxx files, but the games do not seem to check these tags. In some cases (e.g. Commander Keen Dreams) the tags can be missing.

Sound chunks are always in the same order: PC, then adlib, then digitized, then IMF. A simple method to classify the sound chunks is to check the AUDIOHED for chunks of size 0. These are unused digitized sound chunks, and they separate the adlib sound chunks from the music chunks. Every nonzero length chunk after this is an IMF chunk (not counting the last digitized chunk, which may contain an "!ID!" tag). IMFs are always the last chunks in the AUDIOHED and contain no zero length chunks. For the PC/adlib chunks, the number of chunks before the first zero length chunk can be divided into two, the first half being PC, the second half being adlib.
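A rough sketch of this heuristic is given below. It assumes that no PC or adlib slot is empty, and it relies on the digitized section having exactly as many slots as the PC section, which sidesteps the tag that may sit in the last digitized slot. AudioTLayout and audiot_guess_layout are hypothetical names, and the result should be treated as a guess, not as a substitute for the Muse header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	size_t num_sounds;	/* sounds per section (PC, adlib, digitized) */
	size_t start_adlib;	/* index of the first adlib sound chunk */
	size_t start_digi;	/* index of the first digitized sound chunk */
	size_t start_music;	/* index of the first music chunk */
} AudioTLayout;

/* sizes[i] is offset[i+1] - offset[i] from the AUDIOHED.
   Returns 1 and fills *out on success, 0 if the heuristic does not apply. */
int audiot_guess_layout(const int64_t *sizes, size_t count, AudioTLayout *out)
{
	assert(sizes != NULL && out != NULL);

	/* The first zero-length chunk is taken as the first digitized slot */
	size_t first_zero = 0;
	while (first_zero < count && sizes[first_zero] != 0)
		first_zero++;
	if (first_zero == count || first_zero % 2 != 0)
		return 0;	/* no empty slot, or PC/adlib halves don't match */

	out->num_sounds = first_zero / 2;
	out->start_adlib = out->num_sounds;
	out->start_digi = first_zero;
	out->start_music = first_zero + out->num_sounds;
	return out->start_music <= count;
}
```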

If the file is uncompressed then the adlib and IMF chunks may have additional 'tag' or footer information which was probably used by Muse in one way or another. (The compressed files were meant to be used only by the game, so there was no need to store the additional Muse data.) PC chunks are never tagged, except for the trailing zero byte. For adlib sounds the footer data are null terminated strings of 9 characters or less, including terminator. For IMF files these are 88 byte strings detailed on the IMF Format page. These names are not read by games.

Note:

The PC Sound and AdLib chunks can also be found as individual chunks or 'lumps' in Rise of the Triad's WAD file (between the labels 'pcstart'/'adstart' and 'pcstop'/'adstop').

PC Sounds

Data type     Name        Description
UINT32LE      length      Length of sound data (usually chunk length - 7)
UINT16LE      priority    Sound priority (only 1 sound may play at a time; any sound will interrupt a sound of equal or lower priority)
BYTE[length]  data        Actual sound data
UINT8         terminator  (Muse-only ?)

PC sounds use a modified form of the Inverse Frequency Sound format used in earlier ID games. Briefly, the sound data is stored as bytes, each specifying the inverse of a frequency. (Higher value, lower frequency.)

While most PC chunks have no sound name and thus can be identified solely by their chunk number and/or their first four bytes and their length, it appears they can in fact be named and some games occasionally do name them. These names can be found in the 'soundnames' enumeration type in the C header file generated by Muse, and probably in other Muse-related files that were never released to the public.

Multiplying each byte value by 60 gives 16-bit values that can be interpreted exactly like the values of the old Inverse Frequency Sound format. The value 0 still turns the speaker off, but there is no way to signal the end of a sound with the value 0xFFFF. The length value is used to determine the end of the sound.
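The following sketch parses the PC sound header described above and converts a data byte to the frequency it produces. It assumes the usual PC timer input clock of 1193181 Hz; the PCSound struct and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	uint32_t length;		/* number of sound data bytes */
	uint16_t priority;
	const unsigned char *data;	/* points into the chunk */
} PCSound;

/* Returns 1 on success, 0 if the chunk is too small for its header
   or for the length it claims. */
int pcsound_parse(const unsigned char *chunk, size_t chunk_size, PCSound *out)
{
	assert(chunk != NULL && out != NULL);
	if (chunk_size < 6)
		return 0;
	out->length = (uint32_t)chunk[0] | ((uint32_t)chunk[1] << 8) |
	              ((uint32_t)chunk[2] << 16) | ((uint32_t)chunk[3] << 24);
	out->priority = (uint16_t)(chunk[4] | (chunk[5] << 8));
	if (out->length > chunk_size - 6)
		return 0;
	out->data = chunk + 6;
	return 1;
}

/* Frequency in Hz for one data byte; 0 means the speaker is off. */
double pcsound_frequency(unsigned char value)
{
	return value ? 1193181.0 / (value * 60.0) : 0.0;
}
```

For example, a data byte of 10 gives a timer value of 600 and therefore a tone of roughly 1989 Hz.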

Code: Playback

The sound data is processed at a rate of about 140 Hz, which means that 140 byte values are played per second. The playback routines that are actually used by Wolfenstein 3-D are in pure assembly, but the following C code (also taken from Wolf3D) demonstrates the playback well enough:

//	PC Sound variables
		volatile byte	pcLastSample,far *pcSound;
		longword	pcLengthLeft;
		word		pcSoundLookup[255];

/*...*/

static void
SDL_PCService(void)
{
	byte	s;
	word	t;

	if (pcSound)
	{
		s = *pcSound++;
		if (s != pcLastSample)
		{
		asm	pushf		//
		asm	cli		// disable interrupts

			pcLastSample = s;
			if (s)					// We have a frequency!
			{
				t = pcSoundLookup[s];
			asm	mov	bx,[t]

			asm	mov	al,0xb6			// Write to channel 2 (speaker) timer
			asm	out	43h,al
			asm	mov	al,bl
			asm	out	42h,al			// Low byte
			asm	mov	al,bh
			asm	out	42h,al			// High byte

			asm	in	al,0x61			// Turn the speaker & gate on
			asm	or	al,3
			asm	out	0x61,al
			}
			else					// Time for some silence
			{
			asm	in	al,0x61		  	// Turn the speaker & gate off
			asm	and	al,0xfc			// ~3
			asm	out	0x61,al
			}

		asm	popf		//enable interrupts
		}

		if (!(--pcLengthLeft))
		{
			SDL_PCStopSound();
			SDL_SoundFinished();
		}
	}
}

SDL_PCService() would be called 140 times per second. The pcSoundLookup table simply multiplies the byte value by 60. When a sound is started, the pcSound pointer points to the data of the sound chunk, pcLengthLeft is set to the length value of the sound chunk and pcLastSample is set to 0 (silence). SDL_PCStopSound() simply turns the speaker and gate off and makes pcSound a NULL pointer.
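For tools that only need the logic and not the hardware access, the service routine can be restated without inline assembly. The sketch below is not taken from the Wolf3D source: PCPlayer, pc_start and pc_service are hypothetical names, and the lookup table is replaced by a direct multiplication by 60. Instead of programming the timer, pc_service() returns the value that would be written to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	const unsigned char *sound;	/* next data byte, NULL when idle */
	uint32_t length_left;		/* data bytes still to be played */
	unsigned char last_sample;	/* previous data byte (0 = silence) */
} PCPlayer;

/* Start playing `length` data bytes of a PC sound chunk. */
void pc_start(PCPlayer *p, const unsigned char *data, uint32_t length)
{
	assert(p != NULL);
	p->sound = length ? data : NULL;
	p->length_left = length;
	p->last_sample = 0;
}

/* Call 140 times per second. Returns the new timer value (0 = turn the
   speaker off), or -1 if the speaker state does not need to change.
   When p->sound becomes NULL the sound has ended and the caller should
   turn the speaker off, like SDL_PCStopSound() does. */
int pc_service(PCPlayer *p)
{
	int result = -1;

	assert(p != NULL);
	if (!p->sound)
		return -1;

	unsigned char s = *p->sound++;
	if (s != p->last_sample) {
		p->last_sample = s;
		result = s * 60;
	}
	if (--p->length_left == 0)
		p->sound = NULL;
	return result;
}
```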

Note:

The Apogee Sound System, written by Jim Dosé, provides code to play both the old 16-bit PC Speaker sound effects and the 8-bit sound effects used by the AudioT format. This might be interesting to look at since the code doesn't use that much assembly which makes it a lot easier to understand. The source code for the ASS was released along with the source code of Rise of the Triad and Duke Nukem 3D (and maybe Shadow Warrior). The code for the PC Speaker routines is in PCFX.C/PCFX.H in the audiolib folder/archive.

Code: Converting to Wave format

The PC sound chunks can be converted into Wave files easily enough, but currently the only program known to do so is Keenwave: http://levellord.toxicsheep.com/KWbeta.zip

Here is a very simple example that converts a PC sound to PCM wave data:

#include <stdint.h>

typedef uint8_t  BYTE;
typedef uint32_t UINT;	//must be a 32-bit int!

#define PC_BASE_TIMER 1193181	//input clock of the PC timer in Hz
#define PC_VOLUME 20	//square wave amplitude; not larger than 127!
#define PC_RATE 140	//sound bytes per second; 144 is closer to the DOSBox output
void convert(const BYTE *src, UINT src_length, BYTE *dst, UINT hertz)
{
/*
  Notes: dst must be able to hold at least src_length*(hertz/PC_RATE) bytes
         src_length must not be 0
         src and dst must not be null pointers
*/
	signed int
		sign = -1;	//current polarity of the square wave
	UINT tone,		//timer value of the current byte (0 = silence)
		i,
		phase_length,	//output samples per half-wave of the current tone
		phase_tic = 0,	//samples already written for the current half-wave
		samples_per_byte = hertz/PC_RATE;
	
	while (src_length--)
	{
		tone = *src++ * 60;
		phase_length = (hertz*tone)/(2*PC_BASE_TIMER);
		for (i=0; i<samples_per_byte; i++)
		{
			if (tone)
			{
				*dst++ = 128 + sign*PC_VOLUME;
				if (phase_tic++ >= phase_length)
				{
					//half-wave complete, flip the polarity
					sign = -sign;
					phase_tic = 0;
				}
			} else {
				//silence: output the center value
				phase_tic = 0;
				*dst++ = 128;
			}
		}
	}
}

This code converts the data of an AudioT chunk to unsigned 8-bit Mono PCM data of the frequency given in the hertz parameter. It generates simple square waves and the resulting wave sound is close enough to the sound played by the PC Speaker, so it could be used as a preview in an editor. The generated wave sound should use a sample rate of at least 40000 Hz or some precision might be lost as the 256 possible values of tone are mapped to less than 256 possible values of phase_length.
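To turn the generated samples into an actual Wave file, a 44-byte RIFF/WAVE header for unsigned 8-bit mono PCM has to be written in front of them. The helper below is a sketch of that step and is not part of Keenwave or any id tool; all names are made up for this example. pc_wave_bytes() repeats the buffer size rule from the notes in convert() for a PC_RATE of 140.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

void put_u16le(unsigned char *p, uint16_t v)
{
	p[0] = (unsigned char)(v & 0xFF);
	p[1] = (unsigned char)(v >> 8);
}

void put_u32le(unsigned char *p, uint32_t v)
{
	put_u16le(p, (uint16_t)(v & 0xFFFF));
	put_u16le(p + 2, (uint16_t)(v >> 16));
}

/* Fills `out` with a header for data_bytes of unsigned 8-bit mono PCM.
   Returns the header size (always 44). Strictly speaking, RIFF expects a
   pad byte after the data if data_bytes is odd; that is not handled here. */
size_t wav_header_pcm8_mono(unsigned char out[44], uint32_t hertz,
                            uint32_t data_bytes)
{
	assert(out != NULL);
	memcpy(out, "RIFF", 4);
	put_u32le(out + 4, 36 + data_bytes);	/* file size minus 8 */
	memcpy(out + 8, "WAVEfmt ", 8);
	put_u32le(out + 16, 16);		/* size of the fmt chunk */
	put_u16le(out + 20, 1);			/* format tag: PCM */
	put_u16le(out + 22, 1);			/* channels: mono */
	put_u32le(out + 24, hertz);		/* sample rate */
	put_u32le(out + 28, hertz);		/* bytes per second */
	put_u16le(out + 32, 1);			/* block align */
	put_u16le(out + 34, 8);			/* bits per sample */
	memcpy(out + 36, "data", 4);
	put_u32le(out + 40, data_bytes);
	return 44;
}

/* Number of samples written for src_length sound bytes at 140 bytes/s */
uint32_t pc_wave_bytes(uint32_t src_length, uint32_t hertz)
{
	return src_length * (hertz / 140);
}
```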

Adlib Sounds

Adlib sounds are slightly more complex. In addition to the fields found in PC sounds, their header contains information used to set up the Adlib chip when playing the sound. Games always reset the chip when playing a new sound, and so each sound is given 17 bytes of Adlib data. This information is what, in effect, makes the Adlib sounds sound different from the PC ones. By changing these values, various properties of the sound can be altered. Specifically, these refer to the values for each of the two operators of a single Adlib channel. (Adlib sounds use two sound waves, a carrier and a modulator, C1 and C2; the modulator moves through the carrier, modifying it as it does so.) Adlib sound effects can be converted into IMF format, as IMFs are essentially sequences of Adlib sounds on several Adlib channels instead of just one.

An Adlib sound entry is contained in this structure:

Data type     Name        Description
UINT32LE      length      Length of sound data
UINT16LE      priority    Sound priority (only 1 sound may play at a time; any sound will interrupt a sound of equal or lower priority)
BYTE[16]      instrument  Instrument settings
BYTE          block       Octave number
BYTE[length]  data        Actual sound data
UINT8         terminator  (Muse-only ?)
ASCIIZ[]      name        NULL-terminated instrument name (optional, Muse-only)

The format of the instrument, block and data fields is described on the Adlib sound effect page.
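The structure above could be read from a chunk roughly as follows. This is a sketch only: the AdlibSound struct and adlibsound_parse are hypothetical names, the instrument and data bytes are left uninterpreted, and the optional name is only reported if a terminated string really follows the terminator byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADLIB_HEADER_SIZE 23	/* length(4) + priority(2) + instrument(16) + block(1) */

typedef struct {
	uint32_t length;			/* number of sound data bytes */
	uint16_t priority;
	const unsigned char *instrument;	/* 16 bytes, points into the chunk */
	unsigned char block;
	const unsigned char *data;		/* `length` bytes */
	const char *name;			/* Muse name, or NULL if absent */
} AdlibSound;

/* Returns 1 on success, 0 if the chunk is too small. */
int adlibsound_parse(const unsigned char *chunk, size_t chunk_size,
                     AdlibSound *out)
{
	assert(chunk != NULL && out != NULL);
	if (chunk_size < ADLIB_HEADER_SIZE)
		return 0;
	out->length = (uint32_t)chunk[0] | ((uint32_t)chunk[1] << 8) |
	              ((uint32_t)chunk[2] << 16) | ((uint32_t)chunk[3] << 24);
	if (out->length > chunk_size - ADLIB_HEADER_SIZE)
		return 0;
	out->priority = (uint16_t)(chunk[4] | (chunk[5] << 8));
	out->instrument = chunk + 6;
	out->block = chunk[22];
	out->data = chunk + ADLIB_HEADER_SIZE;
	out->name = NULL;

	/* Optional Muse footer: one terminator byte, then an ASCIIZ name */
	size_t pos = ADLIB_HEADER_SIZE + (size_t)out->length + 1;
	if (pos < chunk_size && memchr(chunk + pos, 0, chunk_size - pos))
		out->name = (const char *)(chunk + pos);
	return 1;
}
```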

Digitized Sounds

No game ever used the chunks for digitized sounds and it can be deduced from the glossary of the "Doom Bible" that Muse did not support digitized sounds at all. So there is no need to document the structure of the digitized sounds or even implement support for them in a tool.

Inserting digitized sounds into Duke Nukem II's AudioT file may cause the game to crash or freeze, but that's actually because that game uses a fixed-size buffer to load the AudioT file, so using a longer file causes memory corruption!

Anyway, this is the structure of a digitized sound chunk (taken from the Wolfenstein 3-D source code):

typedef struct
	{
		longword	length;
		word		priority;
	} SoundCommon;

typedef struct
	{
		SoundCommon	common;
		word		hertz;
		byte		bits,
				reference,
				data[1];
	} SampledSound;

Note:

Even though the Wolf3D source declares the data structure, it is never actually used in Wolf3D and there is no code to play these digitized samples. The digitized samples used by Wolf3D are stored as raw PCM data in VSWAP.xxx and are played at a fixed sampling rate of about 7000 Hz.

The Commander Keen Dreams source contains code to play the digitized sounds from an AudioT file, but the routines are never used in that game.

IMF Music

IMF Format files are packed at the end of the audio file. Occasionally 'empty' music files are found (e.g. in Wolfenstein 3-D) that contain no actual data, just footer information placed there by Muse. IMF chunks can be extracted as they are and played with various utilities such as Adplug.

Utilities

KeenWave: This program can extract the contents of AUDIOT files into various formats, including wave files, and to a limited extent edit them. It is currently at the beta stage. http://levellord.toxicsheep.com/KWbeta.zip It also contains:

KeenISF: This program plays the Adlib sounds, if they are extracted as raw data.