Inverse Frequency Sound format

From ModdingWiki

The inverse frequency sound format is the format used by Apogee/id Software in many of their games for PC speaker sounds. It is so named because the bulk of the data is stored as 'inverse frequency' values (the higher the value, the lower the tone it produces). Most commonly the data is stored in a file, though it may also be embedded in the game executable.

Later games, such as Wolfenstein 3-D, use a modified form of this format for their (less important) PC speaker and AdLib sounds.

File format

The file is divided into three sections: a 16-byte header, a list of sound definitions (one 16-byte entry per sound, which includes the sound name), and the actual sound data. The sound data should thus start (count + 1) * 16 bytes from the start of the file.


Data type Name Description
char[4] signature "SND" + terminating null. Indicates the start of a sound file; "SPK" + $00 instead indicates the older 'speaker' format, which has very few differences.
UINT16LE size Size of file
UINT16LE unknown Usually 0x003C, but doesn't appear to do anything.
UINT16LE count Number of sounds. For SPK files this is blank as the number of sounds is always 63.
BYTE[6] pad Nulls to pad structure up to 16 bytes.
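
The header layout above can be sketched with Python's struct module. This is a minimal sketch rather than a reference implementation: the function name parse_header and the dictionary keys are illustrative choices, and the SPK handling simply applies the fixed count of 63 stated in the table.

```python
import struct

# Layout from the header table: char[4], three UINT16LE, 6 pad bytes = 16 bytes.
HEADER = struct.Struct("<4sHHH6s")


def parse_header(data: bytes) -> dict:
    """Unpack the 16-byte header (hypothetical helper, names are illustrative)."""
    signature, size, unknown, count = HEADER.unpack_from(data, 0)[:4]
    if signature not in (b"SND\x00", b"SPK\x00"):
        raise ValueError(f"unexpected signature {signature!r}")
    if signature == b"SPK\x00":
        # SPK files leave the count field blank; the count is always 63.
        count = 63
    return {
        "signature": signature,
        "size": size,
        "unknown": unknown,
        "count": count,
        # 16-byte header plus one 16-byte definition per sound.
        "data_start": (count + 1) * 16,
    }
```

The data_start value is only the expected start of the sound data; the per-sound offsets in the definitions are what actually locate each sound.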

Sound definitions

This structure is repeated once per sound.

Data type Name Description
UINT16LE offset Offset of the sound from the beginning of the file.
UINT8 priority Determines whether this sound can be interrupted by another sound that starts while it is still playing. A sound can only be interrupted by a sound whose priority is equal or higher. 255 is the maximum; 0 is inadvisable.
UINT8 flags 1 if the sound is a 'placeholder' to be ignored; 8 if it is a proper sound.
char[12] name Null-padded sound name
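
Reading the definition list might look like the following sketch. It assumes the definitions start immediately after the 16-byte header, as described under "File format"; parse_sound_defs, the dictionary keys and the ASCII decoding of names are assumptions made for illustration.

```python
import struct

# UINT16LE offset, UINT8 priority, UINT8 flags, char[12] name = 16 bytes.
SOUND_DEF = struct.Struct("<HBB12s")


def parse_sound_defs(data: bytes, count: int) -> list:
    """Read `count` definitions that follow the 16-byte file header."""
    defs = []
    for i in range(count):
        offset, priority, flags, raw_name = SOUND_DEF.unpack_from(data, 16 + i * 16)
        # Names are null-padded; keep only the part before the first null.
        name = raw_name.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        defs.append({
            "offset": offset,            # from the beginning of the file
            "priority": priority,
            "flags": flags,
            "placeholder": flags == 1,   # 1 = placeholder, 8 = proper sound
            "name": name,
        })
    return defs
```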

Sound data

Sound data is divided into words (UINT16LE), with the word value being inversely proportional to the sound frequency. The sound frequency in Hz can be calculated as follows:

frequency = 1193181 / value

These word values are written directly to PIT Channel 2. The PC speaker is updated at a rate of 140 Hz, so each word represents about 1/140th of a second of tone. Most files contain a few seconds of sound. The value $FFFF signals the end of a sound, and most other values fall in the range $0100-$5000. $0000 is silence and causes the PC speaker to be turned off.
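
As a rough illustration of the rules above, the sketch below turns the words of one sound into a list of frequencies, one per 1/140 s tick. The function and constant names are hypothetical, and representing silence as 0.0 Hz is a convenience of this sketch, not part of the format.

```python
import struct

PIT_CLOCK_HZ = 1193181   # numerator of the frequency formula
UPDATE_RATE_HZ = 140     # PC speaker update rate
END_OF_SOUND = 0xFFFF    # terminator word


def decode_sound(data: bytes, offset: int) -> list:
    """Return one frequency in Hz per tick; 0.0 stands for silence."""
    freqs = []
    pos = offset
    while pos + 2 <= len(data):
        (value,) = struct.unpack_from("<H", data, pos)
        if value == END_OF_SOUND:
            break
        # $0000 turns the speaker off, so avoid dividing by zero.
        freqs.append(0.0 if value == 0 else PIT_CLOCK_HZ / value)
        pos += 2
    return freqs


def duration_seconds(freqs: list) -> float:
    """Each word lasts about 1/140th of a second."""
    return len(freqs) / UPDATE_RATE_HZ
```

Passing the offset from a sound definition as the second argument would decode that sound; the loop also stops at the end of the data in case the terminator is missing.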

Later games use a similar format for PC sounds, in which the sound data is stored as bytes with values 0-255. Multiplying a byte value by 60 yields a word value that can be written directly to PIT Channel 2. The loss of fine-tuning can probably be attributed to the lesser role of PC sounds in the AudioT Format, which reads sounds in a totally different way and also contains AdLib sound effects and music.
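
The byte-to-word conversion described above is a single multiplication. The sketch below covers only that step, with a hypothetical function name; it does not attempt to parse the AudioT container itself.

```python
def bytes_to_pit_words(sound: bytes) -> list:
    """Scale later-format byte values (0-255) to PIT Channel 2 word values."""
    # The largest result is 255 * 60 = 15300, which still fits in a 16-bit word.
    return [value * 60 for value in sound]
```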


Duke Nukem seems to ignore the sound priority stored in the files and uses hard-coded priorities instead. This might also apply to the sound offsets, flags and the total number of sounds in the file.

This format has been reverse engineered many times, most often in the Commander Keen 1-3 fan community, and probably first by Anders Gavare.