Audio Files

This page covers the audio files found within the Alarmo's filesystem.

File Formats

The Alarmo uses two audio file types:

  • SHAA (Shiro Audio Asset), the primary audio file type used by the Alarmo.

  • SHSA (Shiro Sound Asset), which is largely the same as a .shaa file.

The main difference between SHAA and SHSA files lies in how they're used.

  • If a file exists only with a .shaa extension (there is no matching .shsa file), then all of the audio data will be contained within that file.

  • If there are 2 files with the same base filename but with one ending in .shaa and one ending in .shsa, then all of the audio data will be contained within the .shsa file, while the .shaa file will only contain a copy of the header.
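
The pairing rule above can be sketched as a small lookup. This is only an illustration: the function names and the example paths are hypothetical, the extensions are assumed to be lowercase, and the existence check is a plain fopen() on a host filesystem rather than anything the Alarmo firmware is known to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the ".shsa" sibling of a ".shaa" path into out.
 * Returns false if the path does not end in ".shaa" or out is too small. */
bool shiro_sibling_shsa_path(const char *shaa_path, char *out, size_t cap)
{
    assert(shaa_path != NULL && out != NULL);
    size_t n = strlen(shaa_path);
    if (n < 5 || strcmp(shaa_path + n - 5, ".shaa") != 0 || n + 1 > cap)
        return false;
    memcpy(out, shaa_path, n + 1);      /* copy including the terminator */
    memcpy(out + n - 5, ".shsa", 5);    /* swap the extension in place */
    return true;
}

/* Hypothetical helper: writes the path of the file expected to hold the
 * audio data. If a .shsa sibling can be opened, the data is taken to live
 * there; otherwise the .shaa file itself is assumed to hold everything. */
bool shiro_locate_audio_data(const char *shaa_path, char *out, size_t cap)
{
    if (!shiro_sibling_shsa_path(shaa_path, out, cap))
        return false;
    FILE *f = fopen(out, "rb");
    if (f != NULL) {
        fclose(f);
        return true;                    /* out already names the .shsa file */
    }
    /* Same length as the sibling path, so it is known to fit in out. */
    memcpy(out, shaa_path, strlen(shaa_path) + 1);
    return true;
}
```

A filesystem probe is only one way to make this decision; the type field in the Info section described later on this page encodes the same distinction.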

File Header (ShiroADPCM)

SHAA and SHSA files share a common header format, which resembles a DSPADPCM header with a few additions.

The header consists of three parts: Info, Name, and DSP.

Info

This section defines the data for the track itself, such as the sample rate, number of channels, and other information.

struct ShiroADPCM_Info
{
    // General data
    uint8_t     magic[4];           // "SHAA"
    uint32_t    codec;              // 2=DSPADPCM
    uint32_t    header_size;        // Total size of the header in bytes
    uint32_t    res_C;              // Reserved

    // Flags
    uint8_t     format;             // 0=PCM8, 1=PCM16, 2=ADPCM
    uint8_t     type;               // 1=SHAA, 2=SHAA+SHSA
    uint8_t     channel_count;      // Number of channels
    uint8_t     res_13;             // Reserved

    // Track information
    uint32_t    sample_rate;        // Sample rate in Hz
    uint32_t    length;             // Length of the audio track in samples
    uint32_t    dsp_offset;         // Offset of the DSP section in bytes
    uint32_t    unk_20;             // Unknown (60 for most, 00 for test sound assets)
    uint32_t    loop_start;         // Loop start offset in samples (0 if no loop)
    uint32_t    loop_end;           // Loop end offset in samples (0 if no loop)
};
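
As a rough sketch, the Info section can be read field by field from a byte buffer instead of casting the buffer to the struct. The offsets follow from the field sizes above (0x2C bytes in total). Little-endian byte order is an assumption here, not something this page states, and shiro_info_t, rd_u32le, and shiro_parse_info are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHIRO_INFO_SIZE 0x2C /* sum of the ShiroADPCM_Info field sizes */

/* Hypothetical decoded view of the Info section (reserved fields omitted). */
typedef struct {
    uint32_t codec, header_size;
    uint8_t  format, type, channel_count;
    uint32_t sample_rate, length, dsp_offset, unk_20, loop_start, loop_end;
} shiro_info_t;

/* Assumption: multi-byte fields are stored little-endian. */
static uint32_t rd_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns false if the buffer is too short or the magic is not "SHAA". */
bool shiro_parse_info(const uint8_t *buf, size_t len, shiro_info_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len < SHIRO_INFO_SIZE || memcmp(buf, "SHAA", 4) != 0)
        return false;

    out->codec         = rd_u32le(buf + 0x04);
    out->header_size   = rd_u32le(buf + 0x08);
    out->format        = buf[0x10];
    out->type          = buf[0x11];
    out->channel_count = buf[0x12];
    out->sample_rate   = rd_u32le(buf + 0x14);
    out->length        = rd_u32le(buf + 0x18);
    out->dsp_offset    = rd_u32le(buf + 0x1C);
    out->unk_20        = rd_u32le(buf + 0x20);
    out->loop_start    = rd_u32le(buf + 0x24);
    out->loop_end      = rd_u32le(buf + 0x28);
    return true;
}
```

Reading bytes explicitly keeps the sketch independent of compiler struct padding and of the host's byte order.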

Name

This section holds an optional name for the track. It is a variable-sized structure whose size depends on the length of the string.

If the length field is 0, this section is only 4 bytes long (just the length field itself).

struct ShiroADPCM_Name
{
    uint32_t    length;             // Length of the audio name
    char        name[];             // Name of the audio track
};
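
A minimal sketch of reading this section from a byte buffer follows. It assumes a little-endian length field counted in bytes and a name stored without a guaranteed terminator; shiro_read_name is a hypothetical name. Whether the section is padded afterwards is not covered here, so the DSP section is better located through dsp_offset in the Info section than by adding up sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the track name into out (always terminated,
 * truncated if out is too small) and reports the unpadded section size.
 * Returns false if the section does not fit within the avail bytes given. */
bool shiro_read_name(const uint8_t *sec, size_t avail,
                     char *out, size_t out_cap, size_t *section_size)
{
    assert(sec != NULL && out != NULL && out_cap > 0 && section_size != NULL);
    if (avail < 4)
        return false;

    /* Assumption: the length field is little-endian and counts bytes. */
    uint32_t length = (uint32_t)sec[0] | ((uint32_t)sec[1] << 8) |
                      ((uint32_t)sec[2] << 16) | ((uint32_t)sec[3] << 24);
    if (length > avail - 4)
        return false;                   /* name would run past the buffer */

    size_t n = (length < out_cap - 1) ? (size_t)length : out_cap - 1;
    memcpy(out, sec + 4, n);
    out[n] = '\0';
    *section_size = 4 + (size_t)length; /* 4 when the name is empty */
    return true;
}
```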

DSP

This section contains the information for the DSPADPCM decoder.

struct ShiroADPCM_DSP
{
    // Decoder header
    uint32_t    num_samples;        // Number of samples
    uint32_t    num_adpcm_nibbles;  // Number of ADPCM nibbles
    uint32_t    sample_rate;        // Sample rate in Hz

    // Decoder context and addresses
    uint16_t    loop_flag;          // Whether the sample is looped (1=true, 0=false)
    uint16_t    format;             // Sample data format (always 0 for ADPCM)
    uint32_t    sa;                 // ADPCM loop start address (always 2)
    uint32_t    ea;                 // ADPCM loop end address
    uint32_t    ca;                 // Initial offset value (always 2)
    uint16_t    coef[16];           // Decode coefficients (8 pairs of 16-bit words)

    // Initial decoder state
    uint16_t    gain;               // Always zero for ADPCM
    uint16_t    ps;                 // Predictor/Scale
    uint16_t    yn1;                // Sample history (n-1)
    uint16_t    yn2;                // Sample history (n-2)

    // Loop context
    uint16_t    lps;                // Predictor/Scale for loop context
    uint16_t    lyn1;               // Sample history (n-1) for loop context
    uint16_t    lyn2;               // Sample history (n-2) for loop context
    uint16_t    pad[11];            // Reserved
};
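
To show how coef, ps, yn1, and yn2 fit together, here is a sketch of decoding a single frame with the widely documented GameCube-style DSP-ADPCM scheme: 8 bytes per frame, one predictor/scale byte followed by 14 four-bit samples. That the Alarmo's sample data follows this exact scheme is an assumption. The sketch also treats the coefficients and history values as signed 16-bit, although they are declared uint16_t above. dsp_decode_frame and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DSP_FRAME_BYTES   8   /* 1 predictor/scale byte + 7 data bytes */
#define DSP_FRAME_SAMPLES 14  /* two 4-bit samples per data byte */

/* Floor division by 2048, written out to avoid right-shifting negatives. */
static int64_t floor_div_2048(int64_t v)
{
    return (v >= 0) ? v / 2048 : -((-v + 2047) / 2048);
}

static int16_t clamp_s16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Decodes one frame into 14 PCM16 samples. yn1/yn2 carry the sample history
 * between frames; they start from the yn1/yn2 values in the DSP section. */
void dsp_decode_frame(const uint8_t frame[DSP_FRAME_BYTES],
                      const int16_t coef[16],
                      int16_t *yn1, int16_t *yn2,
                      int16_t out[DSP_FRAME_SAMPLES])
{
    assert(frame != NULL && coef != NULL && yn1 != NULL && yn2 != NULL);
    assert(out != NULL);

    /* High nibble selects one of the 8 coefficient pairs (masked so the
     * index stays in range); low nibble is the scale exponent. */
    int predictor = (frame[0] >> 4) & 0x7;
    int64_t scale = (int64_t)1 << (frame[0] & 0xF);
    int64_t c1 = coef[predictor * 2];
    int64_t c2 = coef[predictor * 2 + 1];

    for (int i = 0; i < DSP_FRAME_SAMPLES; i++) {
        uint8_t byte = frame[1 + i / 2];
        int nibble = (i % 2 == 0) ? (byte >> 4) : (byte & 0xF);
        if (nibble >= 8)
            nibble -= 16;               /* sign-extend the 4-bit value */

        /* Coefficients are fixed point with 11 fractional bits. */
        int64_t acc = nibble * scale * 2048 + 1024
                    + c1 * *yn1 + c2 * *yn2;
        int16_t sample = clamp_s16(floor_div_2048(acc));

        *yn2 = *yn1;
        *yn1 = sample;
        out[i] = sample;
    }
}
```

In this scheme each frame spans 16 nibbles, 2 for the predictor/scale byte and 14 for samples, so the first sample of a stream sits at nibble address 2. That is consistent with sa and ca always being 2.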

(More Coming Soon)

Work in progress

This page is currently under development.

Feel free to follow @KernelEquinox to get notified about site and documentation updates as they happen.