Okay, despite my promises of regular updates I’ve fallen a little behind, and you can thank a short holiday, music festivals, my Real Job™, and many many videogames for that.
Welcome to the first article in the Sound series! Being a full-time audio engine programmer, game audio is a subject very close to me, so this should get quite tasty. One of the most nostalgic and fondly remembered parts of my childhood were the soundtracks accompanying my favourite games, many of which I still find technically impressive, especially considering the machine’s audio capabilities didn’t really weigh up to those of its competing consoles, requiring the sound designers to squeeze so much out of so little.
There are three chips which can create, or are involved in the creation of sound in the Megadrive:
The PSG chip
The Programmable Sound Generator, the same Texas Instruments SN76489 chip used in the Sega Master System. It’s a simple device, comprising 4 mono channels – 3 channels generate a square wave at varying frequencies, and the 4th is a white noise generator. Each channel can have its attenuation modified, and the frequency of the wave or density of the white noise.
It’s got a good trick up its sleeve – you can set the wave generators to stick at +ve, then switch the attenuation registers high and low very quickly to make it behave like a 1-bit DAC, meaning it’s capable of playing PCM data. It won’t win any sound quality awards, and with the available memory I doubt it could stream entire soundtracks, but it can pull of some short voice overs and sound effects which can’t be synthesised.
The FM chip
The Yamaha YM2612 – a stereo, 6 channel sound chip with lots to brag about. 6 FM voices with 4 operators each, Yamaha synthesizer patch compatibility, envelopes, a distortion oscillator, two built-in timers, and a DAC for playing 8-bit PCM data.
Each ‘operator’ of a channel has an input and output (from and to other operators, or the final output), a frequency and an envelope. These operators can be routed in various formations to create algorithms suited for different instrument types (harp, strings, flute, brass, xylophone, piano, organ, guitar and various percussion). It’s quite a complex beast and requires some serious effort to get anything other than basic beeps out of it, but I’ve found plenty of documentation and examples so hopefully I can get it singing.
The Z80 chip
The 8-bit Zilog Z80, which was the Sega Master System’s main CPU, made its way to the Genesis both for backwards compatibility and as a slave CPU to the 68000. Neither of the sound chips can be programmed and left to get on with it, they need their hands holding for the whole session, having data fed to them constantly. This would be quite a pain to manage on the 68k, and it would be impossible to keep up a feed of PCM data whilst trying to process an entire game at the same time, so the Z80 can be used as a secondary CPU for managing the audio. Unfortunately, it doesn’t have any knowledge of the sound chips or how to operate them by default – we need to write our own sound driver (a sequencer, and a program to feed PCM data) in Z80 assembler, generate a binary and load it on startup. This means learning another assembly language, but I’m certainly up for the challenge.
Programming the PSG
I won’t jump straight in the deep end and write a Z80 audio driver just yet, I need to get a handle on the basics. This article covers the simplest case – operating the PSG from the 68000.
The PSG has 4 channels – 3 square wave generators and 1 white noise generator. I’ll concentrate on just the wave generators to begin with. Each channel is controlled by 2 registers: one for the attenuation, and one for the wave generator’s counter reset. The attenuation registers are 4 bits in size, allowing 16 possible volume values, from 0x0 (no attenuation; full volume) to 0xF (full attenuation; no volume). The counter reset registers are 10 bits in size, and store the square wave’s time until the polarity of the output is flipped in clock ticks / 16 (essentially the “wave width / 2”).
So, assuming an NTSC setup with a clock frequency of 3579545 Hz, a register value of 0xFE would generate a square wave at a frequency of 440.4 Hz (3579545 ÷ (2 x 16 x reg value)).
To program the PSG, we write to its control port at 0x00C00011, one byte at a time. The most significant bit is the latch, telling the chip that this is the first or only byte it’s expected to receive. Bits 6-5 mark the channel ID (0 – 3) that we’re about to modify. Bit 4 indicates that we’re writing either the attenuation value or wave/noise settings. That leaves 4 bits for the data, which may or may not be enough. If there wasn’t enough space, we send a second byte with the latch bit OFF, indicating it contains the remaining data and is not a new command, with the upper 6 bits of data in bits 5-0.
So, a quick recap:
Bit 7 : Latch. ON indicates this is the first (or only) byte being written
Bits 6-5 : Channel ID (0-3)
Bit 4 : Data type. ON if data bits contain attenuation value, OFF if they contain the square wave counter reset
Bits 3-0 : The data. Either all 4 bits of the attenuation value, or the lower 4 bits of counter reset value
Bit 7 : Latch. OFF indicates this is the second byte, and will only contain the remainder of data
Bit 6 : Unused
Bits 5-0 : Upper 6 bits of data
So, let’s make channel 0 produce 440.4hz. The first byte needs the latch ON (to indicate it’s the first byte), the channel ID, the data type bit OFF (to indicate we’re writing a counter reset value), and the lower 4 bits of the value 254. The second byte needs the latch OFF (it’s the second byte), and the upper 6 bits of the value 254.
They’re written to the PSG control port at address 0x00C00011:
move.b #%10001110, 0x00C00011 ; Latch ON, channel 0, counter data type, lower 4 bits of data
move.b #%00001111, 0x00C00011 ; Latch OFF, upper 6 bits of data
The channel will have been initialised fully attenuated, so we need to turn the volume up to hear anything. The attenuation is specified in 4 bits, where 0 is fully attenuated and 16 is full volume, so we can fit the command and data in a single byte. Latch needs to be ON, channel ID is 0, data type bit ON to indicate attenuation data, followed by the 4-bit value:
move.b #%10010000, 0x00C00011 ; Latch OFF, channel 0, attenuation data type, 4 bits of data
Immediately after writing, the PSG will emit a constant tone of 440.4hz, at full volume.
A poor man’s sequencer
Even with just one feature of the PSG covered, it’s enough to make a basic tune. By defining an array of counter reset values, with sustain times, we can iterate over them and play the notes. Here’s some example data – it’s one crudely written line from Scott Joplin’s The Entertainer. Each “note” is 32 bits in size, the first word is the sustain time in vsync frames, and the second word is the counter reset value:
dc.w 0x0010, 0x02f8, 0x0010, 0x02cd, 0x0010, 0x02a5, 0x0010, 0x01aa, 0x0008, 0x0000 ; D3 D#3 E3 C4 .
dc.w 0x0010, 0x02a5, 0x0010, 0x01aa, 0x0008, 0x0000 ; E3 C4 .
dc.w 0x0010, 0x02a5, 0x0010, 0x01aa, 0x0010, 0x0000 ; E3 C4 .
dc.w 0x0010, 0x01aa, 0x0010, 0x0193, 0x0010, 0x017c, 0x0010, 0x0152, 0x0010, 0x01aa, 0x0010, 0x017c, 0x0010, 0x0152, 0x0008, 0x0000 ; C4 C#4 D4 E4 C4 D4 E4 .
dc.w 0x0010, 0x01c4, 0x0010, 0x017c, 0x0008, 0x0000 ; B3 D4 .
dc.w 0x0010, 0x01aa ; C4
chan0_notes_len equ chan0_notes_end-chan0_notes
chan0_notes_count equ chan0_notes_len/4
In order to play it, we loop over the notes, apply the counter reset value to channel 0, and wait for the defined amount of frames. It’s pretty simple:
move.b #%10010000, psg_control ; Channel 0 full volume
lea chan0_notes, a0 ; Notes to a0
move.l #chan0_notes_count, d1 ; Number of notes to d1
subi.l #0x1, d1 ; -1 for counter
move.w (a0)+, d0 ; Delay to d0 and inc. pointer
move.w (a0)+, d2 ; Counter reset to d2 and inc. pointer
move.b d2, d3 ; Lower byte of counter reset to d3
and.b #%00001111, d3 ; Clear top nybble (leave lower 4 bits)
or.b #%10000000, d3 ; Latch bit (7) on, chan 0 (6-5), tone data bit (4) off
move.b d3, psg_control ; Write to PSG port
move.w d2, d3 ; Counter reset to d3 again
ror.w #0x4, d3 ; Shift right 4 bits
and.b #%00111111, d3 ; Only need bits 5-0 (upper 6 bits of the 10 bit value)
move.b d3, psg_control ; Write to PSG port
move.l d1, -(sp) ; Backup d1
jsr WaitFrames ; Delay frames is in d0
move.l (sp)+, d1 ; Restore d1
dbra d1, @NextNote ; Branch back up to play the next note
move.b #%10011111, psg_control ; Finished - silence channel 0
It’s a good start, but it will need many improvements. At the very least, it should support all three square wave channels, and a higher frequency timer. Also, each note should come with a “note on” time, so we don’t need to create “empty” notes for silence, and instead have arbitrary start times. To make it more advanced, we could implement software ADSR envelopes to bring some life to each note with a softer attack and a fade out.
I’ll be building on my sequencer with each new sound feature I learn, and I have plans to write an authoring tool to compose music on a PC, instead of looking up the frequency of each note, converting it to a counter reset value and manually typing it into a text editor! The sequencer will eventually need translating to Z80 assembly language, too.
Well, this was quite a short article, I’ll try and pick up the pace again soon. I’ve missed out the white noise generator, but to really show it off I need to add some finely tuned support to my sequencer for it, so perhaps it deserves – along with many sequencer improvements – a post of its own.
asm68k.exe /p soundtest.asm,soundtest.bin