Sega Megadrive – 3: Awaking the Beast

This bit was difficult. When the Megadrive is turned on, you get a blank slate. Nothing is initialised for you – the RAM is full of garbage, the controller ports are dead, and the VDP is cold, alone and scared – you have to restore some sanity and set each piece up one by one. What makes it even more difficult, is that you get no visual feedback that it’s been done correctly until you’ve set up enough things to start displaying something on screen – and that takes a LOT of code.

I’ve found various tutorials and code samples showing how to initialise the Megadrive, to the point where we can begin doing some VDP work and get a few pixels showing. Unfortunately they were a little complex for me, I lost some hair trying to get it to work with my chosen assembler, a lot of things were left unexplained, and I’ve had to do some research to fill in the gaps. Now that I know how each step works I’ve since rewritten the code, breaking things down into smaller steps and commenting every line. Here’s each step explained:

1. Checking the Reset Button

The first thing to figure out is if we need to do anything at all. If the player pressed the reset button, then everything will already have been setup and we can just jump straight to the action again. From all the sample code I’ve seen, two separate reset indicators are checked – one is the physical button on the console, but I can’t find any information about the other one. Perhaps it has something to do with the expansion port, so that future addon hardware (the MegaCD or the 32X) can trigger a software reset. Anyway, here’s how to check:

EntryPoint:          ; Entry point address set in ROM header
   tst.w 0x00A10008  ; Test mystery reset (expansion port reset?)
   bne Main          ; Branch if Not Equal (to zero) - to Main
   tst.w 0x00A1000C  ; Test reset button
   bne Main          ; Branch if Not Equal (to zero) - to Main

If the results of the test are non-zero, then a soft reset has occurred and we can branch straight to Main, skipping all of this initialisation.

We test two addresses – they’re not addresses in main memory, but mapped to some specific hardware ports. Addresses starting from 0x00A00000 are not those of main RAM, but are the system I/O areas, which point to various ports or the memory of other coprocessors within the Megadrive. Most of the system I/O addresses can be found in a technical manual straight from Sega themselves, which can be found in the references at the bottom of this post.

2. Clearing the RAM

When the system is powered up, the RAM could be in any old state. Most good emulators clear it when loading a ROM, but this isn’t going to be of much help when I finally get hold of some development hardware and start scratching my head at the garbled mess on screen. We know the Megadrive’s RAM is 64kb in size, and technically we know where its address mappings begin and end since we’ve defined that in the ROM header, but it seems to be common practise to rely on the machine’s ability to wrap around the end of the physical addresses back to the beginning, and clear it from 0x00000000 backwards.

If we put 0x00000000 into an address register, and then use pre-decrement when writing a zero to that address, we’ll wrap around to the end of memory and clear the last byte:

move.l #0x00000000, d0     ; Place a 0 into d0, ready to copy to each longword of RAM
move.l #0x00000000, a0     ; Starting from address 0x0, clearing backwards
move.l #0x00003FFF, d1     ; Clearing 64k's worth of longwords (minus 1, for the loop to be correct)
@Clear:
move.l d0, -(a0)           ; Decrement the address by 1 longword, before moving the zero from d0 to it
dbra d1, @Clear            ; Decrement d0, repeat until depleted

I’ve purposely written a whole longword to the d1 register, where just a word-sized MOVE would suffice for the byte count 0x3FFF. This is because I have no idea if the registers will have been cleared or not when the system was powered on. Better safe than crashy.

3. Writing the TMSS

The Trade Mark Security Signature – or TMSS – was a feature put in by Sega to combat unlicensed developers from releasing games for their system, which is a kind of killswitch for the VDP. It’s the pinnacle of security systems, a very sophisticated encryption key which is almost uncrackable. You write the string “SEGA” to 0x00A14000.

This was only implemented in the second hardware version of the Megadrive, so we need to test the system’s version number at mapped I/O address 0x00A10001 before proceeding. This points to a byte of read-only memory, possibly on another chip, which stores the version ID (bits 0-3), CPU clock/region (bit 6 on = 7.60mhz PAL, off = 7.67mhz NTSC), and domestic/overseas model (bit 7). We only need to test the bottom four bits (one nybble):

   move.b 0x00A10001, d0      ; Move Megadrive hardware version to d0
   andi.b #0x0F, d0           ; The version is stored in last four bits, so mask it with 0F
   beq @Skip                  ; If version is equal to 0, skip TMSS signature
   move.l #'SEGA', 0x00A14000 ; Move the string "SEGA" to 0xA14000
   @Skip:

I’m unsure at what point the signature is checked and VDP killswitch activated, whether it’s by time or the first VDP command is sent. Either way, the VDP is now safe. There’s also a new opcode there – ANDI (immediate logic AND), which ANDs two values, storing the result in d0.

4. Initialising the Z80

Next, we can begin initialising each of the Megadrive’s coprocessors, starting with the Zilog Z80. The Z80 is the same 8-bit chip used in the Sega Master System, and in the Megadrive it acts as both a controller for the PSG and FM sound chips, and a backwards compatibility processor for playing Master System games (with an appropriate adapter for the cartridge). The Z80 has its own set of registers, and various command and data ports for sending it instructions and information, as do the other coprocessors. It also has 8kb of RAM to itself. To send it commands, or some data, we can simply MOVE values to mapped I/O addresses.

The Z80 needs a few things doing – first, we need to request access to its bus, so that it can listen to us. We request – or release – control of the bus by writing 0x0100 or 0 to its BUSREQ port, and then wait in a loop until we have control, by reading this same port. We also need to stop it running by holding it in a reset state – again by writing a 1 to one of its ports. Whilst we’re holding it in this state, we can freely write a program to its RAM. Finally, we release control of the bus and let go of the reset state, and it can then be left alone to act on the data.

   move.w #0x0100, 0x00A11100 ; Request access to the Z80 bus, by writing 0x0100 into the BUSREQ port
   move.w #0x0100, 0x00A11200 ; Hold the Z80 in a reset state, by writing 0x0100 into the RESET port

   @Wait:
   btst #0x0, 0x00A11100   ; Test bit 0 of A11100 to see if the 68k has access to the Z80 bus yet
   bne @Wait               ; If we don't yet have control, branch back up to Wait

Here’s a new opcode, BTST (bit test). It does the same as TST, but only compares the least significant bits.

Now the 68000 has access to the Z80’s bus, and the chip is held in a reset state, so we can write the program data to its memory. This is mapped from 0xA000000.

   move.l #Z80Data, a0      ; Load address of data into a0
   move.l #0x00A00000, a1   ; Copy Z80 RAM address to a1
   move.l #0x29, d0         ; 42 bytes of init data (minus 1 for counter)
   @Copy:
   move.b (a0)+, (a1)+      ; Copy data, and increment the source/dest addresses
   dbra d0, @Copy

   move.w #0x0000, 0x00A11200 ; Release reset state
   move.w #0x0000, 0x00A11100 ; Release control of bus

Now the chip starts running again, and begins executing the program written to its memory. I keep glossing over this ‘program’ since I don’t yet have any clue as to what it does! I’ll get some documentation and dissect it bit by bit once I start doing some audio work.

Z80Data:
   dc.w 0xaf01, 0xd91f
   dc.w 0x1127, 0x0021
   dc.w 0x2600, 0xf977
   dc.w 0xedb0, 0xdde1
   dc.w 0xfde1, 0xed47
   dc.w 0xed4f, 0xd1e1
   dc.w 0xf108, 0xd9c1
   dc.w 0xd1e1, 0xf1f9
   dc.w 0xf3ed, 0x5636
   dc.w 0xe9e9, 0x8104
   dc.w 0x8f01

5. Initialising the PSG

This one is the Programmable Sound Generator. It can generate square waves and white noise for procedurally creating sounds. As with the Z80 program, I have no idea what the sample data does yet, I’ll look into it at a later date. Copying data to the PSG is a lot simpler than the Z80, since we can just write the data straight to its RAM through an I/O address without requesting bus access:

   move.l #PSGData, a0      ; Load address of PSG data into a0
   move.l #0x03, d0         ; 4 bytes of data
   @Copy:
   move.b (a0)+, 0x00C00011 ; Copy data to PSG RAM
   dbra d0, @Copy

PSGData:
   dc.w 0x9fbf, 0xdfff

6. Initialising the VDP

The VDP – or Visual Display Processor – is the most complex of the coprocessors. It’s a dedicated graphics chip for displaying sprites and patterns, and warrants its own chapter, which I’ll write up in the next post – getting something on screen.

The VDP has its own set of registers (24 of them), as well as 64kb of dedicated RAM. Communication with the VDP is via two ports – the control port and the data port, which are I/O addresses mapped to 0x00C00004 and 0x00C00000 respectively. The control port is used for setting registers, and supplying a VDP RAM address ready to send data through the data port. The VDP can only send and receive data in bytes or words, but we can make use of a feature which automatically increments the destination address for us, and it will treat a longword write as two separate word writes. More about this feature in the next post.

Each of the VDP’s registers are used to set its various graphics modes, plane addresses and scrolling settings, amongst other things. We initialise the VDP by setting all of these registers, using a word-size command sent to the control port:

  • The top nybble is the command – 0x8XXX means set register value
  • The next nybble is the register number – so 0x80XX = set register 0, 0x81XX = set register 1, etc
  • The bottom byte is the data – so 0x82FF writes FF into register 2

To make things easier, we just keep one big table of all of the VDP’s register values, and copy the whole lot in one go:

 move.l #VDPRegisters, a0 ; Load address of register table into a0
 move.l #0x18, d0         ; 24 registers to write
 move.l #0x00008000, d1   ; 'Set register 0' command (and clear the rest of d1 ready)

@Copy:
 move.b (a0)+, d1         ; Move register value to lower byte of d1
 move.w d1, 0x00C00004    ; Write command and value to VDP control port
 add.w #0x0100, d1        ; Increment register #
 dbra d0, @Copy

Explanations (albeit short explanations) of the VDP registers can be found in chapter 4 of the SEGA2 doc (I’ve added a link to an HTML version in the references). Below is the minimum of things enabled to get started, but these registers will be revisited quite often as I work with more graphics features.

VDPRegisters:
   dc.b 0x20 ; 0: Horiz. interrupt on, plus bit 2 (unknown, but docs say it needs to be on)
   dc.b 0x74 ; 1: Vert. interrupt on, display on, DMA on, V28 mode (28 cells vertically), + bit 2
   dc.b 0x30 ; 2: Pattern table for Scroll Plane A at 0xC000 (bits 3-5)
   dc.b 0x40 ; 3: Pattern table for Window Plane at 0x10000 (bits 1-5)
   dc.b 0x05 ; 4: Pattern table for Scroll Plane B at 0xA000 (bits 0-2)
   dc.b 0x70 ; 5: Sprite table at 0xE000 (bits 0-6)
   dc.b 0x00 ; 6: Unused
   dc.b 0x00 ; 7: Background colour - bits 0-3 = colour, bits 4-5 = palette
   dc.b 0x00 ; 8: Unused
   dc.b 0x00 ; 9: Unused
   dc.b 0x00 ; 10: Frequency of Horiz. interrupt in Rasters (number of lines travelled by the beam)
   dc.b 0x08 ; 11: External interrupts on, V/H scrolling on
   dc.b 0x81 ; 12: Shadows and highlights off, interlace off, H40 mode (40 cells horizontally)
   dc.b 0x34 ; 13: Horiz. scroll table at 0xD000 (bits 0-5)
   dc.b 0x00 ; 14: Unused
   dc.b 0x00 ; 15: Autoincrement off
   dc.b 0x01 ; 16: Vert. scroll 32, Horiz. scroll 64
   dc.b 0x00 ; 17: Window Plane X pos 0 left (pos in bits 0-4, left/right in bit 7)
   dc.b 0x00 ; 18: Window Plane Y pos 0 up (pos in bits 0-4, up/down in bit 7)
   dc.b 0x00 ; 19: DMA length lo byte
   dc.b 0x00 ; 20: DMA length hi byte
   dc.b 0x00 ; 21: DMA source address lo byte
   dc.b 0x00 ; 22: DMA source address mid byte
   dc.b 0x00 ; 23: DMA source address hi byte, memory-to-VRAM mode (bits 6-7)

7. Initialising the Controller Ports

The controller ports are generic 9-pin I/O ports, and are not particularly tailored to any device. They have five mapped I/O address each – CTRL, DATATX, RX and S-CTRL:

  • CTRL controls the I/O direction and enables/disables interrupts generated by the port
  • DATA is used to send/receive data to or from the port (in bytes or words) when the port is in parallel mode
  • TX and RX are used to send/receive data in serial mode
  • S-CTRL is used to get/set the port’s current status, baud rate and serial/parallel mode.

The SEGA2 doc mentions three controller ports – Controller 1, Controller 2, and EXP. I’m guessing EXP is the 9-pin expansion port on the back of the version 1 Genesis, perhaps intended for basic non-joypad peripherals that didn’t require the full expansion port on the bottom of the unit.

   ; Set IN I/O direction, interrupts off, on all ports
   move.b #0x00, 0x000A10009 ; Controller port 1 CTRL
   move.b #0x00, 0x000A1000B ; Controller port 2 CTRL
   move.b #0x00, 0x000A1000D ; EXP port CTRL

8. Clearing the Registers and Tidying Up

Now everything should be initialised ready for some real work, but it would be best if the actual game code could start with a clean slate. Some rubbish is still in the registers, so let’s clear it:

   move.l #0x00000000, a0    ; Move 0x0 to a0
   movem.l (a0), d0-d7/a1-a7 ; Multiple move 0 to all registers

Here’s a very useful opcode – MOVEM (move multiple). It can move data to/from a list of registers or register ranges, for example d0,d3,d5 or a3-a5. A common use for it would be to backup/restore all of the registers to/from the stack, in a single instruction.

Next, the status register. The only thing I currently understand about the status register is that certain opcodes can leave the results of an operation in it, like a return value in C/C++. After some reading, it turns out that it can also store the stack pointer register used for interrupts (so that the JMP to an interrupt routine doesn’t trample over the real stack), enable or disable interrupts, and to enable or disable tracing (calls a routine after every opcode, useful for storing callstacks for an exception handler).

   ; Init status register (no trace, A7 is Interrupt Stack Pointer, no interrupts, clear condition code bits)
   move #0x2700, sr

And that’s it! The system is initialised, albeit in a very minimal state, ready to do some work. I’ll come back and amend the init code later if I need more functionality out of the machine. Now to jump to the main game code, which I’ve labelled as __main in a separate ASM file. I’ve also labelled the JMP itself as Main, so that we branch here if the reset button has been pressed and the initialisation is skipped:

Main:
   jmp __main ; Jump to the game code!

Matt.

Source

References

Sega Megadrive – 2: So, assembly language, then…

I’ve been toying with the idea of learning an assembly language for some considerable time. I tried – and failed – to get to grips with 68k ASM on the Atari STe, but that was mostly not being able to figure out how to get the DevPac IDE to stop crashing. Perhaps I had a bad disk, or not enough RAM in my STe (I think it was the measly 512k model). I’ve since given 68k a second shot, on the Megadrive, and this stuff is finally beginning to sink in. This post shows the things I’ve learned so far, some of the troubles I ran into, and some of things I still find confusing.

I’ve already got 10 or so years (three of those professionally) of C and C++ programming under my belt, so I’ve had a good head start, and I’m hopeful that this won’t to be too tricky to learn. I’m already familiar with some of the more advanced concepts of programming, such as working with raw bytes, bitwise operations, address alignment, and the best types of coffee to buy to make coding sessions more productive. So, here goes…

68k Assembly – The Basics

One line of 68k assembly code equals one CPU instruction (called an opcode) plus its parameters, so it’s an almost bare-metal experience working directly with the hardware. It’s one step up from working with machine code directly. Therefore, the programs used to create binaries only assemble the code into CPU instructions, there’s no real compiling involved. Fortunately, that means assembling is really fast, and you know exactly what you’re getting. Unfortunately, that means you have to do all of the hard work yourself, there’s limited language ‘features’ to help out – functions, enums, classes and structs, templates – just forget about them.

The purpose of most opcodes is to perform an operation on one or more bytes of data. This could be to move bytes from one location to another, or perform some arithmetic on them. The CPU is incapable of performing most tasks on the data whilst it is in main RAM, instead it has its own localised storage spaces (physically on the chip) where data is temporarily stored so it can be manipulated. These spaces are called registers, and the 68000 has 16 of them. 8 of them are general purpose registers – this is where the majority of arithmetic work will be done. Each general purpose register is 32 bits in size. The other 8 are address registers, and are only used for storing addresses of main memory for fetching or returning data from it, so they’re basically pointers that are attached to the CPU.

The general purpose registers have names d0 – d7, and the address registers a0 – a7. So, the fourth general purpose register is called d3, and the second address register is called a1. Some registers have aliases for ease of use. For example, a7 is commonly used as the stack pointer, and can also be referred to in code as ‘sp’.

Opcodes can perform operations using data from varying sources – main memory, one or more registers, or an immediate value (an integer, hex value, or binary value). Here’s a few examples of the MOVE opcode, it takes the first parameter, and moves it to the register or address in the second parameter:

 move.l #$10, d0   ; Moves the hex value 0x10 (decimal 16) to register d0
 move.l #%0101, d0 ; Moves the binary value 0101 (decimal 5) to register d0
 move.l #12, d0    ; Moves the decimal value 12 to register d0
 move.l d1, d0     ; Moves the value stored in register d1 to register d0
 move.l 0x8000, d0 ; Moves the value stored at address 0x8000 to register d0
 move.l d0, 0x8000 ; Moves the value stored in register d0 to address 0x8000
 move.l (a0), d0   ; Moves the value stored at the address in a0 to register d0
 move.l d0, (a0)   ; Moves the value stored in register d0 to the address stored in register a0

The first three examples show how to move immediate values to a register, signified by the # symbol before the value. An immediate value can be a hex value (prefixed with either $ or 0x), a binary value (prefixed with %), or a decimal value (no prefix). So to specify the immediate hex value 12, use #$12 or #0x12, to specify the binary value 0011 use #%0011, or for the decimal value 128 use #128. Example 4 shows how to move the contents of a register to another, and examples 5 and 6 show how to move the contents stored at an address in main memory to a register, and vice versa. Examples 7 and 8 show the same thing, but that main memory address is stored in the register a0. The brackets around register (a0) specify that the value at the address stored in a0 is to be moved, not the address itself, similar to the dereference operator in C/C++. Omitting the brackets would just move the address.

Not all opcodes can deal with data from all sources. Some can only operate on data in registers, some may or may not be able to use immediate values, and only select few opcodes can deal with data straight from main memory. A list of all of the 68k’s opcodes, including details of their usage and which source/destination values are permitted, are in the 68k Instruction Set PDF in the references section below.

The .l after the opcode is the size of the operation, in this case moving a longword of data (4 bytes). Opcodes can operate on bytes (.b, 8 bits), words (.w, 2 bytes) or longwords (.l, 4 bytes). Not all opcodes can operate on all data sizes, I’ve been checking the Instruction Set for which sizes are supported.

A few opcodes

I’ve been doing this for three months, and so far I’ve only used about 10 opcodes. It’s impressive how simple low-level computing like this can be, and even more impressive looking at some of the amazing games created with so few building blocks. Here’s a small guide to some of the opcodes I’ve found to be most useful:

ADD, SUB, MUL, DIV

The four basic arithmetic opcodes – add, subtract, multiply and divide. Add does exactly what it says on the tin. It adds the value in the first parameter (immediate value or register contents) to the register in the second parameter, and stores the result in that register. There’s a couple of variants of it – ADDI means add immediate, which only adds an immediate value to the contents of a register, ADDA adds a value to an address (NOT the value stored at the address, just the address itself), ADDQ which can very quickly add small immediate values (1 – 8), and ADDX which I’ve yet to figure out. There’s several variants because some are more expensive than others. I haven’t yet done any real optimisation to my code, but I guess paying attention to these small differences in opcode variants would be a good start when I get round to it. If I only needed to add 4 to a value, ADDQ would be faster than ADDI, for example.


add.l #0x10, d1  ; Adds the value 0x10 to register d1, and stores the result in d1
                 ; - longword size operation, so it uses all of the data in
                 ; the register

add.w d1, d2     ; Adds the contents of d1 to the contents of d2, and stores the
                 ; result in d2 - word size operation, so the top two bytes of both
                 ; registers are not referenced, and remain intact

addq.b #0x5, d3  ; Quickly adds 5 to the value in d3, storing the result in d3
                 ;  - byte size operation, so the upper three bytes are not
                 ; referenced, and stay intact

The last example is of byte size, so if d3 contained 0x000000FF the result would become 0x00000004, and would NOT roll to 0x00000100. It would need to be a word or longword size operation to do that.

MUL, SUB and DIV are used pretty much the same as ADD, and also have several variants. The Instruction Set doc shows each of their nuances and acceptable operation sizes.

CLR

CLR stands for clear. It sets a register (or data at an address), or part of a register depending on the operation size, to zero. It only takes one parameter, and that’s the register or address:

clr.l d0     ; Clears the whole of d0
clr.w (a0)   ; Clears the bottom word (2 bytes) of the data at the address in a0
clr.b d0     ; Clears the bottom byte of d0, leaving the rest intact

JMP

JMP means jump. It moves the program counter (the pointer to the current instruction) to another location, and continues executing. The address can be specified in hex, or more conveniently, using a label:

SomeLabel:
   jmp SomeLabel   ; An infinite loop!

JSR and RTS

JSR means to jump to subroutine. It does the same as JMP, but stores the original address of the program counter (by pushing it to the stack) before jumping, so that it can return later. RTS, meaning return to subroutine, pops the original address from the stack and does the jump back:

   move.l #0x8 d0   ; Do something useful
   jsr Label        ; Jump to Label
   move.l #0x12 d0  ; Will return here when RTS is called

Label:
   move.l #0x04, d0 ; Do something else
   rts              ; Return back

DBRA

This means decrement and branch. It does the same as a jump, but tests to see if a register is zero first. If that register is non-zero, it decrements that register by 1, and then branches. If the register is zero, it doesn’t branch, and just continues to the next line. It’s a common tool used for implementing loops:

   move.b #0x6, d0 ; Looping round 7 iterations (includes the 0th iteration)

Label:
   add.l #0x1, d1  ; Add 1 do register d1
   dbra d0, Label  ; Test to see if d0 is zero yet, and if not decrement it and
                   ; jump back up to Label
   clr.l d1        ; Loop has finished, clear d1

CMP and Bcc

Bcc, meaning branch on condition, is a collection of various branch opcodes which only branch if the condition code of the status register adheres to some condition. The status register seems to be the state of the CPU after an operation, and each opcode leaves its condition code in a different state after execution, as a sort of return value. For example, the CMP opcode (meaning compare) will store the result of subtracting two values into the status register’s condition code. After that, the Bcc variant BEQ (branch if equal to zero) can test the result of that comparison, and branch or not based on it. It’s a common way to implement an IF statement.

Here’s a demonstration of most of the above opcodes, including a CMP and BEQ. It’s a subroutine which counts the number of characters in a null-terminated string, by iterating through each byte and checking if it is 0, whilst keeping count of each iteration:

GetStringLength:
   clr.l d0          ; Clear d0, ready to begin counting

   @FindTerm:
   move.b (a0)+, d1  ; Move byte from address in a0 to d1, and then increment the address by 1 byte
   cmp.b #0x0, d1    ; Test if byte is zero
   beq.b @End        ; If byte was zero, branch to end
   addq.l #0x1, d0   ; Increment counter
   jmp @FindTerm     ; Jump back to FindTerm to loop round again

   @End:
   rts               ; End of search, return back. Result is in r0

Example usage:

   move.l #StringAddr, a0  ; Move address of string to a0
   jsr GetStringLength     ; Jump to the GetStringLength subroutine
                           ; Length of string will now be stored in d0

StringAddr:
   dc.b "HELLO WORLD", 0   ; A zero-terminated string

In the example, I’ve also introduced two new concepts. One is the + symbol after moving a value from (a0). This means post-increment; the address in a0 will be incremented by 1 byte after it has been read, similar to int a = b++ in the C++ language. The second concept is the @ symbol before the label FindTerm. This means the label is local – when referencing the label @FindTerm, it uses the address of the most recently defined @FindTerm label. This means you can have duplicate label names (loop could be a common name, perhaps) without any ambiguity.

That’s it for now. It doesn’t look like much, but I’ve managed to get as far as drawing text and sprites with no other opcodes than the ones listed, so they’re pretty powerful. There’s a few others I’ve touched briefly, like ROL and ROR, which shift bits left or right, but they don’t become useful until dealing with VDP addresses.
Matt.

References

Sega Megadrive – 1: Getting Started

My favourite videogames console of all time – the Sega Megadrive. I’ve been pretty excited about getting started on this machine for many years now, and has been the catalyst which finally kicked me into learning some assembly language.

Now, I’ve jumped the gun a bit, since I was originally planning to work on these platforms in chronological order, which means the Nintendo Entertainment System is going to wait in the queue for a while (don’t be fooled, the Sega Master System II was released AFTER the Megadrive, because Sega are nuts like that). I also don’t yet have any development hardware (I’m currently in negotiations with some sellers, though), so I’ll be starting out with a PC emulator with debugging features. The Sonic disassembly packages over at Sonic Retro contain a modified (fixed for Windows 7) version of SN Systems’ 68000 assembler, which was a low cost alternative to Sega’s tools at the time, and used by many Megadrive developers.

The point is, I’m just too excited to leave this console alone, and if anything will kickstart my motivation for this project with a flying leap it will be the Megadrive.

The Sega Megadrive technical specifications

A quick and naive list of the console’s basics, but it’s all I need to know to get started:

  • CPU: Motorola 68000 at 7.61 mhz
  • Slave CPU: Zilog z80
  • Main memory: 64kb
  • Video: Yamaha YM7101 VDP (Video Display Processor)
  • Video memory: 64kb
  • Audio: Yamaha YM2612 FM chip, Texas Instruments SN76489 chip
  • Game media: Cartridge
  • Programming language: 68k Assembler language
  • Known development hardware: Official Sega Genesis dev unit, Cross Products MegaCD unit

The Tools

As mentioned, I’ve managed to get hold of the SN Systems ASM68K assembler, a command line tool for MS-DOS. Since there was no official IDE or text editor included, nor can I find any clues as to what was commonly used at the time, I’ll be using Microsoft Visual Studio, simply because I’m familiar with its keyboard shortcuts.

Until I can get hold of some hardware, I’ll be making use of a PC emulator which has some debugging features. After some searching around, it seems the MAME emulator MESS does a good job, Gens with the KMod plugin is capable of debugging, and I’ve also had Regen recommended to me on the ASSEMblergames.com forums. I’m inclined to start with MESS since it uses the same debugging shortcut keys as Visual Studio.

Testing the Assembler

Since documentation seems scarce, I’ve used the Sonic the Hedgehog disassemblies from Sonic Retro as a guide. The package contains a batch file used to build the Sonic source, and the assembler bit simply boils down to:

ASM68K.EXE source.asm,destination.bin

Let’s test something out:

Loop:
   move.l #0xF, d0 ; Move 15 into register d0
   move.l d0, d1   ; Move contents of register d0 into d1
   jmp Loop        ; Jump back up to 'Loop'

…and that assembles just fine:

SN 68k version 2.53
Assembly completed.
0 error(s) from 4 lines in 0.1 seconds

I won’t pretend that I just came up with that assembly snippet like it was natural, it’s been a while since I last touched some 68k assembly (on the Atari STe) and it was the result of an hour or so of trawling through documentation and example code to refresh my memory!

The Megadrive ROM header

Unfortunately, it’s not as simple as loading up the generated ROM into an emulator and hitting Debug. A Megadrive ROM needs a header, which contains some meta info about the ROM, and a block of CPU vectors used to initialise the 68000 before the code gets executed. The header takes up 512 bytes at the very top of the ROM, and looks a little something like this:

	; ******************************************************************
	; Sega Megadrive ROM header
	; ******************************************************************
	dc.l   0x00FFE000      ; Initial stack pointer value
	dc.l   EntryPoint      ; Start of program
	dc.l   Exception       ; Bus error
	dc.l   Exception       ; Address error
	dc.l   Exception       ; Illegal instruction
	dc.l   Exception       ; Division by zero
	dc.l   Exception       ; CHK exception
	dc.l   Exception       ; TRAPV exception
	dc.l   Exception       ; Privilege violation
	dc.l   Exception       ; TRACE exception
	dc.l   Exception       ; Line-A emulator
	dc.l   Exception       ; Line-F emulator
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Spurious exception
	dc.l   Exception       ; IRQ level 1
	dc.l   Exception       ; IRQ level 2
	dc.l   Exception       ; IRQ level 3
	dc.l   HBlankInterrupt ; IRQ level 4 (horizontal retrace interrupt)
	dc.l   Exception       ; IRQ level 5
	dc.l   VBlankInterrupt ; IRQ level 6 (vertical retrace interrupt)
	dc.l   Exception       ; IRQ level 7
	dc.l   Exception       ; TRAP #00 exception
	dc.l   Exception       ; TRAP #01 exception
	dc.l   Exception       ; TRAP #02 exception
	dc.l   Exception       ; TRAP #03 exception
	dc.l   Exception       ; TRAP #04 exception
	dc.l   Exception       ; TRAP #05 exception
	dc.l   Exception       ; TRAP #06 exception
	dc.l   Exception       ; TRAP #07 exception
	dc.l   Exception       ; TRAP #08 exception
	dc.l   Exception       ; TRAP #09 exception
	dc.l   Exception       ; TRAP #10 exception
	dc.l   Exception       ; TRAP #11 exception
	dc.l   Exception       ; TRAP #12 exception
	dc.l   Exception       ; TRAP #13 exception
	dc.l   Exception       ; TRAP #14 exception
	dc.l   Exception       ; TRAP #15 exception
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)
	dc.l   Exception       ; Unused (reserved)

	dc.b "SEGA GENESIS    "									; Console name
	dc.b "(C)SEGA 1992.SEP"									; Copyrght holder and release date
	dc.b "YOUR GAME HERE                                  "	; Domestic name
	dc.b "YOUR GAME HERE                                  "	; International name
	dc.b "GM XXXXXXXX-XX"									; Version number
	dc.w 0x0000												; Checksum
	dc.b "J               "									; I/O support
	dc.l 0x00000000											; Start address of ROM
	dc.l __end												; End address of ROM
	dc.l 0x00FF0000											; Start address of RAM
	dc.l 0x00FFFFFF											; End address of RAM
	dc.l 0x00000000											; SRAM enabled
	dc.l 0x00000000											; Unused
	dc.l 0x00000000											; Start address of SRAM
	dc.l 0x00000000											; End address of SRAM
	dc.l 0x00000000											; Unused
	dc.l 0x00000000											; Unused
	dc.b "                                        "			; Notes (unused)
	dc.b "JUE             "									; Country codes

Note that the assembler requires code and data to be tabbed one to the right, I’ll look into why this is necessary at a later date. Labels, however seem happy with no tabs.

The top section is a block of CPU vectors, read in when the system boots, and are used to initialise various registers and interrupt addresses. The first longword is the value of the stack pointer register when the system boots, although the rest of the registers must be initialised manually so I’m confused as to why this one must be explicitly set. The EntryPoint is the address of the first line of code that gets run, and the majority of the rest point to an exception routine to catch errors. Eventually I plan to write a proper exception handler for each type of problem, and print to screen some information which would help me diagnose the issue.

The HBlankInterrupt and VBlankInterrupt are routines that get called when the electron beam in the TV reaches the right hand side of the screen, and when the beam hits the bottom right before switching off and moving back to the top left. I guess modern LCD and plasma  TVs don’t have this concept, but from the examples I’ve seen the timing for these interrupts being called is clock-accurate, so they’re perfect for implementing timers.

The second block is some information about the cartridge, hopefully the comments are self explanatory. The ROM/SRAM start and end addresses make sense to me since a cartridge and its savegame space (if any) can be of variable size, but I’ve yet to discover why the RAM start and end addresses need explicitly defining. The checksum is not read by the boot code itself and nothing is done with it, it’s only there for the programmer to implement a check if they wish.

All of the addresses can just be specified in hex, but the assembler allows for labels which makes things a great deal easier. EntryPoint, Exception, __end, HBlankInterrupt and VBlankInterrupt will need defining:

EntryPoint:
   Loop:
   move.l #0xF, d0 ; Move 15 into register d0
   move.l d0, d1   ; Move contents of register d0 into d1
   jmp Loop        ; Jump back up to 'Loop'

HBlankInterrupt:
VBlankInterrupt:
   rte   ; Return from Exception

Exception:
   rte   ; Return from Exception

__end    ; Very last line, end of ROM address

EntryPoint just loops around the little snippet I used to test out the assembler above. Both H/VBlankInterrupts and the Exception handler do nothing and return for now, I’ll experiment with those later. __end contains no code, it’s just a marker for the address of the last byte of the ROM. I’ve prefixed the label with underscores, simply to indicate that it’s not a subroutine and shouldn’t be called explicitly.
Ok, it should be ready to build and run!

Debugging the ROM

The ROM assembles with the ASM68k.EXE line demonstrated earlier. My chosen emulator, MESS, needs to be configured to enable the debugging features. After running MESS once, a mess.ini file is generated alongside the .exe, which contains a debug flag which can be set to 1. Now the ROM can be run using:

mess64.exe genesis -cart test.bin

MESS fires up, loads the ROM, and displays a debugging window. Unfortunately, I ran into a problem: the disassembly window shows garbage. The opcodes are mostly ‘ori’ and ‘illegal’, and I couldn’t make head or tail of my code:

After some digging around and tearing my hair out, the guys at ASSEMblergames.com pointed out that the first 15 bytes of my ROM didn’t belong there (I’m assuming the assembler added some sort of meta data to the start of the binary, perhaps for the SN debugger), and would need removing before it would work. After deleting those bytes using a hex editor (or assembling with the /p option), the ROM seems to work:

Much better, the opcodes are recognisable now. Time to test it out – MESS uses the same keyboard shortcuts for debugging as Visual Studio:

  • F9 – Set/unset breakpoint
  • F10 – Step over
  • F11 – Step into
  • SHIFT + F11 – Step out

So, after a single step (F10) the program counter moves straight to the address specified as the entry point in the header and executes it, and the value 0xF is moved into register r0. After a second step the contents of r0 (still 0xF) are moved into register r1, and after a third step the program counter is jumped back up to the first line again:

It’s not exactly Crysis, but it demonstrates that everything is in the right place and ready for the next part – initialising the Megadrive.

Matt

References