Assembly: the standard machine

This crazy thing we call the PC has been around for over 30 years now, and it’s quite far from its original roots as IBM’s business-oriented machine. The numbers involved are mind-boggling in their differences. The first PC ran at 4.77 MHz, while today’s can top 4 GHz. (And, thanks to parallelization, cache, and all those other advances we’ve seen, that adds up to far more than a 1000x increase in computing power.) Intel’s 8086 processor could address 1 MB of memory; modern computers have multiple gigabytes inside them. Back then, the primary medium for storage was the floppy disk, and the PC’s disks weighed in at a hefty 360 KB; nowadays, you can find USB flash drives capable of holding more than a million times that.

And people still managed to get things done with machines that are easily outdone by our watches, our refrigerators, our thermostats. A lot of people today can’t even comprehend that, including many—I know quite a few—who used to get by without computers at all! But histories of the PC are everywhere, and we have a specific reason for our trek down memory lane. So let’s take a look at these ancient contraptions from the assembly programmer’s point of view.

The ideal PC

Since we’re using an emulator for our programming, we don’t need to go into a lot of hardware specifics. It doesn’t particularly matter for the purposes of this series that the 8088 had only an 8-bit data bus, or that the 486SX lacked a built-in floating-point unit. (I had one of those on my first PC. It sucked.) From our level, those details are mostly irrelevant. What does matter is, for lack of a better term, the hardware interface.

For this series, I’m assuming an “ideal” PC loosely based on real ones. It’s not an original PC, nor is it an XT, AT, or something later. It’s really PC compatible, but nothing more. It roughly conforms to the specs, though some things might be different. That’s intentional, and it’s for ease of explanation. (It also means I can reuse knowledge from all those old DOS tutorials.)

Our idealized machine will use a 286-level processor running in real mode. That lets us focus on a simplified architecture (no protected mode, no 32-bit stuff) that isn’t too far off from what people were using in the late 80s. It does mean limitations, but we’ll work around them.

We’ll also assume that we have access to VGA graphics. That was not the case for the early 286-based PCs; they had to make do with EGA and its 16 colors, at best. Again, this makes things easier, and you’re free to look up the “real thing” if you wish.

Finally, we’re using FreeDOS in the v86 emulator until further notice. It’s not exactly Microsoft’s old DOS from the days of yore, but it’s close enough for this series. Basically, assume it has the features a DOS is supposed to have.

Now that that’s out of the way, let’s see what we have to work with.

Memory

We effectively have a 286 processor, which we looked at last week. Since we’re ignoring protected mode entirely, that means we have access to a total of 1 MB of memory. (Note that this is a real megabyte, 1,048,576 bytes, not the million bytes that hard drive manufacturers would have you believe.) That memory, however, is divided up into various portions.

  • The first 1 KB is reserved by the CPU as the interrupt vector table. The x86 allows 256 different interrupts, whether caused by hardware or software. This table, then, holds 256 4-byte addresses, each one a pointer to an interrupt handler. For example, the core DOS interrupt, 21h, causes execution to jump to the interrupt routine whose address is stored at 0x21 * 4 = 0x84, or 0000:0084.

  • The next 256 bytes, starting at 0040:0000, are the BIOS data area. (I’ll explain the BIOS in a bit.) Much of this space has special meaning for the BIOS, so it’s a bit like the zero page on most 6502 computers.

  • DOS, the BIOS, and the PC all use bits of the next 256 bytes, starting at 0050:0000.

  • DOS effectively begins loading at 0060:0000, though how much memory it uses from here depends on a wide variety of factors.

Whatever is left after this is available for program use, up to the PC’s RAM limit. Everything else in the address space, starting at a000:0000, is given to the various types of adapters supported by the architecture. Even on later systems with expanded memory capabilities, this conventional memory limit of 640 KB remained. Yes, the original PC was limited to 640 KB of RAM (leading to the famous quote Bill Gates may or may not have uttered), but this wasn’t as stringent as it seems; the first IBM models only had 256 KB, and that was still four times what the 6502 could address.
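To make the interrupt vector table from the list above a little more concrete, here is a minimal sketch that reads the handler address for interrupt 21h straight out of low memory. It assumes NASM syntax and real mode; the choice of registers is arbitrary and only for illustration. Each 4-byte entry stores the handler’s offset in the low word and its segment in the high word.

```nasm
; Sketch: fetch the far pointer stored in IVT entry 21h (NASM syntax assumed).
    xor ax, ax
    mov es, ax              ; ES = 0000h, the segment that holds the IVT
    mov bx, 21h * 4         ; each entry is 4 bytes, so entry 21h sits at offset 0084h
    mov ax, [es:bx]         ; low word of the entry: offset of the handler
    mov dx, [es:bx + 2]     ; high word of the entry: segment of the handler
    ; DX:AX now holds the segment:offset of the int 21h handler
```

In a real program you would more likely ask DOS for this (function 35h of interrupt 21h returns a vector in ES:BX), but poking at the table directly shows there is nothing magic about it.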

The remaining 384 KB of the old PC’s range was for the display, the BIOS, and any expansions that may have been around:

  • A 64K block starting at a000:0000 is the video buffer for color graphics modes, including the VGA-alike our idealized PC uses. (Modern graphics cards use high-speed direct memory accesses, but this area is still there for “fallback” mode.)

  • The next 32K, from b000:0000, is used by monochrome video adapters for text. Some systems that neither had nor supported monochrome used this space as extra memory.

  • Another 32K block, starting with b800:0000 or b000:8000 (they’re the same thing), is a very big deal. It’s where the color text-mode memory lies, so manipulating this area will directly affect the contents of the screen. It’s a linear array of words: the low byte holds the character, the high byte its foreground and background colors. We haven’t seen the last of this space.

  • Starting at c000:0000, things get murkier. Usually, the 32K starting here is where the video BIOS lives. After that is the hard disk BIOS, if there is any. Everything from d000:0000 to f000:0000 was intended for expansion; add-ons were expected to set themselves up at a particular place and define how much memory they were going to use. Of course, conflicts were all too easy.

  • The last big block of memory begins at f000:0000. It’s meant for the BIOS, though the original one only used the last 8K (f000:e000 and up). By the way, the processor starts at f000:fff0, which is why the first BIOS went at the end of the block.

  • Finally, the last segment, ffff:xxxx, wraps around to zero on an 8086, but it can access a little bit of higher memory on the 286 and later. That’s the high memory area that DOS users learned so much about back in the day. It won’t really concern us much, but it’s good to know it exists, especially if you’re reading older literature that refers to it.
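As a quick illustration of the color text buffer described above, here is a hedged sketch that writes one character cell directly and then touches the same cell through the other segment:offset spelling. It assumes NASM syntax, real mode, and a standard 80-column color text mode; the character and color are arbitrary choices.

```nasm
; Sketch: put a white-on-blue 'A' in the top-left corner of the text screen.
    mov ax, 0B800h
    mov es, ax              ; ES = segment of color text memory
    xor di, di              ; offset 0 = row 0, column 0
    mov ax, 1F41h           ; high byte 1Fh = attribute (bright white on blue), low byte 41h = 'A'
    mov [es:di], ax         ; little-endian store: character at offset 0, attribute at offset 1

    ; The same cell through the other address, because
    ; 0B000h * 16 + 8000h = 0B800h * 16 + 0 = 0B8000h
    mov ax, 0B000h
    mov es, ax
    mov byte [es:8000h], 'B' ; replaces the 'A' with a 'B'; the attribute byte is untouched
```

For any other cell in an 80-column mode, the offset works out to (row * 80 + column) * 2, since each cell takes one word.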

The BIOS

BIOS (Basic Input/Output System) is the name given to the PC’s internal code, as well as the programmer-accessible interfaces that code provides. The BIOS is mainly responsible for taking a bare x86 processor and turning it into a usable system; in that, it’s like the system ROMs of older computers. Different BIOSes over the years have offered numerous features (some of them can now run entire operating systems, and coreboot can even carry a Linux kernel as its payload), but they all also contain the same basic functions that you had 30 years ago.

Rather than using a vector table, like that of a Commodore 64, the PC BIOS provides access to its API through interrupts. By placing certain values in registers before invoking the interrupt, you can select different functions and pass arguments to them. Return values likewise come back in registers, sometimes the same ones. As an example, BIOS interrupt 10h controls the video system, and one of its available functions is setting the cursor position. To do that, we need to load the number 2 into AH, among other things, as in this snippet:

curs_set:
    mov ah, 2   ; function number
    mov bh, 0   ; "page number": 0 is the "main" page

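    ; row and column would probably be previously defined
    ; (assumed here to be constants, e.g. set up with equ)
    mov dh, row      ; DH = cursor row
    mov dl, column   ; DL = cursor column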

    int 10h
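
Return values work the same way in reverse. As a rough companion sketch (the label curs_get is made up for this example, and NASM syntax is assumed), function 3 of the same interrupt reads the cursor position back and hands the answer over in registers:

```nasm
curs_get:
    mov ah, 3   ; function number: read cursor position
    mov bh, 0   ; page 0 again

    int 10h

    ; on return: DH = row, DL = column
    ; (CH and CL hold the start and end scan lines of the cursor shape)
```

Note that DH and DL are inputs for function 2 and outputs for function 3, which is the “sometimes the same ones” part.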

We’ll be seeing the BIOS later on, but we definitely won’t meet more than a fraction of its interface.

Peripherals and I/O

The PC was designed with peripherals in mind. Floppy disks, hard disks, printers, and many other things came to be attached to these computers. Usually, they came with drivers, and the BIOS could talk to some of them, but programmers occasionally needed to access them directly. Nor were these methods mutually exclusive. After all, it was entirely common to bypass the BIOS and write directly to video memory, because that was far faster.

Under our assumptions, we’ve got a VGA graphics adapter. We’ve got our keyboard and mouse, which the emulator…well, emulates. The FreeDOS image we’ll be using includes one floppy disk, but that’s it. Other than that, we have the PC speaker and a few other odds and ends. Not much, but plenty to explore. But that’s for later.

The operating system

Today, we use Windows or Linux or OS X or whatever, and we only really think about the operating system when it won’t work. But almost every home PC of the late 80s ran DOS first, and that’s what people learned. FreeDOS is like an extension of that, an upgraded version that can run on modern machinery. It uses the same interfaces for “legacy” code, however, and that’s what we want. And v86 includes a preset FreeDOS disk image for us, so that’s a bonus.

DOS, as derided as it is today, actually packed a lot of functionality into its tiny innards. Sure, it didn’t have true multitasking, and you needed hacks to use more than 1 MB of memory. Some of its internal functions were slow enough that it was better to write to the hardware directly. And it even left some things up to the BIOS. But it was far better than nothing.

I could make this a “write a 16-bit OS” series, but I won’t. Because I don’t have to. That’s one of the few good things about DOS: it’s so simple that you can almost ignore it, but you can still use it when it’s what you need. So that’s what I’ll be doing. I mean, it worked for fifteen years, right?

And so on

There’s lots more I could say about the PC platform, whether the real thing or our nostalgic remembrance. But this post is already getting too long. Everything else can come up when it comes up, because now we have to move on to the whole reason for this series: the x86 assembly language. Next time, you’ll see what it looked like (not what it looks like now, though, because it’s totally different). Maybe we can even figure out why so many people hated it with such passion. Oh, and we’ll have some actual code, too.
