80x86 Memory Modes
This guide is broken into two sections: historical and practical. The historical section (1 and 2) explains how we got to where we are, and provides a technical explanation of the different memory models. The various memory models that x86 CPUs support are the reason why some operating systems can run DOS games and others can't. If the historical section gets too technical, or you don't need a deep understanding of the memory models, you can skip to the practical section (3), which lists the various PC operating systems and how to run DOS games in them.
- A Brief History of Computers Before DOS
- The Memory Models
- Real mode
- Protected mode
- Long mode
A Brief History of Computers Before DOS
Computers have come a long way since the first commercial microprocessor was released on 15 November 1971. The 4004 was a 4-bit processor that ran at 740 KHz (0.74 MHz) and used 12-bit memory addresses to access 4 KB of RAM. It was designed to power calculators, and is said to have had the processing power of the ENIAC supercomputer, built in 1946. The ENIAC used 18 000 vacuum tubes, weighed 27 tonnes, and occupied 680 square feet (63 m²), compared to the 2250 transistors of the 4004, which measured 1/8th of an inch by 1/6th of an inch. In 1974, the 4040 processor (500 to 740 KHz) increased memory support to 8 KB and added 60 new instructions to the instruction set.
Designed alongside the 4004 processor was its 8-bit big brother, the 8008, which was released in April 1972. It operated at 500 or 800 KHz, and used 14-bit addresses to access 16 KB of RAM. Originally commissioned for use in programmable terminals, Intel "accidentally" designed the 8008 to be fairly general purpose so that they could sell it to other customers.
Finally the 8080 came along. It was ridiculously powerful. Operating at 2 MHz, it used 16-bit addresses, which allowed it to access 64 KB of memory! It was 10 times more powerful than an 8008, and its incredible success haunted DOS programmers. I'll explain that a bit later.
The Intel 8085 processor was designed to be a single-chip version of the 8080, which had required several supporting chips. The simplified design of the 8085 led to cheaper systems based on it. It was a bit slower, even though it operated at 3.07 to 5 MHz, achieving 0.37 MIPS, compared to the 0.64 MIPS of the 8080 processor. It took up less space, which allowed a greater variety of implementations, and it could be used in less expensive systems.
It's important to remember that Intel had a lot of competition in these days. Motorola was a heavyweight, and in 1975 a company called MOS released a $25 CPU called the 6502 processor. It wasn't compatible with any existing architecture, but at that price, it didn't need to be. It was used in hobbyist computers, the Apple I (1976) and Apple II (1977), Commodore PET (1977) and a little video game system called the Atari VCS (1977). It was later used in the Famicom/Nintendo Entertainment System (1983/1985), a modified version was used in the Commodore 64 (1982), and a next-generation version was used in the Super Famicom/Super NES (1990/1991). Remember Super Mario RPG? Those graphics were made possible by an extra chip in the cartridge called the Nintendo SA-1 (1995), which is a 10 MHz next-generation version of the 6502 processor. That's getting a bit ahead of ourselves, though.
The base of 8080/8085 users and programmers continued to grow, but the systems were too expensive to compete with home computers based on the 6502 processor. Then a new company called Zilog produced the Z80 processor in July 1976. It was inexpensive, and it was powerful. It had all six of the 8080's 16-bit registers (AF, BC, DE, HL, SP, PC), and added four new ones (IX, IY, I, R). It also had a "shadow" copy of each of the AF, BC, DE, and HL registers, which programmers could swap data into and out of in any order they wanted, instead of pushing it onto the stack, which is last in, first out (LIFO). Lost yet? Don't worry about it. All you need to know is that the Z80 was incredibly popular, and was used in the Tandy TRS-80 in 1977. It was later used in the Osborne 1 portable computer (1981), Kaypro computers (1982), ColecoVision (1982) and Commodore 128 (1985). It also led to an explosion in the popularity of the CP/M operating system, which had originally been designed for the 8080/8085 processors. The power of the Z80 processor and the ease of use of CP/M put 8080-compatible software on home computers around the world.
The decision that affects your ability to play DOS games on your computer occurred back in the late 1970s. While the 4040 can be seen as an enhanced 4004, and the 8085 is a single-chip implementation of the 8080, with the 8086 Intel added something to a new processor that they had never added before: backwards compatibility. There was no need for Intel's previous processors to be backwards compatible, because they were designed for proprietary computers, some of which didn't really use software, such as calculators that stored their logic on ROM chips, not floppy disks. The 8080/8085, on the other hand, was a popular programming platform, made even more popular by the Z80, and Intel didn't want to lose that user base. If it had been easier to port 8080 programs to new generations of Zilog processors than to Intel processors, it would have been a huge boost for Zilog, making them the exclusive supplier of 8080-compatible CPUs. Intel needed to produce more powerful processors to stay competitive, but a new instruction set would require completely new software, and they would run the risk that the new architecture would not be adopted.
So, Intel designed their first 16-bit processor, the 8086, to make it easy to port programs written for the 8-bit 8080. It operated at 5, 8 or 10 MHz and used 16-bit registers, some of which could be addressed as two 8-bit halves in order to make it easier to port 8080 software that used 8-bit registers. It also needed to be able to use more memory, because 64 KB of RAM wasn't enough any more. The 8086 used 20-bit addresses to access 1 MB of memory. How could they design a CPU that would run software that expected 16-bit addresses, while at the same time allowing new programs to use 20-bit addresses? And that's where real mode comes in.
The Memory Models
Real mode used segmented addresses. Originally, there were two kinds of programs: .COM and .EXE.
COM (command) files run inside 64 KB of memory, which allowed programs written for 8080-compatible CPUs to be easily ported to the 8086. COM files don't care where their 64 KB page of RAM is located; the operating system can reserve 64 KB anywhere it wants. Programmers can use a single 16-bit address to refer to locations within the memory page, and the operating system will keep track of where that 64 KB is located.
Programs that need more than 64 KB of RAM can be compiled as EXE (executable) files. Executable files can access more than 64 KB of memory by using segmented addresses. An address was broken into two 16-bit parts: one to describe a location within a 64 KB page of memory, and the other to describe how far the beginning of that page was from the beginning of RAM. A brief explanation of memory addresses will make things a bit more clear.
Computers store data in binary because every number can be broken into digits that are either 0 or 1, as opposed to the decimal system, where numbers are broken into the ten digits 0 to 9. Binary allows computers to store data using "states" to represent 0 or 1, and those states can be read very quickly and changed very easily. For instance, the first programmable computers used switches which the programmer could flick on or off to represent 0 or 1. Early electronic computers used vacuum tubes to hold or not hold a charge to represent 0 or 1, and the discovery of semiconductors led to the integrated circuit, which allows transistors to either conduct or not conduct electricity to represent 0 or 1. Floppy drives store data magnetically, using north and south polarity as 0 and 1. CDs store data reflectively, using "lands" and "pits" to either reflect a laser beam or diffuse it to represent 0 or 1. RAM stores data electrically, using cells that are either charged or not charged to represent 0 or 1. Modern hard drives use spintronics, reading the direction of electron spin to represent 0 or 1. Fibre optic cable transmits data photonically, using the presence or absence of light to represent 0 or 1. However it's accomplished, the result of any query is as simple as any answer can ever be: on or off, true or false, 0 or 1.
The switches on switchboard computers were replaced by transistors in integrated circuits but, however it's accomplished, the "switches" are arranged into logic gates, which let groups of 0s and 1s pass through a series of transistors, with Boolean logic determining what number comes out. Logic gates can be arranged so that they perform addition and subtraction, perform Boolean operations, compare two numbers, shift the digits of a number left or right, or just about anything else. These arrangements of logic gates represent instructions which, collectively, become the instruction set of a CPU architecture.
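As a toy illustration of how gates add numbers, a one-bit "half adder" needs only XOR and AND, and chaining adders handles wider numbers. This is a sketch in Python (the function names are just for illustration; real CPUs wire this in silicon):

```python
# A half adder: XOR gives the sum bit, AND gives the carry bit.
def half_adder(a: int, b: int) -> tuple[int, int]:
    return a ^ b, a & b  # (sum, carry)

# Two half adders plus an OR for the carries make a full adder.
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2

def add_bits(x: int, y: int, width: int = 16) -> int:
    """Add two numbers one bit at a time, like a ripple-carry adder."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result

print(add_bits(0xA000, 0x2000) == 0xC000)  # True
```

Real adders use cleverer layouts to avoid waiting for the carry to ripple through every bit, but the principle is the same.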
Binary is great for computers and their ability to quickly use true or false statements to calculate 0s and 1s. It's not so great for humans, because the numbers become quite long. The largest address that a 16-bit address allows is 1111111111111111 in binary. It's awkward to translate binary into decimal because 2 doesn't go into 10 very well. In decimal, the largest 16-bit number is 65 535. That's why programmers use hexadecimal, which is base 16. Digits go from 0 to 9, and then from A to F. A single hexadecimal digit can represent four binary digits, known as a nibble. This makes binary numbers four times shorter to write, and turns 1111 1111 1111 1111 into FFFF.
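The relationship between the three bases can be demonstrated in a few lines of Python (the snippet exists only to show the arithmetic):

```python
# The same 16-bit number written in binary, decimal, and hexadecimal.
n = 0b1111111111111111
print(n)              # 65535 -- the largest 16-bit value
print(hex(n))         # 0xffff
print(f"{0b1111:x}")  # f -- one hex digit covers one 4-bit nibble
```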
So, let's get back to addresses. 20 binary digits (bits) can be described in 5 hexadecimal digits, such as FFFFF. The 8080 used 16-bit addresses going from 0000 to FFFF, and programs could be ported to COM files and continue to use addresses in that range. EXE files could have just used flat 20-bit addresses that went one nibble higher than the addresses that COM files used, but that would have required a programming workaround that would have slowed programs down. 8086 CPUs retrieve data 16 bits at a time and store the results in 16-bit registers, so Intel decided to use two 16-bit numbers to describe where an address is within a 20-bit address space. The segment address tells me where a 64 KB page begins: shift it one nibble to the left, so a segment of C000 means the page starts at the physical address C0000. The offset address then tells me how many bytes into that page the location is. It's like saying, "I can't tell you how far away Sydney is from London because it's more than 9999 miles away, and I can only use 4 digits. I can tell you that London is 5945 miles from Tokyo, and Tokyo is 4869 miles from Sydney." Not exactly, because it wouldn't be a straight line, but you get the idea.

So, if I have an address of C000:A000, I'm saying that the location is A000 bytes past the start of the segment beginning at C0000, which works out to the physical address CA000. Since we're using a total of 32 bits to describe a 20-bit address, there is significant overlap. I could describe the same memory location as CA00:0000, or C900:1000, or C500:5000. In fact, because a new segment starts every 16 bytes, there are 4096 (2^12) ways that I could describe most memory locations. The computer can very quickly combine the two numbers to produce the correct address, but having to create addresses by adding two numbers together, and having multiple ways of referring to the same address, made things confusing for programmers. The first headache of the backwards compatible era was born.
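Real mode address arithmetic is easy to sketch in Python (the `physical` helper is a hypothetical name, but the shift-and-add is what the 8086 actually does):

```python
def physical(segment: int, offset: int) -> int:
    """Real-mode address: segment shifted left one nibble, plus offset."""
    return (segment << 4) + offset

# Many different segment:offset pairs name the same 20-bit address.
print(f"{physical(0xC000, 0xA000):05X}")  # CA000
print(f"{physical(0xCA00, 0x0000):05X}")  # CA000
print(f"{physical(0xC900, 0x1000):05X}")  # CA000

# Count every pair that reaches one address: the offset must fit in
# 16 bits, and segments start every 16 bytes.
addr = 0xCA000
pairs = sum(1 for seg in range(0x10000) if 0 <= addr - (seg << 4) <= 0xFFFF)
print(pairs)  # 4096
```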
Anyway, this was called real mode. Real mode assumes that you have no more than 1 MB of RAM, and you have to use a segmented address to describe memory locations. Operating systems that use real mode also need to be able to handle COM files, which only use a 16-bit offset, not a segment address, so the operating system needs to pick a segment address and keep track of it in order to run COM files.
CP/M was ported to the 8086, but some of the command names weren't obvious. The 8086 was also too expensive to use in computers that were designed to compete with the enormous home computer market built around the 6502 and Z80. Intel made a cheaper version of the 8086 called the 8088, which operated at 4.77 MHz and reduced the data bus from 16 bits to 8 bits. IBM selected it for its Personal Computer, the first "PC", and a little company called Microsoft supplied an operating system known as MS-DOS.
1 MB was a lot of memory at the time, but DOS only reserved 640 KB for programs, and the remaining 384 KB was reserved for the BIOS and add-in hardware such as graphics cards. The first 640 KB is known as conventional memory, and the last 384 KB is known as upper memory. It's famously claimed that Bill Gates said, "640K of RAM ought to be enough for anybody," although he has always denied saying it. Later versions of DOS could try to load themselves into upper memory to leave more conventional memory free for programs, but eventually 640 KB of RAM wasn't enough any more.
In 1982, Intel released the 80286, which was still backwards compatible with the 8086. The success of the IBM PC ensured that every generation of Intel CPU would have to retain backwards compatibility and, to this day, a Core i7 or a Phenom II processor uses the same instructions and registers as the 8086, although they also support a great many new instructions.
Operating at 6, 8, and 12.5 MHz, the 286 also significantly increased performance per clock and added the ability to read 24-bit memory addresses, allowing 16 MB of RAM. This, again, created a problem with backwards compatibility with software that used 20-bit segmented addresses. Real mode programs needed a new way to access data beyond the 1 MB barrier, and extended memory was born.
Programs wouldn't normally need more than 640 KB of code, but they might need extra memory to store data, such as graphics, music, and level maps. The Extended Memory Specification (XMS) allowed real mode software to access data -- but not executable code -- from the extended memory space by using a special instruction called an interrupt, which temporarily lets some other program (such as the operating system) run some code before returning control to the program.
Other programs, like mouse, display, sound card, printer, and CD-ROM drivers, could be run before starting a real mode program and could remain active. These were known as Terminate and Stay Resident programs (TSRs). Real mode programs could access the computer's hardware directly, although they didn't always need to. Programmers didn't need to know how to control a floppy drive in order to write programs that could read and write files, because the operating system already did that. Instead, the programmer would place an interrupt into the program that called the operating system's disk access routines. A program could also access the graphics card directly to take advantage of new graphics standards like MDA, CGA, Hercules, EGA, MCGA, VGA, 8514, and a number of competing Super VGA (SVGA) standards. Sound cards like the AdLib, Gravis UltraSound, and Sound Blaster brought music, voice, and MIDI instruments to DOS software.
Thanks to generations of backwards compatibility, as well as solutions that allowed greater and greater amounts of memory to be accessed on CPUs that supported larger addresses, real mode software evolved from programs that ran inside 64 KB of RAM, displayed monochrome text, produced a single note at a time using the PC speaker, and used a keyboard for input, to programs that could access 16 MB of RAM, displayed images at 1024×768 with 256 colors or higher, played dozens of notes and voices simultaneously, and could use a mouse or joystick for control. DOS games were sold on everything from 360 KB 5.25" floppy disks to 650 MB CD-ROMs. The base of real mode software was so great that every version of DOS supports real mode exclusively, even when alternatives arrived.
Real mode programs were designed to run one at a time, with no multi-tasking. XMS only allowed data (text, sounds, graphics, etc.) to be stored beyond the first megabyte of RAM; it could not execute code stored in that space. The 286 tried to resolve both problems by introducing a 24-bit "protected mode".
To ensure compatibility, the 286 CPU would enter real mode when it was powered up, and the operating system could then switch the CPU to protected mode to take advantage of the 286's extra capabilities. Since it had been the only mode prior to the 286, real mode only got its name when the 286 came along and there was another mode to distinguish it from.
The idea behind the protected memory model was that the operating system could reserve memory for a specific program so that no other program could access or overwrite that memory space, thus protecting programs from each other. This would allow multi-tasking -- that is, running multiple programs at the same time. Protected mode also enabled 24-bit addressing, so that all 16 MB of RAM could be accessed by software for whatever purpose it wanted. The problem was that the 286 was still using 16-bit segments, so only 64 KB of RAM could be accessed at a time. The 286 also had to be reset to return to real mode. This was never a popular solution.
In 1985, Intel introduced the first 32-bit x86 CPU, the 80386. The 386 extended the general purpose registers from 16 to 32 bits, but continued to allow the lower 16 bits to be addressed by software written for the 8086 through 80286, and the 16-bit registers could still be broken into 8-bit registers for software ported from the 8080. The 386 also added 32-bit memory addresses, allowing it to access 4 GB of memory. Segment sizes were also increased to 32 bits, allowing all 4 GB to be addressed without the need to switch between multiple segments. This was presumed to be enough to future-proof the architecture for a very long time, and it was: until the 2010s, very few PCs had more than 4 GB of RAM.
32-bit addresses could only be used in protected mode, and protected mode finally allowed significantly greater memory access than real mode. Computers could not only run multiple programs simultaneously, but they finally had enough memory to actually do it!
The 24-bit, segmented protected mode of the 286 came to be known as standard mode, and the full 32-bit memory model came to be known as protected mode. Both were protected but, in practice, only the 32-bit protected mode of 386 and newer CPUs is referred to as protected mode.
On 22 April 2003, the first 64-bit x86 CPU, the Opteron, was released, followed by the desktop variant, the Athlon 64, on 23 September 2003. Intel later released 64-bit Xeon and Pentium 4 processors in 2004. Known as x86-64 processors, or just x64, these processors extended the general purpose registers to 64 bits, but they are still sub-divided into 32, 16, and 8-bit registers for backwards compatibility. They can also still operate in real or protected mode.
The first 64-bit x86 CPUs supported 40-bit physical memory addresses, which allows 1 terabyte of RAM, and the architecture supports expanding physical addresses to as much as 52 bits, allowing up to 4 petabytes of RAM. To access the 64-bit registers and use the larger memory addresses, the operating system needs to run in long mode. Long mode is a protected memory model, but differs from the memory model known as protected mode by supporting 64-bit registers and larger memory addresses.
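The growth of the address space across these generations follows directly from 2^bits; a quick Python sketch tabulates it (the 52-bit entry is the AMD64 architectural maximum for physical addresses, not something early chips implemented):

```python
# Bytes addressable by each address width discussed above.
widths = [
    (16, "8080 / COM files (64 KB)"),
    (20, "8086 real mode (1 MB)"),
    (24, "286 protected mode (16 MB)"),
    (32, "386 protected mode (4 GB)"),
    (40, "early long mode CPUs (1 TB)"),
    (52, "AMD64 architectural limit (4 PB)"),
]
for bits, era in widths:
    print(f"{bits}-bit addresses: {2**bits:,} bytes -> {era}")
```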
The downside is that it's not possible to enter virtual 8086 mode -- a feature of protected mode, introduced with the 386, that lets real mode programs run under a protected mode operating system -- from long mode. Just as before, long mode CPUs initially boot up as 16-bit CPUs, until the operating system tells them to enter long mode. Once a 64-bit operating system is running, the CPU can't enter virtual 8086 mode to run 16-bit real mode software. If you want to run 16-bit software from a 64-bit OS, you'll need to run special software that allows it to be run in long mode. Some examples are DOSBox, DOSEmu and Wine. Alternatively, you can run a 16-bit operating system in an emulator or virtualization suite, such as VirtualBox, VMware or Virtual PC.