80x86 Memory Modes
This guide is broken into two sections: historical and practical. The historical section (1 and 2) explains how we got to where we are, and provides a technical explanation of the different memory models. The various memory models that x86 CPUs support are the reason why some operating systems can run DOS games and others can't. If the historical section gets too technical, or you don't need a deep understanding of the memory models, you can skip to the practical section (3), which lists the various PC operating systems and how to run DOS games in them.
- A Brief History of Computers Before DOS
- The Memory Models
- Real mode
- Protected mode
- Long mode
A Brief History of Computers Before DOS
Computers have come a long way since the first commercial microprocessor was released on 15 November 1971. The 4004 was a 4-bit processor that ran at 740 KHz (0.74 MHz) and used 12-bit memory addresses to access 4 KB of RAM. It was designed to power calculators, and is said to have had the processing power of the ENIAC supercomputer, built in 1946. The ENIAC used 18 000 vacuum tubes, weighed 27 tonnes, and occupied 680 square feet (63 m²), compared to the 2250 transistors of the 4004, which measured 1/8th of an inch by 1/6th of an inch. In 1974, the 4040 processor (500 to 740 KHz) increased memory support to 8 KB and added 60 new instructions to the instruction set.
Designed alongside the 4004 processor was its 8-bit big brother, the 8008, which was released in April 1972. It operated at 500 or 800 KHz, and used 14-bit addresses to access 16 KB of RAM. Originally commissioned for use in programmable terminals, Intel "accidentally" designed the 8008 to be fairly general purpose so that they could sell it to other customers.
Finally the 8080 came along. It was ridiculously powerful. Operating at 2 MHz, it used 16-bit addresses, which allowed it to access 64 KB of memory! It was 10 times more powerful than an 8008, and its incredible success haunted DOS programmers. I'll explain that a bit later.
The Intel 8085 processor was designed to be a single-chip version of the 8080, which had required several supporting chips. The simplified design of the 8085 led to cheaper systems based on it. It was a bit slower, even though it operated at 3.07 to 5 MHz, achieving 0.37 MIPS, compared to the 0.64 MIPS of the 8080 processor. It took up less space, which allowed a greater variety of implementations, and it could be used in less expensive systems.
It's important to remember that Intel had a lot of competition in these days. Motorola was a heavyweight, and in 1975 a company called MOS released a $25 CPU called the 6502 processor. It wasn't compatible with any existing architecture, but at that price, it didn't need to be. It was used in hobbyist computers, the Apple I (1976) and Apple II (1977), Commodore PET (1977) and a little video game system called the Atari VCS (1977). It was later used in the Famicom/Nintendo Entertainment System (1983/1985), a modified version was used in the Commodore 64 (1982), and a next-generation version was used in the Super Famicom/Super NES (1990/1991). Remember Super Mario RPG? Those graphics were made possible by an extra chip in the cartridge called the Nintendo SA-1 (1995), which is a 10 MHz next-generation version of the 6502 processor. That's getting a bit ahead of ourselves, though.
The base of 8080/8085 users and programmers continued to grow, but the systems were too expensive to compete with home computers based on the 6502 processor. Then a new company called Zilog produced the Z80 processor in July 1976. It was inexpensive, and it was powerful. It had all six of the 8080's 16-bit registers (AF, BC, DE, HL, SP, PC), and added four new ones (IX, IY, I, R). It also had a "shadow" copy of each of the AF, BC, DE, and HL registers, which programmers could swap data into and out of in any order they wanted, instead of pushing it onto the stack, which is last in, first out (LIFO). Lost yet? Don't worry about it. All you need to know is that the Z80 was incredibly popular, and was used in the Tandy TRS-80 in 1977. It was later used in the Osborne 1 portable computer (1981), Kaypro computers (1982), ColecoVision (1982) and Commodore 128 (1985). It also led to an explosion in the popularity of the CP/M operating system, which had originally been designed for the 8080/8085 processors. The power of the Z80 processor and the ease of use of CP/M put 8080-compatible software on home computers around the world.
The decision that affects your ability to play DOS games on your computer occurred back in the late 1970s. While the 4040 can be seen as an enhanced 4004, and the 8085 is a single-chip implementation of the 8080, with the 8086 Intel added something to a new processor that they had never added before: backwards compatibility. There was no need for Intel's previous processors to be backwards compatible, because they were designed for proprietary computers, some of which didn't really use software, such as calculators that stored their logic on ROM chips, not floppy disks. The 8080/8085, on the other hand, was a popular programming platform, made even more popular by the Z80, and Intel didn't want to lose that user base. If it had been easier to port 8080 programs to new generations of Zilog processors than to Intel processors, it would have been a huge boost for Zilog, making them the exclusive supplier of 8080-compatible CPUs. Intel needed to produce more powerful processors to stay competitive, but a new instruction set would require completely new software, and they would run the risk that the new architecture would not be adopted.
So, Intel designed their first 16-bit processor, the 8086, to make it easy to port programs written for the 8-bit 8080. It operated at 5, 8 or 10 MHz and used 16-bit registers, some of which could be addressed as two 8-bit halves in order to make it easier to port 8080 software that used 8-bit registers. It also needed to be able to use more memory, because 64 KB of RAM wasn't enough any more. The 8086 used 20-bit addresses to access 1 MB of memory. How could they design a CPU that would run software that expected 16-bit addresses, while at the same time allowing new programs to use 20-bit addresses? And that's where real mode comes in.
The Memory Models
Real mode used segmented addresses. Originally, there were two kinds of programs: .COM and .EXE.
COM (command) files run inside 64 KB of memory, which allowed programs written for 8080-compatible CPUs to be easily ported to the 8086. COM files don't care where their 64 KB page of RAM is located; the operating system can reserve 64 KB anywhere it wants. Programmers can use a single 16-bit address to refer to locations within the memory page, and the operating system will keep track of where that 64 KB is located.
Programs that need more than 64 KB of RAM can be compiled as EXE (executable) files. Executable files can access more than 64 KB of memory by using segmented addresses. An address was broken into two 16-bit parts: one to describe a location within a 64 KB page of memory, and the other to describe how far the beginning of that page was from the beginning of RAM. A brief explanation of memory addresses will make things a bit more clear.
Computers store data in binary because every number can be broken into digits that are either 0 or 1, as opposed to the decimal system, where numbers are broken into the ten digits 0 to 9. Binary allows computers to store data using "states" to represent 0 or 1, and those states can be read very quickly and changed very easily. For instance, the first programmable computers used switches which the programmer could flick on or off to represent 0 or 1. Early electronic computers used vacuum tubes to hold or not hold a charge to represent 0 or 1, and the discovery of semiconductors led to the integrated circuit, which allows transistors to either conduct or not conduct electricity to represent 0 or 1. Floppy drives store data magnetically, using north and south polarity as 0 and 1. CDs store data reflectively, using "lands" and "pits" to either reflect a laser beam or diffuse it to represent 0 or 1. RAM stores data electrically, using cells that are either charged or not charged to represent 0 or 1. Modern hard drives use spintronics, reading the direction of electron spin to represent 0 or 1. Fibre optic cable transmits data photonically, using the presence or absence of light to represent 0 or 1. However it's accomplished, the result of any query is as simple as any answer can ever be: on or off, true or false, 0 or 1.
The switches on switchboard computers were replaced by transistors in integrated circuits but, however it's accomplished, the "switches" are arranged into logic gates, which let groups of 0s and 1s pass through a series of transistors, with Boolean logic determining what number comes out. Logic gates can be arranged so that they perform addition and subtraction, perform Boolean operations, compare two numbers, shift the digits of a number left or right, or just about anything else. These arrangements of logic gates represent instructions which, collectively, become the instruction set of a CPU architecture.
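As a toy illustration of how gates add numbers, a one-bit "half adder" needs only XOR and AND, and chaining adders handles wider numbers. This is a sketch in Python (the function names are just for illustration; real CPUs wire this in silicon):

```python
# A half adder: XOR gives the sum bit, AND gives the carry bit.
def half_adder(a: int, b: int) -> tuple[int, int]:
    return a ^ b, a & b  # (sum, carry)

# Two half adders plus an OR for the carries make a full adder.
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2

def add_bits(x: int, y: int, width: int = 16) -> int:
    """Add two numbers one bit at a time, like a ripple-carry adder."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result

print(add_bits(0xA000, 0x2000) == 0xC000)  # True
```

Real adders use cleverer layouts to avoid waiting for the carry to ripple through every bit, but the principle is the same.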
Binary is great for computers and their ability to quickly use true or false statements to calculate 0s and 1s. It's not so great for humans, because the numbers become quite long. The largest address that a 16-bit address allows is 1111111111111111 in binary. It's awkward to translate binary into decimal because 2 doesn't go into 10 very well. In decimal, the largest 16-bit number is 65 535. That's why programmers use hexadecimal, which is base 16. Digits go from 0 to 9, and then from A to F. A single hexadecimal digit can represent four binary digits, known as a nibble. This makes binary numbers four times shorter to write, and turns 1111 1111 1111 1111 into FFFF.
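The relationship between the three bases can be demonstrated in a few lines of Python (the snippet exists only to show the arithmetic):

```python
# The same 16-bit number written in binary, decimal, and hexadecimal.
n = 0b1111111111111111
print(n)              # 65535 -- the largest 16-bit value
print(hex(n))         # 0xffff
print(f"{0b1111:x}")  # f -- one hex digit covers one 4-bit nibble
```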
So, let's get back to addresses. 20 binary digits (bits) can be described in 5 hexadecimal digits, such as FFFFF. The 8080 used 16-bit addresses going from 0000 to FFFF, and programs could be ported to COM files and continue to use addresses in that range. EXE files could have just used flat 20-bit addresses that went one nibble higher than the addresses that COM files used, but that would have required a programming workaround that would have slowed programs down. 8086 CPUs retrieve data 16 bits at a time and store the results in 16-bit registers, so Intel decided to use two 16-bit numbers to describe where an address is within a 20-bit address space. The segment address tells me where a 64 KB page begins: shift it one nibble to the left, so a segment of C000 means the page starts at the physical address C0000. The offset address then tells me how many bytes into that page the location is. It's like saying, "I can't tell you how far away Sydney is from London because it's more than 9999 miles away, and I can only use 4 digits. I can tell you that London is 5945 miles from Tokyo, and Tokyo is 4869 miles from Sydney." Not exactly, because it wouldn't be a straight line, but you get the idea.

So, if I have an address of C000:A000, I'm saying that the location is A000 bytes past the start of the segment beginning at C0000, which works out to the physical address CA000. Since we're using a total of 32 bits to describe a 20-bit address, there is significant overlap. I could describe the same memory location as CA00:0000, or C900:1000, or C500:5000. In fact, because a new segment starts every 16 bytes, there are 4096 (2^12) ways that I could describe most memory locations. The computer can very quickly combine the two numbers to produce the correct address, but having to create addresses by adding two numbers together, and having multiple ways of referring to the same address, made things confusing for programmers. The first headache of the backwards compatible era was born.
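Real mode address arithmetic is easy to sketch in Python (the `physical` helper is a hypothetical name, but the shift-and-add is what the 8086 actually does):

```python
def physical(segment: int, offset: int) -> int:
    """Real-mode address: segment shifted left one nibble, plus offset."""
    return (segment << 4) + offset

# Many different segment:offset pairs name the same 20-bit address.
print(f"{physical(0xC000, 0xA000):05X}")  # CA000
print(f"{physical(0xCA00, 0x0000):05X}")  # CA000
print(f"{physical(0xC900, 0x1000):05X}")  # CA000

# Count every pair that reaches one address: the offset must fit in
# 16 bits, and segments start every 16 bytes.
addr = 0xCA000
pairs = sum(1 for seg in range(0x10000) if 0 <= addr - (seg << 4) <= 0xFFFF)
print(pairs)  # 4096
```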
Anyway, this was called real mode. Real mode assumes that you have no more than 1 MB of RAM, and you have to use a segmented address to describe memory locations. Operating systems that use real mode also need to be able to handle COM files, which only use a 16-bit offset, not a segment address, so the operating system needs to pick a segment address and keep track of it in order to run COM files.
CP/M was ported to the 8086, but some of the command names weren't obvious. The 8086 was also too expensive to use in computers that were designed to compete with the enormous home computer market built around the 6502 and Z80. Intel made a cheaper version of the 8086 called the 8088, which operated at 4.77 MHz and reduced the data bus from 16 bits to 8 bits. IBM selected it for its Personal Computer, the first "PC", and a little company called Microsoft supplied an operating system known as MS-DOS.
1 MB was a lot of memory at the time, but DOS only reserved 640 KB for programs, and the remaining 384 KB was reserved for the BIOS and add-in hardware such as graphics cards. The first 640 KB is known as conventional memory, and the last 384 KB is known as upper memory. It's famously claimed that Bill Gates said, "640K of RAM ought to be enough for anybody," although he has always denied saying it. Later versions of DOS could try to load themselves into upper memory to leave more conventional memory free for programs, but eventually 640 KB of RAM wasn't enough any more.
In 1982, Intel released the 80286, which was still backwards compatible with the 8086. The success of the IBM PC ensured that every generation of Intel CPU would have to retain backwards compatibility and, to this day, a Core i7 or a Phenom II processor uses the same instructions and registers as the 8086, although they also support a great many new instructions.
Operating at 6, 8, and 12.5 MHz, the 286 also significantly increased performance per clock and added the ability to read 24-bit memory addresses, allowing 16 MB of RAM. This, again, created a problem with backwards compatibility with software that used 20-bit segmented addresses. Real mode programs needed a new way to access data beyond the 1 MB barrier, and extended memory was born.
Programs wouldn't normally need more than 640 KB of code, but they might need extra memory to store data, such as graphics, music, and level maps. The Extended Memory Specification (XMS) allowed real mode software to access data -- but not executable code -- from the extended memory space by using a special instruction called an interrupt, which temporarily lets some other program (such as the operating system) run some code before returning control to the program.
Other programs, like mouse, display, sound card, printer, and CD-ROM drivers, could be run before starting a real mode program and could remain active. These were known as Terminate and Stay Resident programs (TSRs). Real mode programs could access the computer's hardware directly, although they didn't always need to. Programmers didn't need to know how to control a floppy drive in order to write programs that could read and write files, because the operating system already did that. Instead, the programmer would place an interrupt into the program that called the operating system's disk access routines. A program could also access the graphics card directly to take advantage of new graphics standards like MDA, CGA, Hercules, EGA, MCGA, VGA, 8514, and a number of competing Super VGA (SVGA) standards. Sound cards like the AdLib, Gravis UltraSound, and Sound Blaster brought music, voice, and MIDI instruments to DOS software.
Thanks to generations of backwards compatibility, as well as solutions that allowed greater and greater amounts of memory to be accessed on CPUs that supported larger addresses, real mode software evolved from programs that ran inside 64 KB of RAM, displayed monochrome text, produced a single note at a time using the PC speaker, and used a keyboard for input, to programs that could access 16 MB of RAM, displayed images at 1024×768 with 256 colors or higher, played dozens of notes and voices simultaneously, and could use a mouse or joystick for control. DOS games were sold on everything from 360 KB 5.25" floppy disks to 650 MB CD-ROMs. The base of real mode software was so great that every version of DOS supports real mode exclusively, even when alternatives arrived.
Real mode programs were designed to run one at a time, with no multi-tasking. XMS only allowed data (text, sounds, graphics, etc.) to be stored beyond the first megabyte of RAM; it could not execute code stored in that space. The 286 tried to resolve both problems by introducing a 24-bit "protected mode".
To ensure compatibility, the 286 CPU would enter real mode when it was powered up, and the operating system could then switch the CPU to protected mode to take advantage of the 286's extra capabilities. Since it had been the only mode prior to the 286, real mode only got its name when the 286 came along and there was another mode to distinguish it from.
The idea behind the protected memory model was that the operating system could reserve memory for a specific program so that no other program could access or overwrite that memory space, thus protecting programs from each other. This would allow multi-tasking -- that is, running multiple programs at the same time. Protected mode also enabled 24-bit addressing, so that all 16 MB of RAM could be accessed by software for whatever purpose it wanted. The problem was that the 286 was still using 16-bit segments, so only 64 KB of RAM could be accessed at a time. The 286 also had to be reset to return to real mode. This was never a popular solution.
In 1985, Intel introduced the first 32-bit x86 CPU, the 80386. The 386 extended the general purpose registers from 16 to 32 bits, but continued to allow the lower 16 bits to be addressed by software written for the 8086 through 80286, and the 16-bit registers could still be broken into 8-bit registers for software ported from the 8080. The 386 also added 32-bit memory addresses, allowing it to access 4 GB of memory. Segment sizes were also increased to 32 bits, allowing all 4 GB to be addressed without the need to switch between multiple segments. This was presumed to be enough to future-proof the architecture for a very long time, and it was: until the 2010s, very few PCs had more than 4 GB of RAM.
32-bit addresses could only be used in protected mode, and protected mode finally allowed significantly greater memory access than real mode. Computers could not only run multiple programs simultaneously, but they finally had enough memory to actually do it!
The 24-bit, segmented protected mode of the 286 came to be known as standard mode, and the full 32-bit memory model came to be known as protected mode. Both were protected but, in practice, only the 32-bit protected mode of 386 and newer CPUs is referred to as protected mode.
On 22 April 2003, the first 64-bit x86 CPU, the Opteron, was released, followed by the desktop variant, the Athlon 64, on 23 September 2003. Intel later released 64-bit Xeon and Pentium 4 processors in 2004. Known as x86-64 processors, or just x64, these processors extended the general purpose registers to 64 bits, but they are still sub-divided into 32, 16, and 8-bit registers for backwards compatibility. They can also still operate in real or protected mode.
The first 64-bit x86 CPUs supported 40-bit physical memory addresses, which allows 1 terabyte of RAM, and the architecture supports expanding physical addresses to as much as 52 bits, allowing up to 4 petabytes of RAM. To access the 64-bit registers and use the larger memory addresses, the operating system needs to run in long mode. Long mode is a protected memory model, but differs from the memory model known as protected mode by supporting 64-bit registers and larger memory addresses.
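The growth of the address space across these generations follows directly from 2^bits; a quick Python sketch tabulates it (the 52-bit entry is the AMD64 architectural maximum for physical addresses, not something early chips implemented):

```python
# Bytes addressable by each address width discussed above.
widths = [
    (16, "8080 / COM files (64 KB)"),
    (20, "8086 real mode (1 MB)"),
    (24, "286 protected mode (16 MB)"),
    (32, "386 protected mode (4 GB)"),
    (40, "early long mode CPUs (1 TB)"),
    (52, "AMD64 architectural limit (4 PB)"),
]
for bits, era in widths:
    print(f"{bits}-bit addresses: {2**bits:,} bytes -> {era}")
```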
The downside is that it's not possible to enter virtual 8086 mode -- a feature of protected mode, introduced with the 386, that lets real mode programs run under a protected mode operating system -- from long mode. Just as before, long mode CPUs initially boot up as 16-bit CPUs, until the operating system tells them to enter long mode. Once a 64-bit operating system is running, the CPU can't enter virtual 8086 mode to run 16-bit real mode software. If you want to run 16-bit software from a 64-bit OS, you'll need to run special software that allows it to be run in long mode. Some examples are DOSBox, DOSEmu and Wine. Alternatively, you can run a 16-bit operating system in an emulator or virtualization suite, such as VirtualBox, VMware or Virtual PC.