x86 and x86-64 memory models

03 Nov 2019 - tsp
Last update 03 Nov 2019

Real mode (16 Bit, segmented)

This mode has been present since the days of the 8086 and is still the mode in which x86 processors start executing after reset. Legacy BIOS code runs in it, and when booting in legacy mode, bootloaders start in 16 bit mode as well. UEFI firmware only uses it very early during initialization.

Addresses are calculated from a pair of segment address and offset. A segment address is a 16 bit value held in one of the segment registers (CS, DS, ES, SS and, since the 80386, FS and GS) that gets multiplied by 16; then the offset is added. So

physicalAddress = (segment << 4) + offset

This of course means that addressing memory locations is ambiguous. The address 0000:0010 refers to the same location as 0001:0000. Note that each segment is exactly 65536 bytes (64 KByte) long. This has also been the size limit for executables in the old CP/M and MS-DOS .com file format.
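The formula and the aliasing example above can be checked with a few lines of Python. This is only a sketch: the function name real_mode_address is made up for illustration, and the wrap-around at 1 MByte that the 8086 performs with its 20 address lines is deliberately ignored.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a linear address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16 bit values")
    # Multiplying by 16 is the same as shifting left by 4 bits.
    return (segment << 4) + offset


# Both pairs from the text refer to the same byte.
print(hex(real_mode_address(0x0000, 0x0010)))  # 0x10
print(hex(real_mode_address(0x0001, 0x0000)))  # 0x10
```

Since the segment is shifted by only 4 bits while the offset has 16 bits, most linear addresses can be written as 4096 different segment:offset pairs.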

The segments in use are:

The code segment CS, which is used in conjunction with the instruction pointer IP to address the next instruction to be fetched and executed. Normally CS is only updated by a far call, far jump, far return, interrupt or interrupt return.
The data segment DS, which is used when accessing memory without a segment override. Some string instructions use ES as the destination segment, and since the 80386 the additional segments FS and GS can be used to access memory via segment overrides.
The stack segment SS, which is used in conjunction with the stack pointer SP to provide a downward-growing stack.

Compilers have used several different memory models in this mode:

tiny model used a single overlapping 64 KByte data and code area (i.e. CS equal to DS, ES and SS)
small model used one data segment and one code segment (non overlapping)
compact model used a single code segment but multiple data segments
medium model used multiple code segments and a single data segment
large model used multiple code and multiple data segments that are switched as required
huge model was the large memory model, but single data structures could be larger than a segment and cross segment boundaries
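The difference between far pointers (large model) and huge pointers can be illustrated with pointer arithmetic. The following is a sketch under assumptions, not the runtime of any particular compiler: far arithmetic is modelled as touching only the 16 bit offset, while huge arithmetic goes through the linear address and returns a normalized pair with an offset below 16. All function names are hypothetical.

```python
from typing import Tuple


def far_add(segment: int, offset: int, delta: int) -> Tuple[int, int]:
    """Far pointer arithmetic: only the offset changes and wraps at 64 KByte."""
    return segment, (offset + delta) & 0xFFFF


def huge_add(segment: int, offset: int, delta: int) -> Tuple[int, int]:
    """Huge pointer arithmetic: compute linearly, then normalize the result."""
    linear = (segment << 4) + offset + delta
    # Normalized form: as much as possible in the segment, offset below 16.
    return (linear >> 4) & 0xFFFF, linear & 0xF


# Stepping 0x20 bytes past 1000:FFF0 crosses the end of the segment.
print([hex(v) for v in far_add(0x1000, 0xFFF0, 0x20)])   # ['0x1000', '0x10']
print([hex(v) for v in huge_add(0x1000, 0xFFF0, 0x20)])  # ['0x2001', '0x0']
```

The far pointer silently wraps back to the start of its segment, while the huge pointer ends up at the intended linear address 0x20010. This extra arithmetic is the price paid for data structures larger than 64 KByte.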

These memory models were important for compiler developers. Current compilers mostly expect code, data and especially stack segments to overlap; on some machines code might be separate, but stack and data are assumed to be flat. This is required to freely mix pointers to local variables, which are normally allocated on the stack, with pointers to data on the heap. Since loading a segment register reloads the hidden descriptor cache and takes longer than simply loading a general purpose register, compilers also minimize far pointer usage whenever possible.

Not many modern compilers are capable of targeting real mode, which is especially a problem when developing firmware or bootloaders. To circumvent the limits of real mode, DOS extenders like DOS/4GW already switched into protected mode. One major compiler that supports 16 bit mode is the Watcom C compiler.

Unreal mode (16 Bit, 4 GB data segment, 16 Bit code segment)

This is an undocumented 16 bit mode, also known as big real mode or flat real mode. It’s entered by first entering 32 bit protected mode, loading a 32 bit data segment spanning 4 GB into DS, ES, FS or GS (or all of them) and switching back to real mode. This keeps the large segment limits in place and allows 16 bit code to access the whole 32 bit memory space using 32 bit address size overrides. The name is a pun on real mode rather than a reference to the Unreal game.

Originally this behaviour was an unintended side effect that developers exploited: the cached segment limits are not reset when leaving protected mode. Since it got somewhat popular, processors have kept supporting it up until today.

To enter unreal mode:

Load a GDT containing a 32 bit code segment and a 32 bit data segment
Disable interrupts
Enable protected mode in CR0
Load a segment register like FS or GS with the selector of the 32 bit data segment
Disable protected mode in CR0
Use the address size override prefix to access memory with 32 bit offsets
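The GDT from the first step is just an array of 8 byte descriptors. As a sketch of how such descriptors are laid out, the following Python packs a flat 4 GB code and data segment. The helper name encode_descriptor is made up for illustration; the access bytes 0x9A and 0x92 and the flags nibble 0xC (4 KByte granularity, 32 bit) are the values conventionally used for flat segments.

```python
import struct


def encode_descriptor(base: int, limit: int, access: int, flags: int) -> int:
    """Pack a segment descriptor into its 64 bit GDT representation."""
    value = limit & 0xFFFF                  # limit bits 0..15
    value |= (base & 0xFFFFFF) << 16        # base bits 0..23
    value |= (access & 0xFF) << 40          # present bit, DPL, segment type
    value |= ((limit >> 16) & 0xF) << 48    # limit bits 16..19
    value |= (flags & 0xF) << 52            # granularity and default size
    value |= ((base >> 24) & 0xFF) << 56    # base bits 24..31
    return value


# Limit 0xFFFFF in units of 4 KByte pages spans the full 4 GB.
flat_code = encode_descriptor(0, 0xFFFFF, access=0x9A, flags=0xC)
flat_data = encode_descriptor(0, 0xFFFFF, access=0x92, flags=0xC)

# Entry 0 of a GDT is the mandatory null descriptor.
gdt = struct.pack("<3Q", 0, flat_code, flat_data)

print(f"{flat_data:#018x}")  # 0x00cf92000000ffff
```

With this table the code segment gets selector 0x08 and the data segment selector 0x10. The actual switch (lgdt, writing CR0, loading the segment register) has to be done in assembly and is not shown here.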

Protected mode (32 Bit)

This is the 32 bit mode people normally run their 32 bit systems in. The segment registers are just selectors into the global or local descriptor table (GDT / LDT). The descriptors in these tables contain base, limit and a protection level as well as the segment type (data or code). Code can only be executed in code segments, etc. Normally modern compilers assume at least overlapping data and stack segments in a single flat address space. This allows them to use simple registers to hold data pointers into the heap and to local variables on the stack, as well as function pointers. There is no need for any segment override. Some systems use the registers FS and GS to reference special per-thread areas: 32 bit MS Windows points FS at the thread information block and 32 bit Linux uses GS for thread local storage.
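A selector is more than a plain table index: the lowest two bits carry the requested privilege level and bit 2 selects between GDT and LDT. The following sketch decodes a selector value; the function name and the dictionary layout are chosen for illustration only.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16 bit segment selector into its three fields."""
    return {
        "index": (selector & 0xFFFF) >> 3,            # descriptor number
        "table": "LDT" if selector & 0x4 else "GDT",  # table indicator bit
        "rpl": selector & 0x3,                        # requested privilege level
    }


print(decode_selector(0x08))  # {'index': 1, 'table': 'GDT', 'rpl': 0}
print(decode_selector(0x23))  # {'index': 4, 'table': 'GDT', 'rpl': 3}
```

Since every descriptor is 8 bytes long, masking off the lowest three bits of a selector directly yields the byte offset of the descriptor inside its table.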

Code segments (and also data segments) can be assigned one of four protection levels. Ring 0 is traditionally the kernel protection level. Code running in ring 0 can execute all instructions (except when running under a virtualization hypervisor). Rings 1 and 2 are rarely used and cannot execute privileged instructions. Ring 3 is the typical user code level and cannot execute privileged instructions either. Switching to another protection level is normally done via an interrupt or an interrupt return. The sysenter/sysexit or syscall/sysret instructions can also be used for a protection level transition. A transition from a numerically lower to a numerically higher level is also possible via a far return.

To enter protected mode one has to:

Load a GDT into the GDTR that contains at least a 32 bit code segment and a 32 bit data segment that also serves as the stack segment
Load an IDT into the IDTR that contains the interrupt vectors. Each vector contains the handler address and the selector of the code segment (and thereby the protection ring) the handler runs in, as well as the protection level required to trigger it from software
Disable interrupts, reprogram the PICs or APICs
Set the PE bit in CR0
Jump to the 32 bit code segment using a far jump and reload the data and stack segment registers
Re-enable interrupts
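To make the IDT entries from the list above more concrete, here is a sketch that packs a 32 bit interrupt gate. The function name and the example handler address are assumptions made for illustration; the field layout follows the usual protected mode gate format (type 0xE for a 32 bit interrupt gate).

```python
CR0_PE = 1 << 0  # protection enable bit in CR0


def encode_interrupt_gate(handler: int, selector: int, dpl: int = 0) -> int:
    """Pack a 32 bit interrupt gate into its 64 bit IDT representation."""
    type_attr = 0x80 | ((dpl & 0x3) << 5) | 0x0E  # present, DPL, gate type
    value = handler & 0xFFFF                      # handler offset bits 0..15
    value |= (selector & 0xFFFF) << 16            # code segment selector
    value |= type_attr << 40                      # byte 4 stays reserved (zero)
    value |= ((handler >> 16) & 0xFFFF) << 48     # handler offset bits 16..31
    return value


# Hypothetical handler at 0x00101234 inside the ring 0 code segment 0x08.
kernel_gate = encode_interrupt_gate(0x00101234, 0x08)
# Same handler, but callable via a software interrupt from ring 3.
syscall_gate = encode_interrupt_gate(0x00101234, 0x08, dpl=3)

print(f"{kernel_gate:#018x}")   # 0x00108e0000081234
print(f"{syscall_gate:#018x}")  # 0x0010ee0000081234
```

The DPL stored in the gate only restricts who may invoke the vector with a software interrupt; the ring the handler runs in is determined by the code segment the selector points to.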

Additionally, protected mode supports paging, which introduces an additional layer of indirection. With paging, all virtual addresses can be remapped to other physical addresses through page tables, using 4 KByte (default), 2 MByte or 4 MByte pages. Using this mechanism together with the physical address extension (PAE), one can even use a virtual 32 bit address space to reference into a 36 bit physical address space. Paging also allows implementing virtual memory by raising a page fault exception whenever a page that is not marked present is accessed. It also supports two protection levels: supervisor mode (rings 0 to 2) and user mode (ring 3).
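For the classic case of 4 KByte pages without PAE, a 32 bit virtual address is split into a 10 bit page directory index, a 10 bit page table index and a 12 bit offset. The following sketch models this walk with nested dictionaries instead of real page tables; PageFault, split_virtual_address and translate are hypothetical names, and a missing dictionary entry stands in for a page that is not present.

```python
from typing import Dict, Tuple


class PageFault(Exception):
    """Raised when the accessed page has no mapping (not present)."""


def split_virtual_address(vaddr: int) -> Tuple[int, int, int]:
    """Return (page directory index, page table index, offset in page)."""
    return (vaddr >> 22) & 0x3FF, (vaddr >> 12) & 0x3FF, vaddr & 0xFFF


def translate(page_directory: Dict[int, Dict[int, int]], vaddr: int) -> int:
    """Walk the two-level table and return the physical address."""
    pd_index, pt_index, offset = split_virtual_address(vaddr)
    try:
        frame = page_directory[pd_index][pt_index]  # 4 KByte aligned frame
    except KeyError:
        raise PageFault(hex(vaddr)) from None
    return frame | offset


# Map the virtual page at 0xC0101000 to the physical frame at 0x00200000.
page_directory = {0x300: {0x101: 0x00200000}}
print(hex(translate(page_directory, 0xC0101234)))  # 0x200234
```

An operating system would handle the page fault by loading the page from disk or by terminating the offending process; in this sketch it simply surfaces as an exception.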

Protected mode might also be used segmented, with different memory models as described above. But since it makes the lives of compiler developers way easier, the flat memory model, where CS, DS, ES, FS, GS and SS all cover the same address range, has become the most frequently used. This of course also requires some additional mechanisms to protect against stack overflow bugs. These would be no problem with a limited stack segment that doesn’t overlap code or data regions, but then one would have to use different pointer types for local variables and heap variables or constants when passing them through code, or perform some special arithmetic in such cases.

Virtual 8086 mode

Since some legacy code might be required on a system running in 32 bit mode, there is a special mode: virtual 8086 mode. It runs 16 bit code with real mode style addressing as a ring 3 task under the control of a 32 bit ring 0 monitor. It’s sometimes used to run legacy programs on 32 bit operating systems, and also in bootloaders like FreeBSD’s BTX to run 16 bit BIOS code from 32 bit protected mode (one has to take some precautions there because some BIOS routines also do some tricks to access high memory regions).

Long mode (64 Bit)

This is the native 64 bit mode. It’s similar to 32 bit mode, and one has to switch through 32 bit protected mode to enter 64 bit mode. The main difference is that flat addressing has been made the only choice for CS, DS, ES and SS: their base is treated as zero and their limit is not checked. Only FS and GS can still use a base that’s not zero. One cannot run virtual 8086 tasks when running the processor in long mode, but one can use 32 bit code segments (compatibility mode) for 32 bit protected mode applications, so a processor running in long mode can run 32 bit and 64 bit code side by side.
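The special role of FS and GS can be summarized in a tiny model of the address calculation in 64 bit mode. This is a simplified sketch: the function name is made up, the two bases are passed in as plain parameters, and checks such as canonical address validation are left out.

```python
def long_mode_linear_address(segment: str, offset: int,
                             fs_base: int = 0, gs_base: int = 0) -> int:
    """Compute the linear address for an access through the given segment."""
    # Only FS and GS contribute a base; all other segments are flat.
    base = {"FS": fs_base, "GS": gs_base}.get(segment, 0)
    return (base + offset) & 0xFFFFFFFFFFFFFFFF


print(hex(long_mode_linear_address("DS", 0x1000)))  # 0x1000
print(hex(long_mode_linear_address("GS", 0x28, gs_base=0x7FFF00000000)))
# 0x7fff00000028
```

This is why operating systems keep using these two registers as anchors for per-thread or per-CPU data even though segmentation is otherwise gone.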

System management mode (SMM)

This is a special mode that’s normally only used by firmware developers. Its execution environment resembles real mode with segment limits extended to 4 GB, without any protection. Entering it also triggers some hardware changes, like changing the mapping of some address ranges. The code running in SMM is normally hidden from “normal” memory regions: its memory is not accessible in any way from normal 16 bit or 32/64 bit modes. One enters SMM only by triggering a system management interrupt (SMI). Code running in SMM might handle power management, thermal control, remote management, etc. Normally one is not able to inject code into SMM, since it’s considered part of the trusted and unmodifiable base of the system.