Memory management unit

A memory management unit (MMU), sometimes called paged memory management unit (PMMU),

main memory

.

In modern systems, programs generally have addresses that access the theoretical maximum memory of the computer architecture, 32 or 64 bits. The MMU maps the addresses from each program into separate areas in physical memory, which is generally much smaller than the theoretical maximum. This is possible because programs rarely use large amounts of memory at any one time.

Most modern

hard drive if it has been modified since it was read in, read the page from backing storage into that block, and set up the MMU to map the block to the originally requested page so the program can use it. This is known as demand paging. Some simpler real-time operating systems do not support virtual memory and do not need an MMU, but still need a hardware memory protection unit.

Modern MMUs generally perform additional memory-related tasks as well.

bus arbitration

, controlling access to the memory bus among the many parts of the computer that desire access.

Prior to VM systems becoming widespread in the 1990s, earlier MMU designs were more varied. Common among these was paged translation, which was similar to modern demand paging in that it used fixed-size blocks, but had a fixed-size list of pages that divided up memory; this meant that the block size was a function of the number of pages and the installed memory. Another common technique, found mostly on larger machines, was segmented translation, which allowed for variable-size blocks of memory that better mapped onto program requests. This was efficient but did not map as well onto virtual memory. Some early systems, especially 8-bit systems, used very simple MMUs to perform bank switching.

Overview

Modern MMUs typically divide the virtual

pages, each having a size which is a power of 2, usually a few kilobytes, but they may be much larger. Programs reference memory using the natural address size of the machine, typically 32 or 64 bits in modern systems. The bottom bits of the address (the offset within a page) are left unchanged. The upper address bits are the virtual page numbers.^[3]

Most MMUs use an in-memory table of items called a

page table entry (PTE) per virtual page, to map virtual page numbers to physical page numbers in main memory. Multi-level page tables are often used to reduce the size of the page table. An associative cache of PTEs is called a translation lookaside buffer (TLB) and is used to avoid the necessity of accessing the main memory every time a virtual address is mapped.^[4]

Other MMUs may have a private array of memory,^[5] registers,^[6] or static RAM^[7] that holds a set of mapping information.

The virtual page number may be directly used as an index into the page table or other mapping information, or it may be further divided, with bits at a given level used as an index into a table of lower-level tables into which bits at the next level down are used as an index, with two or more levels of indexing.

The physical page number is combined with the page offset to give the complete physical address.^[3]
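The split into page number and offset described above can be illustrated with a minimal sketch. It assumes 4 KiB pages and models the page table as a plain dictionary; the names `translate` and `table` are illustrative and do not correspond to any particular MMU.

```python
PAGE_SHIFT = 12                    # assumed page size: 4 KiB (2**12 bytes)
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1


def translate(vaddr: int, page_table: dict) -> int:
    """Map a virtual address to a physical one via a flat page table."""
    vpn = vaddr >> PAGE_SHIFT      # upper bits: virtual page number
    offset = vaddr & OFFSET_MASK   # lower bits pass through unchanged
    if vpn not in page_table:
        # No mapping: a real MMU would raise a page fault here.
        raise LookupError(f"page fault on virtual page {vpn:#x}")
    # Combine the physical page number with the untouched offset.
    return (page_table[vpn] << PAGE_SHIFT) | offset


# Hypothetical mapping: virtual page 0x2 is held in physical page 0x5.
table = {0x2: 0x5}
print(hex(translate(0x2ABC, table)))   # 0x5abc
```

A hardware implementation would consult the TLB first and walk the in-memory table only on a miss; the dictionary lookup stands in for both steps.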

A page table entry or other per-page information may also include information about whether the page has been written to (the

supervisor mode) may read and write it, and whether it should be cached.^[8]

Sometimes, a page table entry or other per-page information prohibits access to a particular virtual page, perhaps because no physical

paging). With some MMUs, there can also be a shortage of PTEs, in which case the OS will have to free one for the new mapping.^[8]^[3]

The MMU may also generate illegal access error conditions or invalid page faults upon illegal or non-existing memory accesses, respectively, leading to segmentation fault or bus error conditions when handled by the operating system.

Benefits

In some cases, a page fault may indicate a software bug. Guarding against such bugs is one of the key benefits of an MMU: memory protection lets an operating system contain errant programs by disallowing access to memory that a particular program should not have access to. Typically, an operating system assigns each program its own virtual address space.^[3]

A paged MMU also mitigates the problem of

paging.^[8]^[3]

However, paged mapping causes another problem, internal fragmentation. This occurs when a program requests a block of memory that does not cleanly map into a page, for instance, if a program requests a 1 KB buffer to perform file work. In this case, the request results in an entire page being set aside even though only 1 KB of the page will ever be used; if pages are larger than 1 KB, the remainder of the page is wasted. If many small allocations of this sort are made, memory can be used up even though much of it remains empty.^[4]
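The arithmetic behind internal fragmentation can be sketched as follows. The 4 KiB page size in the example is an assumption chosen for illustration, and `wasted_bytes` is a hypothetical helper rather than part of any allocator.

```python
def wasted_bytes(request: int, page_size: int) -> int:
    """Bytes left unused when a request is rounded up to whole pages."""
    pages = -(-request // page_size)       # ceiling division
    return pages * page_size - request


# A 1 KB buffer on a system with (assumed) 4 KB pages wastes 3 KB.
print(wasted_bytes(1024, 4096))            # 3072
```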

In some early microprocessor designs, memory management was performed by a separate integrated circuit such as the VLSI Technology VI475 (1986), the Motorola 68851 (1984) used with the Motorola 68020 CPU in the Macintosh II, or the Z8010^[9] and Z8015 (1985)^[10]^[11] used with the Zilog Z8000 family of processors. Later microprocessors (such as the Motorola 68030 and the Zilog Z280) placed the MMU together with the CPU on the same integrated circuit, as did the Intel 80286 and later x86 microprocessors.

Older concepts

While this article concentrates on modern MMUs, commonly based on demand paging, early systems used
segmentation, or used a fixed set of blocks instead of loading them on demand. The difference between these two approaches is the size of the contiguous block of memory; paged systems break up main memory into a series of equal sized blocks, while segmented systems generally allow for variable sizes.^[4]

Segmented translation

Early memory management systems, often implemented in software, set aside a portion of memory to hold a series of mappings. These consisted of pairs of values, the base and limit, although many other terms have been used. When the operating system requested memory to load a program, or a program requested more memory to hold data from a file for instance, it would call the memory handling library. This examined the mappings to look for an area in main memory large enough to hold the request. If such a block was found, a new entry was entered into the table. From then on, when that program accessed memory, all of its addresses were offset by the base value. When the program was done with the memory it had requested and released it, or when the program exited, the entries associated with it were released.^[4]
As an example, consider a popular
core memory. The introduction of semiconductor-based memories led to rapidly falling costs and a corresponding increase in memory capacity. The PDP-11 was modified by adding a base address that was pre-pended to the original 16-bit address to produce a larger 18 or 22-bit address into physical memory. This allowed the system to address 256 KB or 4 MB of RAM, requiring only minor changes to the programs to allow them to set up the base and bounds by sending values to the new PAR registers (page address register) and PDR (page description register), which was typically handled by the operating system.^[12]

This style of access, over time, became known as segmented translation, although a variety of terms are used here as well. This style has the advantage of simplicity; the memory blocks are contiguous and thus only the two values, base and limit, need to be stored. Each entry corresponds to a block of memory used by a single program, and the translation is invisible to the program, which sees main memory starting at address zero and extending to some fixed value.^[4]
The disadvantage of this approach is that it leads to an effect known as external fragmentation. This occurs when memory allocations are released but are non-contiguous. In this case, enough memory may be available to handle a request, but this is spread out and cannot be allocated. On systems where programs start and stop over time, this can eventually lead to memory being highly fragmented and no large blocks remaining. A number of algorithms were developed to address this problem.^[4]
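Base-and-limit translation and the external fragmentation it can cause may be sketched as follows. This is a simplified model under stated assumptions: the names `Segment`, `translate` and `can_allocate` are hypothetical, and free memory is represented as a list of `(start, length)` holes.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base: int    # physical address where the block starts
    limit: int   # length of the block in bytes


def translate(seg: Segment, addr: int) -> int:
    """Offset a program address by the base after a bounds check."""
    if not 0 <= addr < seg.limit:
        raise MemoryError(f"address {addr:#x} is outside the segment")
    return seg.base + addr


def can_allocate(holes: list, size: int) -> bool:
    """A request succeeds only if a single free hole is large enough."""
    return any(length >= size for _start, length in holes)


print(hex(translate(Segment(base=0x4000, limit=0x1000), 0x10)))   # 0x4010

# Two 100-byte holes hold 200 free bytes in total, but a 150-byte
# request fails because neither hole is contiguous enough.
holes = [(0, 100), (300, 100)]
print(can_allocate(holes, 150))   # False
```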
Segmenting was widely used on
Macintosh platform. Each program was allocated an amount of memory that was pre-selected in the Finder and translation from virtual to physical was accomplished within the programs using handles.^[13]

A more common example is the
x86 architecture
series used different approaches.

Segmentation plus paging

Some systems, such as the GE 645 and its successors, used both segmentation and paging. The table of segments, instead of containing per-segment entries giving the physical base address and length of the segment, contains entries giving the physical base address of a page table for the segment, in addition to the length of the segment. Physical memory is divided into fixed-size pages, and the same techniques used for purely page-based demand paging are used for segment-and-page-based demand paging.

Paged translation

Another approach to memory handling is to break up main memory into a contiguous series of fixed-sized blocks. This is similar to the modern demand paging system in that the result is a series of pages, but in these earlier systems the list of pages is fixed in size and normally stored in some form of fast memory like static RAM to improve performance. In this case, the two parts of the address stored by the MMU are known as the segment number and page index.^[4]

Consider a processor design with 24-bit addressing, like the original Motorola 68000. In such a system, the MMU splits the virtual address into parts, for instance, the 13 least significant bits for the page index and the remaining 11 most significant bits as the segment number. This results in a list of 2048 pages of 8 kB each.^[4] In this approach, memory requests result in one or more pages being granted to that program, which may not be contiguous in main memory. The MMU maintains a list pairing each page number originally expressed by the program with the actual page number in main memory. When the program attempts to access memory, the MMU reads the segment number from the processor's memory bus, finds the corresponding entry for that program in its internal memory, and expresses the mapped version of the value on the memory's bus while the lower bits of the original address are passed through unchanged. As in the segmented case, programs see their memory as a single contiguous block.^[4]
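The 11/13-bit split described for a 24-bit address can be sketched as follows. The mapping is modelled as a dictionary from the program's segment number to a physical page number; `split`, `translate` and the example mapping are illustrative assumptions rather than the behaviour of a specific chip.

```python
PAGE_INDEX_BITS = 13                       # low bits: offset within an 8 kB page
SEGMENT_BITS = 11                          # high bits of a 24-bit address
PAGE_INDEX_MASK = (1 << PAGE_INDEX_BITS) - 1


def split(addr: int) -> tuple:
    """Return (segment number, page index) for a 24-bit address."""
    return addr >> PAGE_INDEX_BITS, addr & PAGE_INDEX_MASK


def translate(addr: int, mapping: dict) -> int:
    """Replace the segment number; pass the page index through unchanged."""
    segment, index = split(addr)
    return (mapping[segment] << PAGE_INDEX_BITS) | index


print(1 << SEGMENT_BITS, 1 << PAGE_INDEX_BITS)    # 2048 8192
print([hex(part) for part in split(0x123456)])    # ['0x91', '0x1456']
# Hypothetical mapping: program page 0x91 is held in physical page 0x03.
print(hex(translate(0x123456, {0x91: 0x03})))     # 0x7456
```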
There are two disadvantages to this approach. The first is that as the virtual address space expands, the amount of memory needed to hold the mapping increases as well. For instance, in the 68020 the addresses are 32 bits wide, meaning the segment number for the same 8 kB page size is now the upper 19 bits and the mapping table expands to 512 kB in size,^[4] far beyond what could be implemented in hardware for reasonable cost in the 1980s. This problem can be reduced by making the pages larger, say 64 kB instead of 8 kB. Now the page index uses 16 bits and the resulting page table is 64 kB, which is more tractable. Moving to a larger page size leads to the second problem, increased internal fragmentation. A program that generates a series of requests for small blocks will be assigned large blocks and thereby waste large amounts of memory.^[4]
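The growth of the mapping table can be checked with a short calculation. The sketch counts table entries rather than bytes, since the size of each entry is not given here; `table_entries` is a hypothetical helper.

```python
def table_entries(address_bits: int, page_size: int) -> int:
    """Number of pages, and so mapping entries, in a flat address space."""
    assert page_size & (page_size - 1) == 0, "page size must be a power of two"
    offset_bits = page_size.bit_length() - 1
    return 1 << (address_bits - offset_bits)


print(table_entries(24, 8 * 1024))     # 2048    (11-bit segment number)
print(table_entries(32, 8 * 1024))     # 524288  (19-bit segment number)
print(table_entries(32, 64 * 1024))    # 65536   (16-bit segment number)
```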

The paged translation approach was widely used by microprocessor MMUs in the 1970s and early 80s, including the Zilog Z8000's Z8010 and the Signetics 68905 (which could operate in either mode). Both Signetics and Philips produced a version of the 68000 that combined the 68905 on the same physical chip as the processor, the 68070.^[4]
Another use of this technique is to expand the size of the physical address when the virtual address is too small. For instance, the
processor bus to indicate which program was accessing memory.^[15]

Another use of this same technique, although not referred to as paging but
Atari 130XE to 128 kB.^[16] The Commodore 128 used a similar approach.

Examples

Most modern systems divide memory into pages that are 4–64
trap
into the OS when a page translation is not found in the TLB. Most systems use a hardware-based tree walker. Most systems allow the MMU to be disabled, but some disable the MMU when trapping into OS code.

IBM System/360 Model 67, IBM System/370, and successors

The
paravirtualization
, easier.
Starting in August, 1972, the
64-bit z/Architecture
was introduced, with the address space expanded to 64 bits; those continue to store the accessed and dirty bits outside the page table.

VAX

VAX pages are 512 bytes,^[19]^{: 199} which is very small. An OS may treat multiple pages as if they were a single larger page. For example, Linux on VAX groups eight pages together. Thus, the system is viewed as having 4 KB pages. The VAX divides memory into four fixed-purpose regions, each 1 GB in size. They are:^[19]^{: 200–201}

P0 space

Used for general-purpose per-process memory such as heaps.

P1 space

(Or control space) which is also per-process and is typically used for supervisor, executive, kernel, user stacks and other per-process control structures managed by the operating system.

S0 space

(Or system space) which is global to all processes and stores operating system code and data, whether paged or not, including page tables.

S1 space

Which is unused and "Reserved to Digital".^[19]^{: 200–201}
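Because the four regions are each 1 GB of a 32-bit address space, the top two address bits are enough to select the region. The following sketch assumes the conventional assignment (00 for P0, 01 for P1, 10 for S0, 11 for S1) and 512-byte pages; `decode` is a hypothetical helper.

```python
REGIONS = ("P0", "P1", "S0", "S1")   # assumed order for top bits 00, 01, 10, 11
PAGE_SHIFT = 9                       # 512-byte pages


def decode(vaddr: int) -> tuple:
    """Return (region, page number within region, byte offset in page)."""
    region = REGIONS[(vaddr >> 30) & 0b11]
    page = (vaddr & 0x3FFFFFFF) >> PAGE_SHIFT
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return region, page, offset


print(decode(0x80000200))   # ('S0', 1, 0)
print(decode(0x000003FF))   # ('P0', 1, 511)
```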

Page tables are big linear arrays.
tree, allowing applications to have sparse memory layout without wasting a lot of space on unused page table entries. Unlike page table entries in most MMUs, page table entries in the VAX MMU lack an accessed bit.^[19]^{: 203–205} OSes which implement paging must find some way to emulate the accessed bit if they are to operate efficiently. Typically, the OS will periodically unmap pages so that page-not-present faults can be used to let the OS set an accessed bit.

ARM

PTEs
for describing 4 KB and 64 KB pages, 1 MB sections and 16 MB super-sections; legacy versions also defined a 1 KB tiny page. ARM uses a two-level page table if using 4 KB and 64 KB pages, or just a one-level page table for 1 MB sections and 16 MB super-sections.
TLB updates are performed automatically by page table walking hardware. PTEs include read/write access permission based on privilege, cacheability information, an NX bit, and a non-secure bit.^[20]

DEC Alpha

DEC Alpha processors divide memory into 8 KB, 16 KB, 32 KB, or 64 KB pages; the page size is dependent on the processor.^[21]^{: 3–2}^[22]^{: 3–2} After a TLB miss, low-level firmware machine code (here called PALcode) walks a page table.
The OpenVMS AXP PALcode and DEC OSF/1 PALcode walk a three-level tree-structured page table. Addresses are broken down into an unused set of bits (containing the same value as the uppermost bit of the index into the root level of the tree), a set of bits to index the root level of the tree, a set of bits to index the middle level of the tree, a set of bits to index the leaf level of the tree, and remaining bits that pass through to the physical address without modification, indexing a byte within the page. The sizes of the fields are dependent on the page size; all three tree index fields are the same size.^[21]^{: 3-2–3-3}^[22]^{: 3-1–3-2} The OpenVMS AXP PALcode supports full read and write permission bits for user, supervisor, executive, and kernel modes, and also supports fault on read/write/execute bits.^[21]^{: 3-3–3-6} The DEC OSF/1 PALcode supports full read and write permission bits for user and kernel modes, and also supports fault on read/write/execute bits.^[22]^{: (II-B) 3-3–3-6}
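How the field sizes follow from the page size in the three-level scheme can be sketched as follows. The sketch assumes that each level of the tree occupies exactly one page and that a page table entry is 8 bytes; both are assumptions made for illustration, and `field_sizes` and `split` are hypothetical helpers.

```python
def field_sizes(page_size: int, pte_size: int = 8) -> tuple:
    """Return (offset bits, bits per index field, total translated bits)."""
    offset_bits = page_size.bit_length() - 1
    index_bits = (page_size // pte_size).bit_length() - 1
    return offset_bits, index_bits, offset_bits + 3 * index_bits


def split(vaddr: int, page_size: int) -> tuple:
    """Return (root index, middle index, leaf index, byte offset)."""
    offset_bits, index_bits, _total = field_sizes(page_size)
    mask = (1 << index_bits) - 1
    leaf = (vaddr >> offset_bits) & mask
    middle = (vaddr >> (offset_bits + index_bits)) & mask
    root = (vaddr >> (offset_bits + 2 * index_bits)) & mask
    return root, middle, leaf, vaddr & (page_size - 1)


print(field_sizes(8 * 1024))     # (13, 10, 43)
print(field_sizes(64 * 1024))    # (16, 13, 55)
print(split((1 << 33) | (2 << 23) | (3 << 13) | 5, 8 * 1024))   # (1, 2, 3, 5)
```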
The Windows NT AXP PALcode can either walk a single-level page table in a virtual address space or a two-level page table in physical address space. The upper 32 bits of an address are ignored. For a single-level page table, addresses are broken down into a set of bits to index the page table and remaining bits that pass through to the physical address without modification, indexing a byte within the page. For a two-level page table, addresses are broken down into a set of bits to index the root level of the tree, a set of bits to index the top level of the tree, a set of bits to index the leaf level of the tree, and remaining bits that pass through to the physical address without modification, indexing a byte within the page. The sizes of the fields are dependent on the page size.^[23]^{: 3-2–3-4} The Windows NT AXP PALcode supports a page being accessible only from kernel mode or being accessible from user and kernel mode, and also supports a fault on write bit.^[23]^{: 3-5}

MIPS

The MIPS architecture supports one to 64 entries in the TLB. The number of TLB entries is configurable at CPU configuration before synthesis. TLB entries are dual. Each TLB entry maps a virtual page number (VPN2) to either one of two page frame numbers (PFN0 or PFN1), depending on the least significant bit of the virtual address that is not part of the page mask. This bit and the page mask bits are not stored in the VPN2. Each TLB entry has its own page size, which can be any value from 1 KB to 256 MB in multiples of four. Each PFN in a TLB entry has a caching attribute, a dirty and a valid status bit. A VPN2 has a global status bit and an OS assigned ID which participates in the virtual address TLB entry match, if the global status bit is set to zero. A PFN stores the physical address without the page mask bits.
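The choice between the two page frame numbers of a dual TLB entry can be sketched as follows. This is a simplified model: it assumes the entry has already matched, treats each PFN as a frame number in units of the entry's own page size, and uses hypothetical names (`select_frame`, `page_shift`).

```python
def select_frame(vaddr: int, page_shift: int, pfn0: int, pfn1: int) -> int:
    """Pick PFN0 or PFN1 by the lowest address bit above the page offset."""
    odd = (vaddr >> page_shift) & 1          # even or odd page of the pair
    pfn = pfn1 if odd else pfn0
    offset = vaddr & ((1 << page_shift) - 1)
    return (pfn << page_shift) | offset


# With 4 KB pages (page_shift=12), bit 12 selects the half of the entry.
print(hex(select_frame(0x0234, 12, pfn0=0x10, pfn1=0x20)))   # 0x10234
print(hex(select_frame(0x1234, 12, pfn0=0x10, pfn1=0x20)))   # 0x20234
```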
A TLB refill exception is generated when there are no entries in the TLB that match the mapped virtual address. A TLB invalid exception is generated when there is a match but the entry is marked invalid. A TLB modified exception is generated when a store instruction references a mapped address and the matching entry's dirty status is not set. If a TLB exception occurs when processing a TLB exception, a double fault TLB exception, it is dispatched to its own exception handler.
MIPS32 and MIPS32r2 support 32 bits of virtual address space and up to 36 bits of physical address space. MIPS64 supports up to 64 bits of virtual address space and up to 59 bits of physical address space.

Sun MMU

The original
I/O
and the Multibus I/O runs through the MMU, where address translation and protection are done in a uniform fashion. The MMU is implemented in hardware on the CPU board.
The MMU consists of a context register, a segment map and a page map. Virtual addresses from the CPU are translated into intermediate addresses by the segment map, which in turn are translated into physical addresses by the page map. The page size is 2 KB and the segment size is 32 KB which gives 16 pages per segment. Up to 16 contexts can be mapped concurrently. The maximum logical address space for a context is 1024 pages or 2 MB. The maximum physical address that can be mapped simultaneously is also 2 MB.
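The two-stage translation through the segment map and the page map can be sketched as follows. The field widths follow the figures above (2 KB pages, 16 pages per segment, a 2 MB context), but the layout of the maps is a simplifying assumption: the segment map is modelled as yielding the index of a group of 16 page map entries, and all names are hypothetical.

```python
PAGE_BITS = 11            # 2 KB pages
PAGE_IN_SEG_BITS = 4      # 16 pages per 32 KB segment


def translate(context: int, vaddr: int, segment_map: dict, page_map: dict) -> int:
    """Translate a 21-bit virtual address within one of 16 contexts."""
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    page = (vaddr >> PAGE_BITS) & ((1 << PAGE_IN_SEG_BITS) - 1)
    segment = vaddr >> (PAGE_BITS + PAGE_IN_SEG_BITS)
    # Stage 1: the context register and segment number select a group.
    group = segment_map[(context, segment)]
    # Stage 2: the group and page number select a physical page.
    frame = page_map[group * 16 + page]
    return (frame << PAGE_BITS) | offset


vaddr = (2 << 15) | (5 << 11) | 0x123        # segment 2, page 5, offset 0x123
segment_map = {(3, 2): 7}                    # hypothetical contents
page_map = {7 * 16 + 5: 0x40}
print(hex(translate(3, vaddr, segment_map, page_map)))   # 0x20123
```

Switching processes in this model only changes the `context` argument; the maps themselves are left untouched, which is the property the context register provides.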
The context register is important in a multitasking operating system because it allows the CPU to switch between processes without reloading all the translation state information. The 4-bit context register can switch between 16 sections of the segment map under supervisor control, which allows 16 contexts to be mapped concurrently. Each context has its own virtual address space. Sharing of virtual address space and inter-context communications can be provided by writing the same values into the segment or page maps of different contexts. Additional contexts can be handled by treating the segment map as a context cache and replacing out-of-date contexts on a least-recently used basis.
The context register makes no distinction between user and supervisor states. Interrupts and traps do not switch contexts, which requires that all valid interrupt vectors always be mapped in page 0 of context, as well as the valid supervisor stack.^[24]
The Sun-2 workstations are similar; they are built around the Motorola 68010 microprocessor and have a similar memory management unit, with 2 KB pages and 32 KB segments. The context register has a 3-bit system context used in supervisor state and a 3-bit user context used in user state.^[25]
The Sun-3 workstations, except for the Sun-3/80, Sun-3/460, Sun-3/470, and Sun-3/480, are built around the Motorola 68020, and have a similar memory management unit. The page size is increased to 8 KB. (The later models are built around the Motorola 68030 and use the 68030's on-chip MMU.)
The Sun-4 workstations are built around various SPARC microprocessors, and have a memory management unit similar to that of the Sun-3 workstations.

PowerPC

In PowerPC G1, G2, G3, and G4 pages are normally 4 KB. After a TLB miss, the standard PowerPC MMU begins two simultaneous lookups. One lookup attempts to match the address with one of four or eight data block address translation (DBAT) registers, or four or eight instruction block address translation registers (IBAT), as appropriate. The BAT registers can map linear chunks of memory as large as 256 MB, and are normally used by an OS to map large portions of the address space for the OS kernel's own use. If the BAT lookup succeeds, the other lookup is halted and ignored.
The other lookup, not directly supported by all processors in this family, is via a so-called inverted page table, which acts as a hashed off-chip extension of the TLB. First, the top four bits of the address are used to select one of 16 segment registers. Then 24 bits from the segment register replace those four bits, producing a 52-bit address. The use of segment registers allows multiple processes to share the same hash table.
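The formation of the 52-bit address from a 32-bit address and a segment register can be sketched as follows. Only the bit substitution described above is modelled; the hash function itself is not shown, and `virtual_address` and the register contents are illustrative assumptions.

```python
def virtual_address(ea: int, segment_registers: list) -> int:
    """Replace the top 4 bits of a 32-bit address with 24 segment bits."""
    index = (ea >> 28) & 0xF                      # selects one of 16 registers
    segment_bits = segment_registers[index] & 0xFFFFFF
    return (segment_bits << 28) | (ea & 0x0FFFFFFF)


registers = [0] * 16
registers[3] = 0xABCDEF                           # hypothetical register value
va = virtual_address(0x30001234, registers)
print(hex(va), va.bit_length())                   # 0xabcdef0001234 52
```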
The 52-bit address is hashed, then used as an index into the off-chip table. There, a group of eight page table entries is scanned for one that matches. If none match due to excessive hash collisions, the processor tries again with a slightly different hash function. If this, too, fails, the CPU traps into the OS (with MMU disabled) so that the problem may be resolved. The OS needs to discard an entry from the hash table to make space for a new entry. The OS may generate the new entry from a more normal tree-like page table or from per-mapping data structures which are likely to be slower and more space-efficient. Support for no-execute control is in the segment registers, leading to 256 MB granularity.
A major problem with this design is poor cache locality caused by the hash function. Tree-based designs avoid this by placing the page table entries for adjacent pages in adjacent locations. An operating system running on the PowerPC may minimize the size of the hash table to reduce this problem.
It is also somewhat slow to remove the page table entries of a process. The OS may avoid reusing segment values to delay facing this, or it may elect to suffer the waste of memory associated with per-process hash tables. G1 chips do not search for page table entries, but they do generate the hash, with the expectation that an OS will search the standard hash table via software. The OS can write to the TLB. G2, G3, and early G4 chips use hardware to search the hash table. The latest chips allow the OS to choose either method. On chips that make this optional or do not support it at all, the OS may choose to use a tree-based page table exclusively.

x86

The x86 architecture has evolved over a very long time while maintaining full software compatibility, even for OS code. Thus, the MMU is extremely complex, with many different possible operating modes.
The
80186
/80188 have no memory management unit; they support segmentation, but only to support more physical memory than a 16-bit address can support, as the segment number, in a segment register, is multiplied by 16 and added to the segment offset to generate a physical address.
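The segment arithmetic described above can be sketched in a few lines. The sketch assumes 16-bit segment and offset values and a 20-bit physical address, so results wrap at 1 MB; `physical_address` is a hypothetical helper.

```python
def physical_address(segment: int, offset: int) -> int:
    """Multiply the segment number by 16 and add the offset."""
    # The mask models an assumed 20-bit address bus.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0x1234, 0x0010)))   # 0x12350
print(hex(physical_address(0xFFFF, 0x0010)))   # 0x0 (wraps past 1 MB)
```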
The 80286 added an MMU that supports segmentation, but not paging. When segmentation is enabled by turning on protected mode, the segment number acts as an index into a table of segment descriptors; a segment descriptor contains a base physical address, a segment length, a presence bit to indicate whether the segment is currently in memory, permission bits, and control bits. If the offset in the segment is within the bounds specified by the segment descriptor, that offset is added to the base physical address to generate a physical address.
The 80386, which introduced the 32-bit IA-32 version of x86, and subsequent x86 CPUs, support segmentation and paging. If paging is enabled, the base address in a segment descriptor is an address in a linear paged address space divided into 4 KB pages, so when that is added to the offset in the segment, the resulting address is a linear address in that address space; in IA-32, that address is then masked to be no larger than 32 bits. The result may be looked up via a tree-structured page table, with the bits of the address being split as follows: 10 bits for the branch of the tree, 10 bits for the leaves of the branch, and the 12 lowest bits being directly copied to the result.
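The 10/10/12 split and the resulting two-level lookup can be sketched as follows. The tree is modelled with nested dictionaries instead of in-memory tables, permission and presence bits are omitted, and the names `split` and `walk` are illustrative.

```python
def split(linear: int) -> tuple:
    """Return (branch index, leaf index, offset) of a 32-bit linear address."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def walk(linear: int, directory: dict) -> int:
    """Follow the branch, then the leaf; copy the low 12 bits unchanged."""
    branch, leaf, offset = split(linear)
    frame = directory[branch][leaf]       # a KeyError stands in for a page fault
    return (frame << 12) | offset


print(split(0x00403025))                  # (1, 3, 37)
# Hypothetical tree: branch 1, leaf 3 maps to physical page 0x9A.
print(hex(walk(0x00403025, {1: {3: 0x9A}})))   # 0x9a025
```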
Segment registers, used in pre-80386 CPUs to extend the address space, are not used in modern OSes, with one major exception: access to
PaX
patches, may also limit the length of the code segment, as specified by the CS register, to disallow execution of code in modifiable regions of the address space.
Minor revisions of the MMU introduced with the
physical address extension (PAE) feature, enabling 36-bit physical addresses with 2+9+9 bits for three-level page tables and 12 lowest bits being directly copied to the result. Large pages (2 MB) are also available by skipping the bottom level of the tree (resulting in 2+9 bits for two-level table hierarchy and the remaining 9+12 lowest bits copied directly). In addition, the page attribute table
allowed specification of cacheability by looking up a few high bits in a small on-CPU table.
PaX
mechanisms described above emulate per-page non-execute support on x86 processors lacking the NX bit by setting the length of the code segment, with a performance loss and a reduction in the available address space.

Heterogeneous System Architecture (HSA) creates a unified virtual address space for CPUs, GPUs and DSPs, obsoleting the mapping tricks and data copying.

x86-64, the 64-bit version of the x86 architecture, almost entirely removes segmentation in favor of the flat memory model used by almost all operating systems for the 386 or newer processors. In long mode, all segment offsets are ignored, except for the FS and GS segments; linear addresses are 64-bit rather than 32-bit, with the lowest 48 bits of the address being significant. When used with 4 KB pages, the page table tree has four levels instead of three, to handle the larger linear addresses; in some newer x86-64 processors, a fifth page table level can be enabled, to support 57-bit linear addresses. In all levels of the page table, the page table entry includes an NX bit.
48-bit linear addresses are divided as follows: 16 bits unused, nine bits each for four tree levels (for a total of 36 bits), and the 12 lowest bits directly copied to the result. With 2 MB pages, there are only three levels of page table, for a total of 27 bits used in paging and 21 bits of offset. Some newer CPUs also support a 1 GB page with two levels of paging and 30 bits of offset.^[26] CPUID can be used to determine if 1 GB pages are supported.
In all three cases, the 16 highest bits are required to be equal to the 48th bit, or in other words, the low 48 bits are sign extended to the higher bits. This is done to allow further expansion of the addressable range, without compromising backwards compatibility.
57-bit linear addresses are divided as follows: 7 bits unused, nine bits each for five tree levels (for a total of 45 bits), and the 12 lowest bits directly copied to the result. The low 57 bits are sign extended.
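The sign-extension rule and the nine-bit index fields can be sketched as follows. The sketch treats addresses as unsigned 64-bit integers and covers only 4 KB pages; `is_canonical` and `indices` are hypothetical helper names.

```python
def is_canonical(addr: int, bits: int = 48) -> bool:
    """True if all bits above bit (bits - 1) are copies of that bit."""
    top = addr >> (bits - 1)                       # the sign bit and everything above
    return top == 0 or top == (1 << (64 - bits + 1)) - 1


def indices(addr: int, levels: int = 4) -> tuple:
    """Return ([index per tree level, root first], 12-bit page offset)."""
    fields = [(addr >> (12 + 9 * i)) & 0x1FF for i in reversed(range(levels))]
    return fields, addr & 0xFFF


print(is_canonical(0x00007FFFFFFFFFFF))            # True
print(is_canonical(0xFFFF800000000000))            # True
print(is_canonical(0x0000800000000000))            # False
print(is_canonical(0xFF00000000000000, bits=57))   # True
print(indices((1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5))   # ([1, 2, 3, 4], 5)
```

Passing `levels=5` yields the five index fields used with 57-bit linear addresses.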

Alternatives

Burroughs B5000 series, B6x00/B7x00/B5900/A-series/Unisys MCP systems

The
Atlas), has no need for an external MMU.^[27]
The B5000 and its successors up to current Unisys ClearPath MCP (Libra) systems provide the two functions of an MMU — virtual memory addresses and memory protection — with a different architectural approach. Rather than add virtual memory onto a processor not designed for virtual memory, it is integrated in the core design of the processor/system so there is no requirement for an external unit to add functionality of address translation or bounds checking, resulting in unprecedented memory safety and security.
First, in the mapping of virtual memory addresses, instead of needing an MMU, the machines are descriptor-based.^[28]^[29] The uppermost bit of a 48-bit memory word in the Burroughs B5000 series systems indicates whether the word is user data or a descriptor/control word; descriptors were read-only to user processes and could only be updated by the system (hardware or operating system). Memory words in the B6x00/B7x00/B5900/A-series/Unisys MCP systems have 48 data bits and 3 tag bits (later systems, up to current ones, have 4 tag bits); words whose tag is an odd number are read-only to user processes. Descriptors have a tag of 5 and code words have a tag of 3.
Each allocated memory block is given a master descriptor with the properties of the block: its physical address, its size, and whether it is present in main memory. If the block is not present, either it has never been allocated, in which case it must be allocated on first access, or the descriptor holds an address on secondary storage from which the block must be loaded into main memory. All references to code and data are made using a descriptor. When a request is made to access the block for reading or writing, the hardware checks its presence via the presence bit (pbit) in the descriptor.
A pbit^[30] of 1 indicates the presence of the block. In this case, the block can be accessed via the physical main-memory address in the descriptor. If the pbit is zero, a pbit interrupt^[31] is generated for the MCP (operating system) to make the block present. If the address field is zero, this is the first access to this block, and it is allocated (an init (initial) pbit). If the address field is non-zero, it is a disk address of the block, which has previously been rolled out — the block is fetched from disk, the pbit is set to one and the physical memory address updated to point to the block in memory. This makes descriptors equivalent to a page-table entry in an MMU system, but descriptors are free of a table.
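The handling of the pbit described above can be sketched as follows. This is a simplified model, not MCP code: the `Descriptor` fields, the `access` function and the `allocate` and `load_from_disk` callbacks are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    pbit: int       # 1 if the block is present in main memory
    address: int    # memory address, disk address, or 0 if never allocated
    size: int


def access(desc: Descriptor, allocate, load_from_disk) -> str:
    """Make the block present if needed and report what had to happen."""
    if desc.pbit == 1:
        return "present"                   # no interrupt is needed
    if desc.address == 0:
        desc.address = allocate(desc.size)                       # first access
        kind = "init pbit"
    else:
        desc.address = load_from_disk(desc.address, desc.size)   # rolled out earlier
        kind = "other pbit"
    desc.pbit = 1
    return kind


block = Descriptor(pbit=0, address=0, size=64)
print(access(block, lambda size: 0x1000, lambda disk, size: 0x2000))   # init pbit
print(access(block, lambda size: 0x1000, lambda disk, size: 0x2000))   # present
```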
System performance can be monitored through the number of pbit interrupts, which is the count of accesses to descriptors with a zero pbit. 'Init' pbits indicate initial allocations or new memory blocks, but a high level of 'other' pbits indicates that the system had to load blocks from virtual memory secondary storage. In normal operation this will not occur very much. If an above-average number of other pbits is occurring, the system might be thrashing, indicating system overload. In production systems, the machine load will be more constant and thus the system can be configured or tuned to avoid thrashing.
All memory allocation is therefore completely automatic (one of the features of modern systems
lazy
, since a block will not be allocated until it is actually referenced. When memory is nearly full, the MCP examines the working set, trying compaction (since the system is segmented, not paged), deallocating read-only segments (such as code-segments which can be restored from their original copy) and, as a last resort, rolling dirty (that is updated) data segments out to disk.
Another way these systems provide a function of an MMU is in protection. Since all accesses are via the descriptor, the hardware can check that all accesses are within bounds and, in the case of a write, that the process has write permission. The MCP system is inherently secure and thus has no need of an MMU to provide this level of memory protection.
Blocks can be shared between processes via copy descriptors in process stacks (a stack is a special system structure representing the execution state of a process). Thus, some processes may have write permission, whereas others do not. Code segments are read-only, and are therefore reentrant and shared between processes. Copy descriptors contain a 20-bit address field giving the index of the master descriptor in the master descriptor array. This also implements a very efficient and secure interprocess communication (IPC) mechanism. Blocks can easily be relocated, since only the master descriptor needs to be updated when a block's status changes.
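The indirection through the master descriptor array can be illustrated with a short sketch. The names below are hypothetical, and only the 20-bit index width is taken from the text; the point is that relocating a block touches one master descriptor while every copy descriptor remains valid.

```python
from dataclasses import dataclass

MASTER_INDEX_BITS = 20  # width of the index field, as stated in the text


@dataclass
class MasterDescriptor:
    address: int  # current location of the block
    length: int


@dataclass(frozen=True)
class CopyDescriptor:
    master_index: int  # index into the master descriptor array
    writable: bool     # per-process permission

    def __post_init__(self):
        if not 0 <= self.master_index < 2 ** MASTER_INDEX_BITS:
            raise ValueError("master_index does not fit in 20 bits")


def resolve(masters, copy):
    """Follow a copy descriptor to the master descriptor it refers to."""
    return masters[copy.master_index]


def relocate(masters, index, new_address):
    """Move a block: only the master descriptor is updated."""
    masters[index].address = new_address
```

For example, a writer and a reader can each hold a `CopyDescriptor` with `master_index=0`; after `relocate(masters, 0, new_address)` both resolve to the new location without being modified themselves.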
Another aspect is performance: do MMU-based or non-MMU-based systems perform better? MCP systems may be implemented on top of standard hardware that does have an MMU (for example, a standard PC). Even if the system implementation uses the MMU in some way, this is not visible at all at the MCP level.

See also

Burroughs large systems descriptors

Memory controller

Memory protection unit

References

[1] Memory Management Unit at the Free On-line Dictionary of Computing

[2] ISBN 978-0-13-600663-3.

[3] Frank Uyeda (2009). "Lecture 7: Memory Management" (PDF). CSE 120: Principles of Operating Systems. UC San Diego. Retrieved 2013-12-04.

[4] Zehr, Gregg (November 1986). "Memory Management Units For 68000 Architectures" (PDF). Byte. pp. 127–135.

[5] Spectra 70 70-46 Processor Manual (PDF). RCA. March 1968. pp. 5–6. Retrieved August 15, 2013.

[6] Reference Manual, SDS 940 Computer (PDF). Scientific Data Systems. 1966. pp. 8–10.

[7] Hardware Engineering Manual for the 2060 CPU Board (PDF). Sun Microsystems. 10 May 1987. pp. 121–129.

[8] This article is based on material taken from Memory+management+unit at the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.

[9] "Z8010 Z8000 MMU Memory Management Unit Product Specification" (PDF). Zilog. April 1985.

[10] "1983/84 Data Book" (PDF). Zilog. pp. 215–234. Retrieved 2021-04-27.

[11] Schmidt, Stephen (April 1983). "Virtual Memory for Microcomputers". Byte. Vol. 8, no. 4. pp. 234–235.

[12] "6". PDP-11/70 Processor Handbook (PDF). DEC. 1977.

[13] MultiFinder User's Guide. Apple. 1987. pp. 20–22, 24.

[14] Memory Address Decoding (Technical report). University of New Mexico. 1996.

[15] Bell, Gordon. "A Retrospective on What We Have Learned From the PDP-11".

[16] MCU, Memory Control Unit, FREDDIE (Technical report). Atari. 1983.

[17] "IBM Archives: System/360 Dates and characteristics". 03.ibm.com. 23 January 2003. Archived from the original on January 16, 2005. Retrieved 2017-05-03.

[18] "IBM System/360 Model 67 Functional Characteristics, Third Edition" (PDF). February 1972. GA27-2719-2. Retrieved October 29, 2021.

[19] ISBN 0-932376-86-X. EY-3459E-DP.

[20] "Cortex-A8 Technical Reference Manual" (PDF). Infocenter.arm.com. Retrieved 2017-05-03.

[21] ISBN 1-55558-145-5. EY-TI32E-DP.

[22] ISBN 1-55558-145-5. EY-TI32E-DP.

[23] ISBN 1-55558-145-5. EY-TI32E-DP.

[24] Sun 68000 Board User's Manual, Sun Microsystems, Inc, February 1983, Revision B

[25] "Sun-2 Architecture Manual" (PDF). Sun Microsystems. 15 December 1983. pp. 9–12.

[26] "AMD64 Architecture Programmer's Manual Volume 2: System Programming" (PDF). March 2017. Retrieved 2017-12-05.

[27] S2CID 99779.

[28] "The Descriptor" (PDF). Bitsavers.

[29] "Unisys Descriptor". Unisys.

[30] "Pbits". Unisys.

[31] "Pbit interrupt". Unisys.

[32] Byte Magazine. Archived from the original on 2007-09-27.

