x86

x86 (also known as 80x86

80486

processors. Colloquially, their names were "186", "286", "386" and "486".

The term is not synonymous with

IBM PC

(1981) debut.

As of June 2022, most

AMD Epyc CPUs based on the x86 ISA; it broke the 1 exaFLOPS barrier in May 2022.^[7]

Overview

In the 1980s and early 1990s, when the

80386

in 1985.

A few years after the introduction of the 8086 and 8088, Intel added some complexity to its naming scheme and terminology as the "iAPX" of the ambitious but ill-fated

8089, and simpler Intel-specific system chips,^[d] was thereby described as an iAPX 86 system.^[8]^[e] There were also terms iRMX (for operating systems), iSBC (for single-board computers), and iSBX (for multimodule boards based on the 8086-architecture), all together under the heading Microsystem 80.^[9]^[10] However, this naming scheme was quite temporary, lasting for a few years during the early 1980s.^[f]

Although the 8086 was primarily developed for

embedded systems and small multi-user or single-user computers, largely as a response to the successful 8080-compatible Zilog Z80,^[11] the x86 line soon grew in features and processing power. Today, x86 is ubiquitous in both stationary and portable personal computers, and is also used in midrange computers, workstations, servers, and most new supercomputer clusters of the TOP500 list. A large amount of software, including a large list of x86 operating systems, runs on x86-based hardware.

Modern x86 is relatively uncommon in

Athlon Neo and Intel Atom are examples of 32- and 64-bit designs used in some relatively low-power and low-cost segments.

There have been several attempts, including by Intel, to end the market dominance of the "inelegant" x86 architecture designed directly from the first simple 8-bit microprocessors. Examples of this are the

semiconductor manufacturing would make it hard to replace x86 in many segments. AMD's 64-bit extension of x86 (which Intel eventually responded to with a compatible design)^[13] and the scalability of x86 chips in the form of modern multi-core CPUs underline x86 as an example of how continuous refinement of established industry standards can resist competition from completely new architectures.^[14]

Chronology

The table below lists processor models and model series implementing various architectures in the x86 family, in chronological order. Each line item is characterized by significantly improved or commercially successful processor microarchitecture designs.

Chronology of x86 processors
Era		Introduction	Prominent CPU models	Address space			Notable features
Era		Introduction	Prominent CPU models	Linear	Virtual	Physical	Notable features
x86-16	1st	1978	Intel 8086, Intel 8088 (1979)	16-bit	NA	20-bit	IBM PC/XT (8088)
	1st	1982	Intel 80188 NEC V20 /V30 (1983)		NA	20-bit	8086-2 ISA, embedded (80186/80188)
	2nd	1982	Intel 80286 and clones		30-bit	24-bit	IBM PC/AT
IA-32	3rd	1985	AMD Am386 (1991)	32-bit	46-bit	32-bit	IBM PS/2
	4th (pipelining, cache)	1989	Am5x86 (1995)				pipelining, on-die x87 FPU (486DX), on-die cache
	5th ( Superscalar )	1993	Intel Pentium MMX (1996)				Superscalar, 64-bit databus, faster FPU, MMX (Pentium MMX), APIC, SMP
		1994	NexGen Nx586 AMD 5k86/K5 (1996)				Discrete microarchitecture (µ-op translation)
		1995	MII (1998)				dynamic execution
	6th (PAE, µ-op translation)	1995	Intel Pentium Pro			36-bit (PAE)	µ-op translation, conditional move instructions, L2 cache
		1997	Intel Pentium II, Pentium III (1999) Celeron (1998), Xeon (1998)			36-bit (PAE)	on-package (Pentium II) or on-die (Celeron) L2 Cache, SSE (Pentium III), Slot 1, Socket 370 or Slot 2 (Xeon)
		1997	AMD K6/K6-2 (1998)/K6-III (1999)			32-bit	3DNow!, 3-level cache system (K6-III)
	Enhanced Platform	1999	AMD MP (2001) Duron (2000) Sempron (2004)			36-bit	MMX+, 3DNow!+, double-pumped bus, Slot A or Socket A
		2000	Transmeta Crusoe			32-bit	CMS powered x86 platform processor, VLIW -128 core, on-die memory controller, on-die PCI bridge logic
		2000	Intel Pentium 4			36-bit	HTT (Northwood), NetBurst, quad-pumped bus, Trace Cache, Socket 478
		2003	Intel Pentium M Intel Core (2006) Pentium Dual-Core (2007)				XD bit (Dothan) (Intel Core "Yonah")
		2003	Transmeta Efficeon				HT
IA-64	64-bit Transition 1999–2005	2001	Intel Itanium (2001–2017)	52-bit			64-bit EPIC architecture, 128-bit VLIW instruction bundle, on-die hardware IA-32 H/W enabling x86 OSes & x86 applications (early generations), software IA-32 EL enabling x86 applications (Itanium 2), Itanium register files are remapped to x86 registers
x86-64	64-bit Extended since 2001	x86-64 is the 64-bit extended architecture of x86; its Legacy Mode preserves the entire x86 architecture unaltered. The native architecture of x86-64 processors, residing in 64-bit Mode, lacks segmented access and presents the linear address space that the 64-bit architecture permits; an adapted IA-32 architecture residing in Compatibility Mode alongside 64-bit Mode is provided to support most x86 applications
		2003	X2 (2006)	40-bit			AMD-V (Athlon 64 Orleans), Socket 754/939/940 or AM2
		2004	Celeron D, Pentium D (2005)	36-bit			Intel VT(Pentium 4 6x2), socket LGA 775
		2006	Celeron Dual-Core (2008)	36-bit			Intel 64 (<<== EM64T), SSSE3(65 nm), wide dynamic execution, µ-op fusion, macro-op fusion in 16-bit and 32-bit mode,^[15]^[16] on-chip quad-core(Core 2 Quad), Smart Shared L2 Cache (Intel Core 2 "Merom")
		2007	Athlon II (2009) Turion II (2009)	48-bit			Monolithic quad-core (X4)/triple-core (X3), AM3
		2008	Intel Core 2 (45 nm)	40-bit			SSE4.1
			Intel Atom				netbook or low power smart device processor, P54C core reused
			Intel Core i3 (2010)				QuickPath, on-chip GMCH (Clarkdale), SSE4.2, Extended Page Tables (EPT) for virtualization, macro-op fusion in 64-bit mode,^[15]^[16] (Intel Xeon "Bloomfield" with Nehalem microarchitecture)
			VIA Nano				hardware-based encryption; adaptive power management
		2010	AMD FX	48-bit			octa-core, CMT(Clustered Multi-Thread), FMA, OpenCL, AM3+
		2011	AMD APU A and E Series ( Llano )	40-bit			on-die GPGPU, PCI Express 2.0, Socket FM1
			AMD APU C, E and Z Series ( Bobcat )	36-bit			low power smart device APU
			Sandy Bridge/Ivy Bridge )	36-bit			Internal Ring connection, decoded µ-op cache, LGA 1155 socket
		2012	AMD APU A Series ( Bulldozer, Trinity and later)	48-bit			AVX, Bulldozer based APU, Socket FM2 or Socket FM2+
		2012	Intel Xeon Phi (Knights Corner)				PCI-E add-on card coprocessor for XEON based system, Manycore Chip, In-order P54C , very wide VPU (512-bit SSE), LRBni instructions (8× 64-bit)
		2013	AMD Jaguar (Athlon, Sempron)				SoC, game console and low power smart device processor
			Intel Silvermont (Atom, Celeron, Pentium)	36-bit			SoC, low/ultra-low power smart device processor
			Core i7 (Haswell/Broadwell )	39-bit			BMI1, and BMI2 instructions, LGA 1150 socket
		2015	Intel Core M, Pentium, Celeron )	39-bit			SoC, on-chip Broadwell-U PCH-LP (Multi-chip module)
		2015–2020	Intel Core i9 )	46-bit			AVX-512 (restricted to Cannon Lake-U and workstation/server variants of Skylake)
		2016	Intel Xeon Phi (Knights Landing)	48-bit			Manycore CPU and coprocessor for Xeon systems, Airmont (Atom) based core
		2016	AMD Bristol Ridge (AMD (Pro) A6/A8/A10/A12)				Integrated FCH on die, SoC, AM4 socket
		2017	AMD Ryzen Series/AMD Epyc Series				AMD's implementation of SMT, on-chip multiple dies
		2017	Zhaoxin WuDaoKou (KX-5000, KH-20000)				Zhaoxin's first brand new x86-64 architecture
		2018–2021	Intel Sunny Cove (Ice Lake-U and Y), Cypress Cove (Rocket Lake)	57-bit			Intel's first implementation of AVX-512 for the consumer segment. Addition of Vector Neural Network Instructions (VNNI)
		2020	Intel Willow Cove (Tiger Lake-Y/U/H)				Dual ring interconnect architecture, updated Gaussian Neural Accelerator (GNA2), new AVX-512 Vector Intersection Instructions, addition of Control-Flow Enforcement Technology (CET)
		2021	Intel Alder Lake				Hybrid design with performance (Golden Cove) and efficiency cores (Gracemont), support for PCIe Gen5 and DDR5, updated Gaussian Neural Accelerator (GNA3)

History

Designers and manufacturers

At various times, companies such as

ITT Corporation, National Semiconductor, ULSI System Technology, and Weitek

.

Such x86 implementations were seldom simple copies but often employed different internal microarchitectures and different solutions at the electronic and physical levels. Quite naturally, early compatible microprocessors were 16-bit, while 32-bit designs were developed much later. For the personal computer market, real quantities started to appear around 1990 with i386 and i486 compatible processors, often named similarly to Intel's original chips.

After the fully

6x86MX (MII) lines of Cyrix designs, which were the first x86 microprocessors implementing register renaming to enable speculative execution.

AMD meanwhile designed and manufactured the advanced but delayed

Nx586, it used a strategy such that dedicated pipeline stages decode x86 instructions into uniform and easily handled micro-operations, a method that has remained the basis for most x86 designs to this day.

Some early versions of these microprocessors had heat dissipation problems. The 6x86 was also affected by a few minor compatibility problems, the

Nx586 lacked a floating-point unit (FPU) and (the then crucial) pin-compatibility, while the K5 had somewhat disappointing performance when it was (eventually) introduced.

Customer ignorance of alternatives to the Pentium series further contributed to these designs being comparatively unsuccessful, despite the fact that the

6x86 was significantly faster than the Pentium on integer code.^[j] AMD later managed to grow into a serious contender with the K6 set of processors, which gave way to the very successful Athlon and Opteron.

There were also other contenders, such as

P5 Pentium

.

Many extensions have been added to the original x86 instruction set over the years, almost always with full backward compatibility.^[k] The architecture family has been implemented in processors from Intel, Cyrix, AMD, VIA Technologies and many other companies; there are also open implementations, such as the Zet SoC platform (currently inactive).^[17] Nevertheless, of those, only Intel, AMD, VIA Technologies, and DM&P Electronics hold x86 architectural licenses, and of these, only the first two actively produce modern 64-bit designs, leading to what has been called a "duopoly" of Intel and AMD in x86 processors.

However, in 2014 the Shanghai-based Chinese company Zhaoxin, a joint venture between a Chinese company and VIA Technologies, began designing VIA-based x86 processors for desktops and laptops. The release of its newest "7" family^[18] of x86 processors (e.g. KX-7000), which are not quite as fast as AMD or Intel chips but are still state of the art,^[19] had been planned for 2021; as of March 2022, however, the release had not taken place.^[20]

From 16-bit and 32-bit to 64-bit architecture

The instruction set architecture has twice been extended to a larger word size. In 1985, Intel released the 32-bit 80386 (later known as i386) which gradually replaced the earlier 16-bit chips in computers (although typically not in embedded systems) during the following years; this extended programming model was originally referred to as the i386 architecture (like its first implementation) but Intel later dubbed it IA-32 when introducing its (unrelated) IA-64 architecture.

In 1999–2003,

BSDs also use the "amd64" term. Microsoft Windows, for example, designates its 32-bit versions as "x86" and 64-bit versions as "x64", while installation files of 64-bit Windows versions are required to be placed into a directory called "AMD64".^[21]

In 2023, Intel proposed a major change to the architecture referred to as

x86-S (with S standing for "simplification"), which aims to remove support for legacy execution modes and instructions. A processor implementing this proposal would start execution directly in long mode and would only support 64-bit operating systems. 32-bit code would only be supported for user applications running in ring 3, and would use the same simplified segmentation as long mode.^[22]^[23]

Basic properties of the architecture

The x86 architecture is a variable instruction length, primarily "

integer arithmetic and memory addresses (or offsets) is 16, 32 or 64 bits depending on architecture generation (newer processors include direct support for smaller integers as well). Multiple scalar values can be handled simultaneously via the SIMD unit present in later generations, as described below.^[l]

Immediate addressing offsets and immediate data may be expressed as 8-bit quantities for the frequently occurring cases or contexts where a −128..127 range is enough. Typical instructions are therefore 2 or 3 bytes in length (although some are much longer, and some are single-byte).

To further conserve encoding space, most registers are expressed in opcodes using three or four bits, the latter via an opcode prefix in 64-bit mode, while at most one operand to an instruction can be a memory location.^[m] However, this memory operand may also be the destination (or a combined source and destination), while the other operand, the source, can be either register or immediate. Among other factors, this contributes to a code size that rivals eight-bit machines and enables efficient use of instruction cache memory. The relatively small number of general registers (also inherited from its 8-bit ancestors) has made register-relative addressing (using small immediate offsets) an important method of accessing operands, especially on the stack. Much work has therefore been invested in making such accesses as fast as register accesses—i.e., a one cycle instruction throughput, in most circumstances where the accessed data is available in the top-level cache.

Floating point and SIMD

A dedicated

backward compatible version of this functionality on the same microprocessor as the main processor. In addition to this, modern x86 designs also contain a SIMD unit (see SSE below) where instructions can work in parallel on (one or two) 128-bit words, each containing two or four floating-point numbers (each 64 or 32 bits wide respectively), or alternatively, 2, 4, 8 or 16 integers (each 64, 32, 16 or 8 bits wide respectively).

The presence of wide SIMD registers means that existing x86 processors can load or store up to 128 bits of memory data in a single instruction and also perform bitwise operations (although not integer arithmetic^[n]) on full 128-bit quantities in parallel. Intel's Sandy Bridge processors added the Advanced Vector Extensions (AVX) instructions, widening the SIMD registers to 256 bits. The Intel Initial Many Core Instructions implemented by the Knights Corner Xeon Phi processors, and the AVX-512 instructions implemented by the Knights Landing Xeon Phi processors and by Skylake-X processors, use 512-bit wide SIMD registers.

Current implementations

During

branch prediction, register renaming, and memory dependence prediction), which means they may execute multiple (partial or complete) x86 instructions simultaneously, and not necessarily in the same order as given in the instruction stream.^[24]

Some Intel CPUs (

threads per core (Xeon Phi has four threads per core). Some Intel CPUs support transactional memory (TSX).

When introduced, in the mid-1990s, this method was sometimes referred to as a "RISC core" or as "RISC translation", partly for marketing reasons, but also because these micro-operations share some properties with certain types of RISC instructions. However, traditional microcode (used since the 1950s) also inherently shares many of the same properties; the new method differs mainly in that the translation to micro-operations now occurs asynchronously. Not having to synchronize the execution units with the decode steps opens up possibilities for more analysis of the (buffered) code stream, and therefore permits detection of operations that can be performed in parallel, simultaneously feeding more than one execution unit.

The latest processors also do the opposite when appropriate; they combine certain x86 sequences (such as a compare followed by a conditional jump) into a more complex micro-op which fits the execution model better and thus can be executed faster or with fewer machine resources involved.

Another way to try to improve performance is to cache the decoded micro-operations, so the processor can directly access the decoded micro-operations from a special cache, instead of decoding them again. Intel followed this approach with the Execution Trace Cache feature in their NetBurst microarchitecture (for Pentium 4 processors) and later in the Decoded Stream Buffer (for Core-branded processors since Sandy Bridge).^[25]

VLIW

instruction set. Transmeta argued that their approach allows for more power efficient designs since the CPU can forgo the complicated decode step of more traditional x86 implementations.

Addressing modes

Addressing modes for 16-bit processor modes can be summarized by the formula:^[26]^[27]

{\begin{matrix}{\mathtt {CS}}:\\{\mathtt {DS}}:\\{\mathtt {SS}}:\\{\mathtt {ES}}:\end{matrix}}\ \ {\begin{pmatrix}\\{\begin{bmatrix}{\mathtt {BX}}\\{\mathtt {BP}}\end{bmatrix}}+{\begin{bmatrix}{\mathtt {SI}}\\{\mathtt {DI}}\end{bmatrix}}\\\\\end{pmatrix}}+{\rm {displacement}}

Addressing modes for 32-bit x86 processor modes^[28] can be summarized by the formula:^[29]

{\begin{matrix}{\mathtt {CS}}:\\{\mathtt {DS}}:\\{\mathtt {SS}}:\\{\mathtt {ES}}:\\{\mathtt {FS}}:\\{\mathtt {GS}}:\end{matrix}}\ \ {\begin{bmatrix}{\mathtt {EAX}}\\{\mathtt {EBX}}\\{\mathtt {ECX}}\\{\mathtt {EDX}}\\{\mathtt {ESP}}\\{\mathtt {EBP}}\\{\mathtt {ESI}}\\{\mathtt {EDI}}\end{bmatrix}}+{\begin{pmatrix}\\{\begin{bmatrix}{\mathtt {EAX}}\\{\mathtt {EBX}}\\{\mathtt {ECX}}\\{\mathtt {EDX}}\\{\mathtt {EBP}}\\{\mathtt {ESI}}\\{\mathtt {EDI}}\end{bmatrix}}*{\begin{bmatrix}1\\2\\4\\8\end{bmatrix}}\\\\\end{pmatrix}}+{\rm {displacement}}
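As a rough illustration of the 32-bit formula above, the following Python sketch computes an effective address from an optional base register, an optional scaled index register, and a displacement. The function name `effective_address` and the dictionary of register values are hypothetical and exist only for this example; segment bases are ignored, and the result is simply wrapped to 32 bits.

```python
VALID_SCALES = (1, 2, 4, 8)

def effective_address(regs, base=None, index=None, scale=1, disp=0):
    """Return base + index*scale + displacement, truncated to 32 bits.

    `regs` is a hypothetical mapping of register names to values.
    """
    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4 or 8")
    if index == "ESP":
        # The formula lists ESP as a possible base but not as an index.
        raise ValueError("ESP cannot be used as an index register")
    addr = disp
    if base is not None:
        addr += regs[base]
    if index is not None:
        addr += regs[index] * scale
    return addr & 0xFFFFFFFF

# Illustrative values only: [EBX + ESI*4 + 8]
regs = {"EBX": 0x1000, "ESI": 3, "ESP": 0x7FFF0000}
print(hex(effective_address(regs, base="EBX", index="ESI", scale=4, disp=8)))
```

With the illustrative values shown, the base contributes 0x1000, the scaled index 12, and the displacement 8.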

Addressing modes for the 64-bit processor mode can be summarized by the formula:^[29]

{\begin{Bmatrix}\\{\begin{matrix}{\mathtt {FS}}:\\{\mathtt {GS}}:\end{matrix}}\ \ {\begin{bmatrix}\vdots \\{\mathtt {GPR}}\\\vdots \end{bmatrix}}+{\begin{pmatrix}\\{\begin{bmatrix}\vdots \\{\mathtt {GPR}}\\\vdots \\\end{bmatrix}}*{\begin{bmatrix}1\\2\\4\\8\end{bmatrix}}\\\\\end{pmatrix}}\\\\\hline \\{\begin{matrix}{\mathtt {RIP}}\end{matrix}}\\\\\end{Bmatrix}}+{\rm {displacement}}

Instruction relative addressing in 64-bit code (RIP + displacement, where RIP is the

shared libraries in some operating systems).^[30]

The 8086 had 64 KB of eight-bit (or alternatively 32 K-word of 16-bit)

stack in memory supported by computer hardware. Only words (two bytes) can be pushed to the stack. The stack grows toward numerically lower addresses, with SS:SP pointing to the most recently pushed item. There are 256 interrupts, which can be invoked by both hardware and software. The interrupts can cascade, using the stack to store the return address.
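The push behaviour described above can be sketched in a few lines of Python. This is a simplified model, not a hardware description: the helper `push_word` and the `stack_segment` buffer are hypothetical names, the buffer stands for the single 64 KB segment selected by SS, and the byte order assumes the little-endian layout that x86 uses.

```python
def push_word(stack_segment, sp, value):
    """Push a 16-bit word onto the stack segment and return the new SP."""
    sp = (sp - 2) & 0xFFFF                     # the stack grows toward lower offsets
    stack_segment[sp] = value & 0xFF           # low byte first (little-endian)
    stack_segment[(sp + 1) & 0xFFFF] = (value >> 8) & 0xFF  # then the high byte
    return sp

# Illustrative use: one 64 KB segment and an arbitrary starting SP.
stack_segment = bytearray(0x10000)
sp = 0x0100
sp = push_word(stack_segment, sp, 0x1234)
sp = push_word(stack_segment, sp, 0xABCD)
print(hex(sp))  # SP now points at the most recently pushed word
```

After each call, SP has dropped by two and points at the word that was just written.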

x86 registers

16-bit

The original

address registers, and may also be used for array indexing.

One of four possible 'segment registers' (CS, DS, SS and ES) is used to form a memory address. In the original 8086 / 8088 / 80186 / 80188 every address was built from a segment register and one of the general purpose registers. For example, ds:si is the notation for an address formed as [16 * ds + si], which allows 20-bit addressing rather than 16-bit addressing, although this changed in later processors. At that time only certain combinations were supported.
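The segment:offset arithmetic can be illustrated with a minimal Python sketch. The function name `physical_address` is hypothetical; the masking to 20 bits models the address wrap-around of the original 8086, which is an assumption added here rather than something stated in the text above.

```python
def physical_address(segment, offset):
    """Form a 20-bit real-mode address as 16 * segment + offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # wrap to 20 address bits

# Illustrative values: ds = 0x1234, si = 0x0010
print(hex(physical_address(0x1234, 0x0010)))

# Different segment:offset pairs can name the same physical location.
print(physical_address(0x1234, 0x0010) == physical_address(0x1235, 0x0000))
```

Because the segment is shifted by only four bits, consecutive segment values overlap, which is why many segment:offset pairs alias the same byte.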

The

flags such as carry flag, overflow flag and zero flag. Finally, the instruction pointer (IP) points to the next instruction that will be fetched from memory and then executed; this register cannot be directly accessed (read or written) by a program.^[31]

The 80186 and 80188 are essentially upgraded 8086 and 8088 CPUs, respectively, with on-chip peripherals added, and they have the same CPU registers as the 8086 and 8088 (in addition to interface registers for the peripherals).

The 8086, 8088, 80186, and 80188 can use an optional floating-point coprocessor, the 8087. The 8087 appears to the programmer as part of the CPU and adds eight 80-bit wide registers, st(0) to st(7), each of which can hold numeric data in one of seven formats: 32-, 64-, or 80-bit floating point, 16-, 32-, or 64-bit (binary) integer, and 80-bit packed decimal integer.^[10]^{: S-6, S-13..S-15} It also has its own 16-bit status register accessible through the fstsw instruction, and it is common to simply use some of its bits for branching by copying it into the normal FLAGS.^[32]

In the

80287 is the floating-point coprocessor for the 80286 and has the same registers as the 8087 with the same data formats.

32-bit

With the advent of the 32-bit 80386 processor, the 16-bit general-purpose registers, base registers, index registers, instruction pointer, and FLAGS register, but not the segment registers, were expanded to 32 bits. The nomenclature represented this by prefixing an "E" (for "extended") to the register names in x86 assembly language. Thus, the AX register corresponds to the lower 16 bits of the new 32-bit EAX register, SI corresponds to the lower 16 bits of ESI, and so on. The general-purpose registers, base registers, and index registers can all be used as the base in addressing modes, and all of those registers except for the stack pointer can be used as the index in addressing modes.
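The relationship between an extended register and its narrower names can be sketched with plain bit masking. The helper `subregisters` below is hypothetical; it treats EAX as a 32-bit integer and derives AX, AH and AL from it. The AH/AL split into the high and low bytes of AX is the usual convention and is an addition to what the paragraph above states.

```python
def subregisters(eax):
    """Return the narrower views that alias the low bits of EAX."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF              # AX is the lower 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,    # high byte of AX
        "AL": ax & 0xFF,           # low byte of AX
    }

# Illustrative value only.
for name, value in subregisters(0x12345678).items():
    print(name, hex(value))
```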

Two new segment registers (FS and GS) were added. With a greater number of registers, instructions and operands, the machine code format was expanded. To provide backward compatibility, segments with executable code can be marked as containing either 16-bit or 32-bit instructions. Special prefixes allow inclusion of 32-bit instructions in a 16-bit segment or vice versa.

The 80386 had an optional floating-point coprocessor, the

80486 and all subsequent x86 models, the floating-point processing unit (FPU) is integrated on-chip.

The

Pentium MMX added eight 64-bit MMX integer vector registers (MM0 to MM7, which share lower bits with the 80-bit-wide FPU stack).^[35] With the Pentium III, Intel added a 32-bit Streaming SIMD Extensions (SSE) control/status register (MXCSR) and eight 128-bit SSE floating-point registers (XMM0 to XMM7).^[36]

64-bit

Starting with the

instruction pointer), to ease the implementation of position-independent code, used in shared libraries in some operating systems.

128-bit

SIMD registers XMM0–XMM15 (XMM0–XMM31 when AVX-512 is supported).

256-bit

SIMD registers YMM0–YMM15 (YMM0–YMM31 when AVX-512 is supported). Lower half of each of the YMM registers maps onto the corresponding XMM register.

512-bit

SIMD registers ZMM0–ZMM31. Lower half of each of the ZMM registers maps onto the corresponding YMM register.

Miscellaneous/special purpose

x86 processors that have a protected mode, i.e. the 80286 and later processors, also have three descriptor registers (GDTR, LDTR, IDTR) and a task register (TR).

32-bit x86 processors (starting with the 80386) also include various special/miscellaneous registers such as

debug registers (DR0 through 3, plus 6 and 7), test registers (TR3 through 7; 80486 only), and model-specific registers (MSRs, appearing with the Pentium^[o]).

AVX-512 has eight extra 64-bit mask registers K0–K7 for selecting elements in a vector register. Depending on the vector register and element widths, only a subset of bits of the mask register may be used by a given instruction.
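How many mask bits an instruction consumes follows from one bit per vector element. The Python sketch below, with the hypothetical helper `mask_bits_used`, works this out for a given vector width and element width; it is an arithmetic illustration only and does not model any particular instruction.

```python
def mask_bits_used(vector_bits, element_bits):
    """Return how many low bits of a K register select elements."""
    if vector_bits not in (128, 256, 512):
        raise ValueError("vector width must be 128, 256 or 512 bits")
    if element_bits not in (8, 16, 32, 64):
        raise ValueError("element width must be 8, 16, 32 or 64 bits")
    return vector_bits // element_bits   # one mask bit per element

# Illustrative combinations of register width and element width.
for vec, elem in [(512, 8), (512, 32), (256, 32), (128, 64)]:
    print(vec, elem, mask_bits_used(vec, elem))
```

Only a 512-bit vector of 8-bit elements uses all 64 mask bits; every other combination uses a shorter prefix of the mask register.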

Purpose

Although the main registers (with the exception of the instruction pointer) are "general-purpose" in the 32-bit and 64-bit versions of the instruction set and can be used for anything, it was originally envisioned that they be used for the following purposes:

AL/AH/AX/EAX/RAX: Accumulator
CL/CH/CX/ECX/RCX: Counter (for use with loops and strings)
DL/DH/DX/EDX/RDX: Extend the precision of the accumulator (e.g. combine 32-bit EAX and EDX for 64-bit integer operations in 32-bit code)
BL/BH/BX/EBX/RBX: Base index (for use with arrays)
SP/ESP/RSP: Stack pointer for top address of the stack.
BP/EBP/RBP: Stack base pointer for holding the address of the current stack frame.
SI/ESI/RSI: Source index for string operations.
DI/EDI/RDI: Destination index for string operations.
IP/EIP/RIP: Instruction pointer. Holds the program counter, the address of next instruction.

Segment registers:

CS: Code
DS: Data
SS: Stack
ES: Extra data
FS: Extra data #2
GS: Extra data #3

No particular purposes were envisioned for the other 8 registers available only in 64-bit mode.

Some instructions compile and execute more efficiently when using these registers for their designed purpose. For example, using AL as an accumulator and adding an immediate byte value to it produces the efficient add to AL opcode of 04h, whilst using the BL register produces the generic and longer add to register opcode of 80C3h. Another example is double precision division and multiplication that works specifically with the AX and DX registers.
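The size difference between the two add encodings can be made concrete with a small encoder. The sketch below is a simplified illustration, not an assembler: the function `encode_add_reg8_imm8` and the `REG8` table are hypothetical names, and only the 8-bit register-plus-immediate form of ADD is handled.

```python
# 8-bit register numbers as used in the ModRM byte (no prefix bytes assumed).
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}

def encode_add_reg8_imm8(reg, imm):
    """Encode ADD reg8, imm8, using the short accumulator form for AL."""
    imm &= 0xFF
    if reg == "AL":
        return bytes([0x04, imm])            # short form: opcode 04h + immediate
    # Generic form: opcode 80h, then a ModRM byte with mod=11 (register
    # operand), an opcode-extension field of 0 (ADD), and the register number.
    modrm = 0xC0 | (0 << 3) | REG8[reg]
    return bytes([0x80, modrm, imm])

print(encode_add_reg8_imm8("AL", 5).hex())   # two bytes
print(encode_add_reg8_imm8("BL", 5).hex())   # three bytes, starting 80 c3
```

For BL the ModRM byte works out to C3h, which gives the 80C3h prefix mentioned above and makes the instruction one byte longer than the AL form.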

Modern compilers benefited from the introduction of the SIB byte (scale-index-base byte) that allows registers to be treated uniformly (minicomputer-like). However, using the SIB byte universally is non-optimal, as it produces longer encodings than only using it selectively when necessary. (The main benefit of the SIB byte is the orthogonality and more powerful addressing modes it provides, which make it possible to save instructions and the use of registers for address calculations such as scaling an index.) Some special instructions lost priority in the hardware design and became slower than equivalent small code sequences. A notable example is the LODSW instruction.

Structure

General Purpose Registers (A, B, C and D)
64	32	16	8
R?X
	E?X
		?X
		?H	?L

64-bit mode-only General Purpose Registers (R8, R9, R10, R11, R12, R13, R14, R15)
64	32	16	8
?
	?D
		?W
			?B

Segment Registers (C, D, S, E, F and G)
16	8
?S

Pointer Registers (S and B)
64	32	16	8
R?P
	E?P
		?P
			?PL

Note: The ?PL registers are only available in 64-bit mode.

Index Registers (S and D)
64	32	16	8
R?I
	E?I
		?I
			?IL

Note: The ?IL registers are only available in 64-bit mode.

Instruction Pointer Register (I)
64	32	16
RIP
	EIP
		IP

Operating modes

Real mode

Real Address mode,

MiB of memory can be addressed^[p]), direct software access to peripheral hardware, and no concept of memory protection or multitasking at the hardware level. All x86 CPUs in the 80286 series and later start up in real mode at power-on; 80186 CPUs and earlier had only one operational mode, which is equivalent to real mode in later chips. (On the IBM PC platform, direct software access to the IBM BIOS routines is available only in real mode, since BIOS is written for real mode. However, this is not a property of the x86 CPU but of the IBM BIOS design.)

In order to use more than 64 KB of memory, the segment registers must be used. This created great complications for compiler implementors who introduced odd pointer modes such as "near", "far" and "huge" to leverage the implicit nature of segmented architecture to different degrees, with some pointers containing 16-bit offsets within implied segments and other pointers containing segment addresses and offsets within segments. It is technically possible to use up to 256 KB of memory for code and data, with up to 64 KB for code, by setting all four segment registers once and then only using 16-bit offsets (optionally with default-segment override prefixes) to address memory, but this puts substantial restrictions on the way data can be addressed and memory operands can be combined, and it violates the architectural intent of the Intel designers, which is for separate data items (e.g. arrays, structures, code units) to be contained in separate segments and addressed by their own segment addresses, in new programs that are not ported from earlier 8-bit processors with 16-bit address spaces.

Unreal mode

Unreal mode is used by some 16-bit

boot loaders.

System Management Mode

The System Management Mode (SMM) is used only by the system firmware (BIOS/UEFI), not by operating systems and applications software. SMM code runs in SMRAM.

Protected mode

In addition to real mode, the Intel 80286 supports protected mode, expanding addressable

ring levels used for hardware-based computer security. Each segment descriptor also contains a segment limit field which specifies the maximum offset that may be used with the segment. Because offsets are 16 bits, segments are still limited to 64 KB each in 80286 protected mode.^[38]

Each time a segment register is loaded in protected mode, the 80286 must read a 6-byte segment descriptor from memory into a set of hidden internal registers. Thus, loading segment registers is much slower in protected mode than in real mode, and changing segments very frequently is to be avoided. Actual memory operations using protected mode segments are not slowed much because the 80286 and later have hardware to check the offset against the segment limit in parallel with instruction execution.

The

paging, a mechanism making it possible to use paged virtual memory (with 4 KB page size). Paging allows the CPU to map any page of the virtual memory space to any page of the physical memory space. To do this, it uses additional mapping tables in memory called page tables. Protected mode on the 80386 can operate with paging either enabled or disabled; the segmentation mechanism is always active and generates virtual addresses that are then mapped by the paging mechanism if it is enabled. The segmentation mechanism can also be effectively disabled by setting all segments to have a base address of 0 and size limit equal to the whole address space; this also requires a minimally-sized segment descriptor table of only four descriptors (since the FS and GS segments need not be used).^[q]
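To make the page-table lookup more concrete, the following Python sketch splits a 32-bit linear address into the indices used to walk the tables. The 10/10/12-bit split (page directory index, page table index, offset within a 4 KB page) reflects the classic two-level 80386 layout and is an assumption added here, since the text above only mentions page tables and the 4 KB page size. The function name `split_linear_address` is hypothetical.

```python
def split_linear_address(addr):
    """Split a 32-bit linear address for two-level paging with 4 KB pages."""
    addr &= 0xFFFFFFFF
    directory_index = (addr >> 22) & 0x3FF   # top 10 bits select a page table
    table_index = (addr >> 12) & 0x3FF       # next 10 bits select a page
    offset = addr & 0xFFF                    # low 12 bits: byte within the page
    return directory_index, table_index, offset

# Illustrative address only.
print(split_linear_address(0x00403025))
```

The offset field is 12 bits wide because a 4 KB page holds 2^12 bytes; the paging hardware translates only the upper 20 bits.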

Paging is used extensively by modern multitasking operating systems. Linux, 386BSD and Windows NT were developed for the 386 because it was the first Intel architecture CPU to support paging and 32-bit segment offsets. The 386 architecture became the basis of all further development in the x86 series.

x86 processors that support protected mode boot into real mode for backward compatibility with the older 8086 class of processors. Upon power-on (a.k.a. booting), the processor initializes in real mode, and then begins executing instructions. Operating system boot code, which might be stored in read-only memory, may place the processor into the protected mode to enable paging and other features. Conversely, segment arithmetic, a common practice in real mode code, is not allowed in protected mode.

Virtual 8086 mode

There is also a sub-mode of operation in 32-bit protected mode (a.k.a. 80386 protected mode) called virtual 8086 mode, also known as V86 mode. This is basically a special hybrid operating mode that allows real mode programs and operating systems to run while under the control of a protected mode supervisor operating system. This allows for a great deal of flexibility in running both protected mode programs and real mode programs simultaneously. This mode is exclusively available for the 32-bit version of protected mode; it does not exist in the 16-bit version of protected mode, or in long mode.

Long mode

In the mid-1990s, it was obvious that the 32-bit address space of the x86 architecture was limiting its performance in applications requiring large data sets. A 32-bit address space would allow the processor to directly address only 4 GB of data, a size surpassed by applications such as

EiB

of data, although most 64-bit architectures do not support access to the full 64-bit address space; for example, AMD64 supports only 48 bits from a 64-bit address, split into four paging levels.
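The 48-bit restriction and the four paging levels can be sketched as follows. Two details are assumptions not spelled out in the text above: that each of the four levels is indexed by 9 bits with a 12-bit page offset (4 KB pages), and that the unused upper 16 bits must be copies of bit 47 (the so-called canonical form). The function names `is_canonical` and `split_48bit` are hypothetical.

```python
def is_canonical(addr, bits=48):
    """Check that all bits above bit (bits - 1) are copies of that bit."""
    addr &= (1 << 64) - 1
    upper = addr >> (bits - 1)                 # bit 47 and everything above it
    all_ones = (1 << (64 - bits + 1)) - 1
    return upper == 0 or upper == all_ones

def split_48bit(addr):
    """Return the four 9-bit table indices and the 12-bit page offset."""
    indices = tuple((addr >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return indices + (addr & 0xFFF,)

# Illustrative addresses only.
print(is_canonical(0x00007FFFFFFFFFFF))   # top of the lower half
print(is_canonical(0x0000800000000000))   # bit 47 set, upper bits clear
print(split_48bit((1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5))
```

Four indices of 9 bits plus a 12-bit offset account for exactly 48 address bits.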

In 1999, AMD published a (nearly) complete specification for a 64-bit extension of the x86 architecture, which it called x86-64, and stated its intention to produce processors implementing it. That design is currently used in almost all x86 processors, with some exceptions intended for embedded systems.

Mass-produced x86-64 chips for the general market became available four years later, in 2003, after the intervening time had been spent testing and refining working prototypes; at about the same time, the initial name x86-64 was changed to AMD64. The success of the AMD64 line of processors, coupled with the lukewarm reception of the IA-64 architecture, forced Intel to release its own implementation of the AMD64 instruction set. Intel had previously implemented support for AMD64^[39] but opted not to enable it, in hopes that AMD would not bring AMD64 to market before Itanium's new IA-64 instruction set was widely adopted. Intel branded its implementation of AMD64 as EM64T, and later rebranded it Intel 64.

In their literature and product version names, Microsoft and Sun refer to AMD64/Intel 64 collectively as x64 in the Windows and Solaris operating systems. Linux distributions refer to it either as "x86-64", its variant "x86_64", or "amd64". BSD systems use "amd64", while macOS uses "x86_64".

Long mode is mostly an extension of the 32-bit instruction set, but unlike the 16-to-32-bit transition, many instructions were dropped in 64-bit mode. This does not affect actual binary backward compatibility (legacy code executes in other modes that retain support for those instructions), but it changes the way assemblers and compilers for new code have to work.

This was the first time that a major extension of the x86 architecture was initiated and originated by a manufacturer other than Intel. It was also the first time that Intel accepted technology of this nature from an outside source.

Extensions

Floating-point unit

Early x86 processors could be extended with floating-point co-processors with names like 8087, 80287 and 80387, abbreviated x87. This was also known as the NPX (Numeric Processor eXtension), an apt name since the coprocessors, while used mainly for floating-point calculations, also performed integer operations on both binary and decimal formats. With very few exceptions, the 80486 and subsequent x86 processors integrated this x87 functionality on chip, which made the x87 instructions a de facto integral part of the x86 instruction set.

Each x87 register, known as ST(0) through ST(7), is 80 bits wide and stores numbers in the IEEE floating-point standard double extended precision format. These registers are organized as a stack with ST(0) as the top. This was done in order to conserve opcode space, and the registers are therefore randomly accessible only for either operand in a register-to-register instruction; ST(0) must always be one of the two operands, either the source or the destination, regardless of whether the other operand is ST(x) or a memory operand. However, random access to the stack registers can be obtained through an instruction which exchanges any specified ST(x) with ST(0).
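The stack-relative addressing described above can be modelled with a small sketch. The class below is a toy, not an emulator: it uses Python floats instead of 80-bit values, ignores register tags and stack overflow, and its method names are illustrative stand-ins for instructions such as FLD, FADD, and FXCH.

```python
class X87Stack:
    """Toy model of the eight-register x87 stack; ST(i) is relative to the top."""

    def __init__(self):
        self.regs = [0.0] * 8  # physical registers
        self.top = 0           # physical index currently named ST(0)

    def _phys(self, i):
        return (self.top + i) % 8

    def st(self, i):
        """Read ST(i)."""
        return self.regs[self._phys(i)]

    def push(self, value):
        """Like FLD: the new value becomes ST(0) and older values move down."""
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def add_to_top(self, i):
        """Like FADD ST(0), ST(i): ST(0) is always one of the operands."""
        self.regs[self.top] += self.st(i)

    def exchange(self, i):
        """Like FXCH ST(i): swap ST(0) with ST(i) to reach any register."""
        a, b = self._phys(0), self._phys(i)
        self.regs[a], self.regs[b] = self.regs[b], self.regs[a]


stack = X87Stack()
for value in (1.0, 2.0, 3.0):
    stack.push(value)
stack.exchange(2)    # bring the oldest value to the top
stack.add_to_top(1)  # ST(0) = ST(0) + ST(1)
print(stack.st(0), stack.st(1), stack.st(2))
```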

The operations include arithmetic and transcendental functions, including trigonometric and exponential functions, and instructions that load common constants (such as 0; 1; π; log2(e); log2(10); and log10(2)) into one of the stack registers. While the integer ability is often overlooked, the x87 can operate on larger integers with a single instruction than the 8086, 80286, 80386, or any x86 CPU without 64-bit extensions can, and repeated integer calculations even on small values (e.g., 16-bit) can be accelerated by executing integer instructions on the x86 CPU and the x87 in parallel. (The x86 CPU keeps running while the x87 coprocessor calculates, and the x87 sets a signal to the x86 when it is finished or interrupts the x86 if it needs attention because of an error.)

MMX

MMX is a SIMD instruction set designed by Intel and introduced with the Pentium MMX microprocessor.^[40] The MMX instruction set was developed from a similar concept first used on the Intel i860. It is supported on most subsequent IA-32 processors by Intel and other vendors. MMX is typically used for video processing (in multimedia applications, for instance).^[41]

MMX added 8 new registers to the architecture, known as MM0 through MM7 (henceforth referred to as MMn). In reality, these new registers were just aliases for the existing x87 FPU stack registers. Hence, anything that was done to the floating-point stack would also affect the MMX registers. Unlike the FP stack, these MMn registers were fixed, not relative, and therefore they were randomly accessible. The instruction set did not adopt the stack-like semantics so that existing operating systems could still correctly save and restore the register state when multitasking without modifications.^[40]

Each of the MMn registers holds a 64-bit integer. However, one of the main concepts of the MMX instruction set is the concept of packed data types, which means that instead of using the whole register for a single 64-bit integer (quadword), one may use it to contain two 32-bit integers (doubleword), four 16-bit integers (word) or eight 8-bit integers (byte). Given that the MMX's 64-bit MMn registers are aliased to the FPU stack and each of the floating-point registers is 80 bits wide, the upper 16 bits of the floating-point registers are unused in MMX. These bits are set to all ones by any MMX instruction, which corresponds to the floating-point representation of NaNs or infinities.^[40]
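The packed data types can be illustrated with a short sketch that treats a Python integer as a 64-bit MMn register holding four 16-bit words. The helper names are assumptions made for this example; packed_add_words imitates the wraparound behaviour of a packed word addition such as PADDW, and aliased_x87_image shows the 80-bit view with the upper 16 bits set to ones.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def unpack_words(mm):
    """Split a 64-bit value into four 16-bit lanes, lowest lane first."""
    return [(mm >> (16 * i)) & 0xFFFF for i in range(4)]


def pack_words(lanes):
    """Combine four 16-bit lanes (lowest first) into one 64-bit value."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & 0xFFFF) << (16 * i)
    return value


def packed_add_words(a, b):
    """Lane-wise 16-bit addition; a carry never crosses into the next lane."""
    return pack_words([x + y for x, y in zip(unpack_words(a), unpack_words(b))])


def aliased_x87_image(mm):
    """80-bit image of the aliased FPU register: upper 16 bits all ones."""
    return (0xFFFF << 64) | (mm & MASK64)


a = pack_words([1, 2, 3, 0xFFFF])
b = pack_words([10, 20, 30, 1])
print(unpack_words(packed_add_words(a, b)))
print(hex(aliased_x87_image(a)))
```

The last lane wraps to zero instead of carrying, which is the difference between one 64-bit addition and four independent 16-bit additions.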

3DNow!

In 1997, AMD introduced 3DNow!, designed to improve the vector processing performance of graphics-intensive applications. 3D video game developers and 3D graphics hardware vendors use 3DNow! to enhance their performance on AMD's K6 and Athlon series of processors.^[43]

3DNow! was designed to be the natural evolution of MMX from integers to floating point. As such, it uses exactly the same register naming convention as MMX, that is, MM0 through MM7.^[44] The only difference is that instead of packing integers into these registers, two single-precision floating-point numbers are packed into each register. The advantage of aliasing the FPU registers is that the same instruction and data structures used to save the state of the FPU registers can also be used to save 3DNow! register states. Thus no special modifications are required to be made to operating systems which would otherwise not know about them.^[45]
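A brief sketch of the packing just described: two single-precision values share one 64-bit register image, and an operation works on both lanes at once. The helper names are illustrative, the little-endian lane order is an assumption of this example, and packed_add_singles only imitates the effect of a packed add such as PFADD.

```python
import struct


def pack_two_singles(low, high):
    """Pack two single-precision floats into one 64-bit integer (low lane first)."""
    return int.from_bytes(struct.pack("<2f", low, high), "little")


def unpack_two_singles(value):
    """Recover the two single-precision lanes from a 64-bit integer."""
    return struct.unpack("<2f", value.to_bytes(8, "little"))


def packed_add_singles(a, b):
    """Lane-wise single-precision addition on two packed 64-bit values."""
    xs, ys = unpack_two_singles(a), unpack_two_singles(b)
    return pack_two_singles(xs[0] + ys[0], xs[1] + ys[1])


a = pack_two_singles(1.5, 2.0)
b = pack_two_singles(0.25, 4.0)
print(unpack_two_singles(packed_add_singles(a, b)))
```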

SSE and AVX

In 1999, Intel introduced the Streaming SIMD Extensions (SSE)

HyperThreading technology. AMD licensed the SSE3 instruction set and implemented most of the SSE3 instructions for its revision E and later Athlon 64 processors. The Athlon 64 does not support HyperThreading and lacks those SSE3 instructions used only for HyperThreading.^[46]

SSE discarded all legacy connections to the FPU stack. This also meant that this instruction set discarded all legacy connections to previous generations of SIMD instruction sets like MMX. But it freed the designers up, allowing them to use larger registers, not limited by the size of the FPU registers. The designers created eight 128-bit registers, named XMM0 through XMM7. (In AMD64, the number of SSE XMM registers has been increased from 8 to 16.) However, the downside was that operating systems had to have an awareness of this new set of instructions in order to be able to save their register states. So Intel created a slightly modified version of Protected mode, called Enhanced mode which enables the usage of SSE instructions, whereas they stay disabled in regular Protected mode. An OS that is aware of SSE will activate Enhanced mode, whereas an unaware OS will only enter into traditional Protected mode.

SSE is a SIMD instruction set that works only on floating-point values, like 3DNow!. However, unlike 3DNow! it severs all legacy connection to the FPU stack. Because it has larger registers than 3DNow!, SSE can pack twice the number of single-precision floats into its registers. SSE2 introduced the capability to pack double precision numbers too, which 3DNow! had no possibility of doing, since a double precision number is 64 bits in size, which would be the full size of a single 3DNow! MMn register. At 128 bits, the SSE XMMn registers could pack two double precision floats into one register. Thus SSE2 is much more suitable for scientific calculations than either SSE1 or 3DNow!, which were limited to only single precision. SSE3 does not introduce any additional registers.^[46]

The Advanced Vector Extensions (AVX) doubled the size of SSE registers to 256-bit YMM registers. It also introduced the VEX coding scheme to accommodate the larger registers, plus a few instructions to permute elements. AVX2 did not introduce extra registers, but was notable for the addition of masking, gather, and shuffle instructions.

AVX-512 features yet another expansion to 32 512-bit ZMM registers and a new EVEX scheme. Unlike its predecessors featuring a monolithic extension, it is divided into many subsets that specific models of CPUs can choose to implement.
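The register widths mentioned in this section translate directly into how many floating-point elements fit in one register. The small calculation below is only arithmetic on those widths; the dictionary labels and the lanes helper are names chosen for this example.

```python
# Register widths in bits, as given in the text above.
REGISTER_BITS = {
    "3DNow! (MMn)": 64,
    "SSE (XMMn)": 128,
    "AVX (YMMn)": 256,
    "AVX-512 (ZMMn)": 512,
}


def lanes(register_bits, element_bits):
    """Number of elements of the given size that fit in one register."""
    return register_bits // element_bits


for name, bits in REGISTER_BITS.items():
    print(name, "singles:", lanes(bits, 32), "doubles:", lanes(bits, 64))
```

A 64-bit MMn register holds a single double-precision value at most, which is why packed double-precision arithmetic only became practical with the 128-bit XMM registers.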

Physical Address Extension (PAE)

Physical Address Extension or PAE was first added in the Intel Pentium Pro, and later by AMD in the Athlon processors,^[47] to allow up to 64 GB of RAM to be addressed. Without PAE, physical RAM in 32-bit protected mode is usually limited to 4 GB. PAE defines a different page table structure with wider page table entries and a third level of page table, allowing additional bits of physical address. Although the initial implementations on 32-bit processors theoretically supported up to 64 GB of RAM, chipset and other platform limitations often restricted what could actually be used. x86-64 processors define page table structures that theoretically allow up to 52 bits of physical address, although again, chipset and other platform concerns (like the number of DIMM slots available, and the maximum RAM possible per DIMM) prevent such a large physical address space from being realized. On x86-64 processors PAE mode must be active before the switch to long mode, and must remain active while long mode is active, so while in long mode there is no "non-PAE" mode. PAE mode does not affect the width of linear or virtual addresses.
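The figures above follow from simple arithmetic: 32 physical address bits reach 4 GB, and the 36 bits of the initial PAE implementations reach 64 GB. The sketch below also shows how a 32-bit linear address is divided under PAE with 4 KB pages (2 + 9 + 9 + 12 bits); that split reflects the three-level structure and is stated here as background rather than taken from the text, and the function names are illustrative.

```python
GIB = 1 << 30


def addressable_bytes(physical_address_bits):
    """Bytes reachable with the given number of physical address bits."""
    return 1 << physical_address_bits


def split_pae_linear_address(linear):
    """Split a 32-bit linear address for PAE paging with 4 KB pages."""
    pdpt_index = (linear >> 30) & 0x3          # 2 bits: third (top) level
    directory_index = (linear >> 21) & 0x1FF   # 9 bits
    table_index = (linear >> 12) & 0x1FF       # 9 bits
    offset = linear & 0xFFF                    # 12 bits
    return pdpt_index, directory_index, table_index, offset


print(addressable_bytes(32) // GIB, "GB without PAE")
print(addressable_bytes(36) // GIB, "GB with 36 physical address bits")
print(split_pae_linear_address(0xFFFFFFFF))
```

The linear address stays 32 bits wide in both cases; only the physical side of the mapping grows.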

x86-64

In supercomputer clusters (as tracked by TOP 500 data and visualized on the diagram above, last updated 2013), the appearance of 64-bit extensions for the x86 architecture enabled 64-bit x86 processors by AMD and Intel (teal hatched and blue hatched, in the diagram, respectively) to replace most RISC processor architectures previously used in such systems (including PA-RISC, SPARC, Alpha, and others), and 32-bit x86 (green on the diagram), even though Intel initially tried unsuccessfully to replace x86 with a new incompatible 64-bit architecture in the Itanium processor. The main non-x86 architecture which is still used, as of 2014, in supercomputing clusters is the Power ISA used by IBM Power microprocessors (blue with diamond tiling in the diagram), with SPARC as a distant second.

By the 2000s, 32-bit x86 processors' limits in memory addressing were an obstacle to their use in high-performance computing clusters and powerful desktop workstations. The aged 32-bit x86 was competing with much more advanced 64-bit RISC architectures which could address much more memory. Intel and the whole x86 ecosystem needed 64-bit memory addressing if x86 was to survive the 64-bit computing era, as workstation and desktop software applications were soon to start hitting the limits of 32-bit memory addressing. However, Intel felt that it was the right time to make a bold step and use the transition to 64-bit desktop computers for a transition away from the x86 architecture in general, an experiment which ultimately failed.

In 2001, Intel attempted to introduce a non-x86 64-bit architecture named IA-64 in its Itanium processor, initially aiming for the high-performance computing market, hoping that it would eventually replace the 32-bit x86.^[48] While IA-64 was incompatible with x86, the Itanium processor did provide emulation abilities for translating x86 instructions into IA-64, but this affected the performance of x86 programs so badly that it was rarely, if ever, actually useful to users: programmers had to rewrite x86 programs for the IA-64 architecture, or their performance on Itanium would be orders of magnitude worse than on a true x86 processor. The market rejected the Itanium processor because it broke backward compatibility, preferring to continue using x86 chips, and very few programs were rewritten for IA-64.

AMD decided to take another path toward 64-bit memory addressing, making sure backward compatibility would not suffer. In April 2003, AMD released the first x86 processor with 64-bit general-purpose registers, the Opteron, capable of addressing much more than 4 GB of virtual memory using the new x86-64 extension (also known as AMD64 or x64). The 64-bit extensions to the x86 architecture were enabled only in the newly introduced long mode, therefore 32-bit and 16-bit applications and operating systems could simply continue using an AMD64 processor in protected or other modes, without even the slightest sacrifice of performance^[49] and with full compatibility back to the original instructions of the 16-bit Intel 8086.^[50]^: 13–14 The market responded positively, adopting the 64-bit AMD processors for both high-performance applications and business or home computers.

Seeing the market rejecting the incompatible Itanium processor and Microsoft supporting AMD64, Intel had to respond and introduced its own x86-64 processor, the Prescott Pentium 4, in July 2004.^[51] As a result, the Itanium processor with its IA-64 instruction set is rarely used and x86, through its x86-64 incarnation, is still the dominant CPU architecture in non-embedded computers.

x86-64 also introduced the NX bit, which offers some protection against security bugs caused by buffer overruns.

As a result of AMD's 64-bit contribution to the x86 lineage and its subsequent acceptance by Intel, the 64-bit RISC architectures ceased to be a threat to the x86 ecosystem and almost disappeared from the workstation market. x86-64 began to be utilized in powerful

ARM 32-bit and 64-bit RISC architecture as a competitor in the smartphone and tablet market.

Virtualization

Prior to 2005, x86 architecture processors were unable to meet the

free and open-source systems include QEMU, Kernel-based Virtual Machine, VirtualBox, and Xen.

The introduction of the AMD-V and Intel VT-x instruction sets in 2005 allowed x86 processors to meet the Popek and Goldberg virtualization requirements.^[52]

AES

APX (Advanced Performance Extensions)

APX (Advanced Performance Extensions) are extensions to double the number of general-purpose registers from 16 to 32 and add new features to improve general-purpose performance.^[53]^[54]^[55]^[56] These extensions have been called "generational"^[57] and "the biggest x86 addition since 64 bits".^[58] Intel contributed APX support to GNU Compiler Collection (GCC) 14.^[59]

According to the architecture specification,^[60] the main features of APX are:

16 additional general-purpose registers, called the Extended GPRs (EGPRs)
Three-operand instruction formats for many integer instructions
New conditional instructions for loads, stores, and comparisons, along with versions of common instructions that do not modify flags
Optimized register save/restore operations
A 64-bit absolute direct jump instruction

Extended GPRs for general purpose instructions are encoded using 2-byte

AVX2/AVX-512 instructions are encoded with extended EVEX prefix which has four variants used for different groups of instructions.

Notes

^ Unlike the microarchitecture (and specific electronic and physical implementation) used for a specific microprocessor design.
GRID Compass laptop, for instance.

80286 processors.

7400 series support components, including multiplexers, buffers, and glue logic.

^ The actual meaning of iAPX was Intel Advanced Performance Architecture, or sometimes Intel Advanced Processor Architecture.

^ late 1981 to early 1984, approximately

architectures, which, due to the price sensitivity, low power, and hardware simplicity requirements, outnumber the x86.

^ The NEC V20 and V30 also provided the older 8080 instruction set, allowing PCs equipped with these microprocessors to operate CP/M applications at full speed (i.e., without the need to simulate an 8080 by software).

Fabless companies designed the chip and contracted another company to manufacture it, while fabbed companies would do both the design and the manufacturing themselves. Some companies started as fabbed manufacturers and later became fabless designers, one such example being AMD.

^ It had a slower FPU however, which is slightly ironic as Cyrix started out as a designer of fast floating-point units for x86 processors.

P5 Pentium during 1993 (as numbers could not be trademarked). However, the term x86 was already established among technicians, compiler writers etc.

^ 16-bit and 32-bit microprocessors were introduced during 1978 and 1985 respectively; plans for 64-bit were announced during 1999 and gradually introduced from 2003 onwards.

^ Some "CISC" designs, such as the PDP-11, may use two.

^ That is because integer arithmetic generates carry between subsequent bits (unlike simple bitwise operations).

^ Two MSRs of particular interest are SYSENTER_EIP_MSR and SYSENTER_ESP_MSR, introduced on the Pentium® II processor, which store the address of the kernel mode system service handler and corresponding kernel stack pointer. Initialized during system startup, SYSENTER_EIP_MSR and SYSENTER_ESP_MSR are used by the SYSENTER (Intel) or SYSCALL (AMD) instructions to achieve Fast System Calls, about three times faster than the software interrupt method used previously.

^ Because a segmented address is the sum of a 16-bit segment multiplied by 16 and a 16-bit offset, the maximum address is 1,114,095 (10FFEF hex), for an addressability of 1,114,096 bytes = 1 MB + 65,520 bytes. Before the 80286, x86 CPUs had only 20 physical address lines (address bit signals), so the 21st bit of the address, bit 20, was dropped and addresses past 1 MB were mirrors of the low end of the address space (starting from address zero). Since the 80286, all x86 CPUs have at least 24 physical address lines, and bit 20 of the computed address is brought out onto the address bus in real mode, allowing the CPU to address the full 1,114,096 bytes reachable with an x86 segmented address. On the popular IBM PC platform, switchable hardware to disable the 21st address bit was added to machines with an 80286 or later so that all programs designed for 8088/8086-based models could run, while newer software could take advantage of the "high" memory in real mode and the full 16 MB or larger address space in protected mode—see A20 gate.

^ An extra descriptor record at the top of the table is also required, because the table starts at zero but the minimum descriptor index that can be loaded into a segment register is 1; the value 0 is reserved to represent a segment register that points to no segment.

References

^ Pryce, Dave (May 11, 1989). "80486 32-bit CPU breaks new ground in chip density and operating performance. (Intel Corp.) (product announcement) EDN" (Press release).

ISBN 978-81-203-3594-3.

ISBN 978-81-8495-325-1.

^ Alcorn, Paul (February 9, 2022). "AMD Sets All-Time CPU Market Share Record as Intel Gains in Desktop and Notebook PCs". Tom's Hardware.

^ Brandon, Jonathan (April 15, 2015). "The cloud beyond x86: How old architectures are making a comeback". ICloud PE. Business Cloud News. Archived from the original on August 19, 2021. Retrieved November 23, 2020. Despite the dominance of x86 in the datacentre it is difficult to ignore the noise vendors have been making over the past couple of years around non-x86 architectures like ARM...

^ "June 2022". TOP500.

Phoronix. Retrieved June 1, 2022.

^ Dvorak, John C. "Whatever Happened to the Intel iAPX432?". Dvorak.org. Archived from the original on November 25, 2017. Retrieved April 18, 2014.

^ iAPX 286 Programmer's Reference (PDF). Intel. 1983. Archived (PDF) from the original on August 28, 2017. Retrieved August 28, 2017.

^ ^a ^b iAPX 86, 88 User's Manual (PDF). Intel. August 1981. Archived (PDF) from the original on August 28, 2017. Retrieved August 28, 2017.

^ Edwards, Benj (June 16, 2008). "Birth of a Standard: The Intel 8086 Microprocessor". PCWorld. Archived from the original on September 26, 2010. Retrieved September 14, 2014.

S2CID 16451604.

AMD. October 5, 1999. Archived from the original on March 2, 2000. "Time and again, processor architects have looked at the inelegant x86 architecture and declared it cannot be stretched to accommodate the latest innovations," said Nathan Brookwood, principal analyst, Insight 64.

^ Burt, Jeff (April 5, 2010). "Microsoft to End Intel Itanium Support". eWeek. Retrieved June 2, 2022.

^ ^a ^b "Intel 64 and IA-32 Architectures Optimization Reference Manual" (PDF). Intel. September 2019. 3.4.2.2 Optimizing for Macro-fusion. Archived (PDF) from the original on February 14, 2020. Retrieved March 7, 2020.

^ ^a ^b Fog, Agner. "The microarchitecture of Intel, AMD and VIA CPUs" (PDF). p. 107. Archived (PDF) from the original on March 22, 2019. Retrieved March 7, 2020. Core2 can do macro-op fusion only in 16-bit and 32-bit mode. Core Nehalem can also do this in 64-bit mode.

^ "Zet: The x86 (IA-32) open implementation: Overview". OpenCores. November 4, 2013. Archived from the original on February 11, 2018. Retrieved January 5, 2014.

^ "Zhaoxin Preparing Linux Kernel Support For 7-Series Centaur CPUs". www.phoronix.com. Retrieved April 5, 2022.

^ "Zhaoxin aiming at 2021 release for its 7nm x86 CPUs - CPU - News - HEXUS.net". m.hexus.net. Retrieved April 5, 2022.

^ "Zhaoxin Finally Adding "Lujiazui" x86_64 CPU Tuning To GCC". www.phoronix.com. Retrieved April 5, 2022.

^ "Setup and installation considerations for Windows x64 Edition-based computers". Archived from the original on September 11, 2014. Retrieved September 14, 2014.

^ "Envisioning a Simplified Intel Architecture". Intel.

Phoronix. Retrieved May 20, 2023.

^ "Processors — What mode of addressing do the Intel Processors use?". Archived from the original on September 11, 2014. Retrieved September 14, 2014.

^ "DSB Switches". Intel VTune Amplifier 2013. Intel. Archived from the original on December 2, 2013. Retrieved August 26, 2013.

^ "The 8086 Family User's Manual" (PDF). Intel Corporation. October 1979. p. 2-68. Archived (PDF) from the original on April 4, 2018. Retrieved March 28, 2018.

^ "iAPX 286 Programmer's Reference Manual" (PDF). Intel Corporation. 1983. 2.4.3 Memory Addressing Modes. Archived (PDF) from the original on August 28, 2017. Retrieved August 28, 2017.

^ 80386 Programmer's Reference Manual (PDF). Intel Corporation. 1986. 2.5.3.2 EFFECTIVE-ADDRESS COMPUTATION. Archived (PDF) from the original on December 28, 2018. Retrieved March 28, 2018.

^ ^a ^b Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture. Intel Corporation. March 2018. Chapter 3. Archived from the original on January 26, 2012. Retrieved March 19, 2014.

OCLC 1050453850.

^ "Guide to x86 Assembly". Cs.virginia.edu. September 11, 2013. Archived from the original on March 24, 2020. Retrieved February 6, 2014.

^ "FSTSW/FNSTSW — Store x87 FPU Status Word". Archived from the original on January 25, 2022. Retrieved January 15, 2020. The FNSTSW AX form of the instruction is used primarily in conditional branching...

^ Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture (PDF). Intel. March 2013. Chapter 8. Archived (PDF) from the original on April 2, 2013. Retrieved April 23, 2013.

^ "Intel 80287 family". CPU-world. Archived from the original on August 9, 2016. Retrieved July 21, 2016.

^ Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture (PDF). Intel. March 2013. Chapter 9. Archived (PDF) from the original on April 2, 2013. Retrieved April 23, 2013.

^ Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture (PDF). Intel. March 2013. Chapter 10. Archived (PDF) from the original on April 2, 2013. Retrieved April 23, 2013.

^ iAPX 286 Programmer's Reference (PDF). Intel. 1983. Section 1.2, "Modes of Operation". Archived (PDF) from the original on August 28, 2017. Retrieved January 27, 2014.

^ iAPX 286 Programmer's Reference (PDF). Intel. 1983. Chapter 6, "Memory Management and Virtual Addressing". Archived (PDF) from the original on August 28, 2017. Retrieved January 27, 2014.

^ "Intel's Yamhill Technology: x86-64 compatible |Geek.com". Archived from the original on September 5, 2012. Retrieved July 18, 2008.

^ ^a ^b ^c "Programming With the Intel MMX™ Technology". Embedded Pentium® Processor Family Technical Information Center. Intel. Archived from the original on July 25, 2003. Retrieved June 5, 2022.

ISSN 1937-4771.

^ Sexton, Michael Justin Allen (April 21, 2017). "The History Of AMD CPUs". Tom's Hardware. Retrieved June 5, 2022.

^ Shimpi, Anand Lal (October 29, 1998). "AMD's K6-2 350: Something to do..." AnandTech. Retrieved June 5, 2022.

^ "Intel's MMX and AMD's 3DNow! SIMD Operations". web.mit.edu. Retrieved June 5, 2022.

^ "3DNow!™ Technology Manual" (PDF). Advanced Micro Devices. Retrieved June 5, 2022.

^ ^a ^b "Upgrading And Repairing PCs 21st Edition: Processor Features". Tom's Hardware. October 31, 2013. Retrieved June 5, 2022.

^ AMD, Inc. (February 2002). "Appendix E" (PDF). AMD Athlon™ Processor x86 Code Optimization Guide (Revision K ed.). p. 250. Archived (PDF) from the original on April 13, 2017. Retrieved April 13, 2017. A 2-bit index consisting of PCD and PWT bits of the page table entry is used to select one of four PAT register fields when PAE (page address extensions) is enabled, or when the PDE doesn't describe a large page.

Techworld. Archived from the original on February 19, 2011. Retrieved December 19, 2010. Once touted by Intel as a replacement for the x86 product line, expectations for Itanium have been throttled well back.

^ "IBM WebSphere Application Server 64-bit Performance Demystified" (PDF). IBM Corporation. September 6, 2007. p. 14. Archived (PDF) from the original on January 25, 2022. Retrieved April 9, 2010. Figures 5, 6 and 7 also show the 32-bit version of WAS runs applications at full native hardware performance on the POWER and x86-64 platforms. Unlike some 64-bit processor architectures, the POWER and x86-64 hardware does not emulate 32-bit mode. Therefore applications that do not benefit from 64-bit features can run with full performance on the 32-bit version of WebSphere running on the above mentioned 64-bit platforms.

^ "Volume 2: System Programming" (PDF). AMD64 Architecture Programmer's Manual. AMD Corporation. March 2024. Archived (PDF) from the original on April 4, 2024. Retrieved April 24, 2024.

^ Charlie Demerjian (September 26, 2003). "Why Intel's Prescott will use AMD64 extensions". The Inquirer. Archived from the original on October 10, 2009. Retrieved October 7, 2009.

^ Adams, Keith; Agesen, Ole (October 21–25, 2006). A Comparison of Software and Hardware Techniques for x86 Virtualization (PDF). Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, USA, 2006. ACM 1-59593-451-0/06/0010. Archived (PDF) from the original on August 20, 2010. Retrieved December 22, 2006.

^ Winkel, Sebastian; Agron, Jason. "Advanced Performance Extensions (APX)". Intel. Retrieved October 22, 2023.

^ Robinson, Dan. "Intel adds fresh x86 and vector instructions for future chips". The Register. Retrieved October 22, 2023.

^ Bonshor, Gavin. "Intel Unveils AVX10 and APX Instruction Sets: Unifying AVX-512 For Hybrid Architectures". AnandTech. Retrieved October 22, 2023.

^ Alcorn, Paul (July 24, 2023). "Intel's New AVX10 Brings AVX-512 Capabilities to E-Cores". Tom's Hardware. Retrieved October 22, 2023.

^ Shah, Agam (August 9, 2023). "Intel's Generational On-Chip Change APX Will Make All the Apps Faster". The New Stack. Retrieved October 22, 2023.

^ Byrne, Joseph. "APX is Biggest x86 Addition Since 64 Bits". Tech Insights.

^ Larabel, Michael. "Intel APX Code Begins Landing Within The GCC Compiler". Phoronix. Retrieved October 22, 2023.

^ "Intel® Advanced Performance Extensions (Intel® APX) Architecture Specification". Intel. July 21, 2023. Retrieved October 22, 2023.

Further reading

Rosenblum, Mendel; Garfinkel, Tal (May 2005). "Virtual machine monitors: current technology and future trends". IEEE Computer. 38 (5): 39–47. S2CID 10385623.

External links

Why Intel can't seem to retire the x86

32/64-bit x86 Instruction Reference

Intel Intrinsics Guide, an interactive reference tool for Intel intrinsic instructions

Intel® 64 and IA-32 Architectures Software Developer's Manuals

AMD Developer Guides, Manuals & ISA Documents, AMD64 Architecture

v
t
e
x86 assembly topics
Topics

Assembly language

Comparison of assemblers

Disassembler

Instruction set

Low-level programming language

Machine code

Microassembler

x86 assembly language

Assemblers

A86/A386

Flat Assembler (FASM)

GNU Assembler (GAS)

High Level Assembly (HLA)

Microsoft Macro Assembler (MASM)

Netwide Assembler (NASM)

Turbo Assembler (TASM)

Open Watcom Assembler (WASM)

Programming
issues

Call stack

Flags
Carry flag

Direction flag

Interrupt flag

Overflow flag

Zero flag

Opcode

Program counter

Processor register

Calling conventions

Instruction listings

Registers

Instruction set extensions
SIMD (RISC)

Alpha
MVI

ARM
NEON

SVE

MIPS
MDMX

MIPS-3D

MXU

MIPS SIMD

PA-RISC
MAX

Power ISA
VMX

SPARC
VIS

SIMD (x86)

MMX (1996)

3DNow! (1998)

SSE (1999)

SSE2 (2001)

SSE3 (2004)

SSSE3 (2006)

SSE4 (2006)

SSE5 ~~(2007)~~

AVX (2008)

F16C (2009)

XOP (2009)

FMA (FMA4: 2011, FMA3: 2012)

AVX2
(2013)

AVX-512 (2015)

AMX (2022)

AVX10
(2023)

Bit manipulation

BMI
(ABM: 2007, BMI1: 2012, BMI2: 2013, TBM: 2012)

ADX (2014)

Compressed instructions

Thumb

MIPS16e ASE

RVC

Security and cryptography

PadLock (2003)

AES-NI (2008); ARMv8 also has AES instructions

CLMUL (2010)

RDRAND (2012)

SHA (2013)

MPX (2015)

SGX (2015)

TDX (2021)

Transactional memory

TSX (2013)

ASF

Virtualization

VT-x (2005)

AMD-V (2006)

VT-d (AMD-Vi)

Suspended extensions' dates are ~~struck through~~.




[4] Unlike the microarchitecture (and specific electronic and physical implementation) used for a specific microprocessor design.

[5] GRID Compass laptop, for instance.

[10] 80286 processors.

[11] 7400 series support components, including multiplexers, buffers, and glue logic.

[13] The actual meaning of iAPX was Intel Advanced Performance Architecture, or sometimes Intel Advanced Processor Architecture.

[16] Late 1981 to early 1984, approximately.

[18] architectures, which, due to the price sensitivity, low power, and hardware simplicity requirements, outnumber the x86.

[24] The NEC V20 and V30 also provided the older 8080 instruction set, allowing PCs equipped with these microprocessors to operate CP/M applications at full speed (i.e., without the need to simulate an 8080 by software).

[25] Fabless companies designed the chip and contracted another company to manufacture it, while fabbed companies would do both the design and the manufacturing themselves. Some companies started as fabbed manufacturers and later became fabless designers, one such example being AMD.

[26] It had a slower FPU, however, which is slightly ironic, as Cyrix started out as a designer of fast floating-point units for x86 processors.

[27] P5 Pentium during 1993 (as numbers could not be trademarked). However, the term x86 was already established among technicians, compiler writers, etc.

[35] 16-bit and 32-bit microprocessors were introduced in 1978 and 1985 respectively; plans for 64-bit were announced in 1999, and 64-bit processors were gradually introduced from 2003 onwards.

[36] Some "CISC" designs, such as the PDP-11, may use two.

[37] That is because integer arithmetic generates carry between subsequent bits (unlike simple bitwise operations).

[51] Two MSRs of particular interest are SYSENTER_EIP_MSR and SYSENTER_ESP_MSR, introduced on the Pentium® II processor, which store the address of the kernel mode system service handler and corresponding kernel stack pointer. Initialized during system startup, SYSENTER_EIP_MSR and SYSENTER_ESP_MSR are used by the SYSENTER (Intel) or SYSCALL (AMD) instructions to achieve Fast System Calls, about three times faster than the software interrupt method used previously.

[53] Because a segmented address is the sum of a 16-bit segment multiplied by 16 and a 16-bit offset, the maximum address is 1,114,095 (10FFEF hex), for an addressability of 1,114,096 bytes = 1 MB + 65,520 bytes. Before the 80286, x86 CPUs had only 20 physical address lines (address bit signals), so the 21st bit of the address, bit 20, was dropped and addresses past 1 MB were mirrors of the low end of the address space (starting from address zero). Since the 80286, all x86 CPUs have at least 24 physical address lines, and bit 20 of the computed address is brought out onto the address bus in real mode, allowing the CPU to address the full 1,114,096 bytes reachable with an x86 segmented address. On the popular IBM PC platform, switchable hardware to disable the 21st address bit was added to machines with an 80286 or later so that all programs designed for 8088/8086-based models could run, while newer software could take advantage of the "high" memory in real mode and the full 16 MB or larger address space in protected mode—see A20 gate.
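The arithmetic in this note can be checked with a short sketch. The Python below is illustrative only: the function names `linear_address` and `physical_address` and the `address_lines` parameter are assumptions made for this example, and the bit mask is a simplified model of dropping high address bits, not a description of any particular bus implementation.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: 16-bit segment multiplied by 16, plus 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset


def physical_address(segment: int, offset: int, address_lines: int = 20) -> int:
    """Keep only the bits the address bus can carry (hypothetical helper)."""
    return linear_address(segment, offset) & ((1 << address_lines) - 1)


# Highest reachable segmented address: FFFF:FFFF
top = linear_address(0xFFFF, 0xFFFF)
print(hex(top), top)  # 0x10ffef 1114095

# FFFF:0010 is the first byte past 1 MB.
print(hex(physical_address(0xFFFF, 0x0010)))                    # 0x0 (20 lines: wraps to zero)
print(hex(physical_address(0xFFFF, 0x0010, address_lines=24)))  # 0x100000 (24 lines: bit 20 kept)
```

With 20 address lines the address FFFF:0010 aliases address zero, which matches the mirroring described above; with 24 lines the same segmented address reaches the first byte above 1 MB.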

[55] An extra descriptor record at the top of the table is also required, because the table starts at zero but the minimum descriptor index that can be loaded into a segment register is 1; the value 0 is reserved to represent a segment register that points to no segment.

[1] Pryce, Dave (May 11, 1989). "80486 32-bit CPU breaks new ground in chip density and operating performance. (Intel Corp.) (product announcement) EDN" (Press release).

[2] ISBN 978-81-203-3594-3.

[3] ISBN 978-81-8495-325-1.

[6] Alcorn, Paul (February 9, 2022). "AMD Sets All-Time CPU Market Share Record as Intel Gains in Desktop and Notebook PCs". Tom's Hardware.

[7] Brandon, Jonathan (April 15, 2015). "The cloud beyond x86: How old architectures are making a comeback". ICloud PE. Business Cloud News. Archived from the original on August 19, 2021. Retrieved November 23, 2020. Despite the dominance of x86 in the datacentre it is difficult to ignore the noise vendors have been making over the past couple of years around non-x86 architectures like ARM...

[8] "June 2022". TOP500.

[9] Phoronix. Retrieved June 1, 2022.

[12] Dvorak, John C. "Whatever Happened to the Intel iAPX432?". Dvorak.org. Archived from the original on November 25, 2017. Retrieved April 18, 2014.

[i286-14] iAPX 286 Programmer's Reference (PDF). Intel. 1983. Archived (PDF) from the original on August 28, 2017. Retrieved August 28, 2017.

[i86-15] iAPX 86, 88 User's Manual (PDF). Intel. August 1981. Archived (PDF) from the original on August 28, 2017. Retrieved August 28, 2017.

[17] Edwards, Benj (June 16, 2008). "Birth of a Standard: The Intel 8086 Microprocessor". PCWorld. Archived from the original on September 26, 2010. Retrieved September 14, 2014.

[19] S2CID 16451604.

[20] AMD. October 5, 1999. Archived from the original on March 2, 2000. "Time and again, processor architects have looked at the inelegant x86 architecture and declared it cannot be stretched to accommodate the latest innovations," said Nathan Brookwood, principal analyst, Insight 64.

[21] Burt, Jeff (April 5, 2010). "Microsoft to End Intel Itanium Support". eWeek. Retrieved June 2, 2022.

[intel-optimization-for-macro-fusion-22] "Intel 64 and IA-32 Architectures Optimization Reference Manual" (PDF). Intel. September 2019. 3.4.2.2 Optimizing for Macro-fusion. Archived (PDF) from the original on February 14, 2020. Retrieved March 7, 2020.

[agner-fog-microarchitecture-23] Fog, Agner. "The microarchitecture of Intel, AMD and VIA CPUs" (PDF). p. 107. Archived (PDF) from the original on March 22, 2019. Retrieved March 7, 2020. Core2 can do macro-op fusion only in 16-bit and 32-bit mode. Core Nehalem can also do this in 64-bit mode.

[28] "Zet: The x86 (IA-32) open implementation: Overview". OpenCores. November 4, 2013. Archived from the original on February 11, 2018. Retrieved January 5, 2014.

[29] "Zhaoxin Preparing Linux Kernel Support For 7-Series Centaur CPUs". www.phoronix.com. Retrieved April 5, 2022.

[30] "Zhaoxin aiming at 2021 release for its 7nm x86 CPUs - CPU - News - HEXUS.net". m.hexus.net. Retrieved April 5, 2022.

[31] "Zhaoxin Finally Adding "Lujiazui" x86_64 CPU Tuning To GCC". www.phoronix.com. Retrieved April 5, 2022.

[32] "Setup and installation considerations for Windows x64 Edition-based computers". Archived from the original on September 11, 2014. Retrieved September 14, 2014.

[33] "Envisioning a Simplified Intel Architecture". Intel.

[34] Phoronix. Retrieved May 20, 2023.

[38] "Processors — What mode of addressing do the Intel Processors use?". Archived from the original on September 11, 2014. Retrieved September 14, 2014.

[39] "DSB Switches". Intel VTune Amplifier 2013. Intel. Archived from the original on December 2, 2013. Retrieved August 26, 2013.

[40] "The 8086 Family User's Manual" (PDF). Intel Corporation. October 1979. p. 2-68. Archived (PDF) from the original on April 4, 2018. Retrieved March 28, 2018.

[41] "iAPX 286 Programmer's Reference Manual" (PDF). Intel Corporation. 1983. 2.4.3 Memory Addressing Modes. Archived (PDF) from the original on August 28, 2017. Retrieved August 28, 2017.

[42] 80386 Programmer's Reference Manual (PDF). Intel Corporation. 1986. 2.5.3.2 EFFECTIVE-ADDRESS COMPUTATION. Archived (PDF) from the original on December 28, 2018. Retrieved March 28, 2018.

[addrmodes-43] Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture. Intel Corporation. March 2018. Chapter 3. Archived from the original on January 26, 2012. Retrieved March 19, 2014.

[Andriesse_2019-44] OCLC 1050453850.

[45] "Guide to x86 Assembly". Cs.virginia.edu. September 11, 2013. Archived from the original on March 24, 2020. Retrieved February 6, 2014.

[46] "FSTSW/FNSTSW — Store x87 FPU Status Word". Archived from the original on January 25, 2022. Retrieved January 15, 2020. The FNSTSW AX form of the instruction is used primarily in conditional branching...

[47] Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture (PDF). Intel. March 2013. Chapter 8. Archived (PDF) from the original on April 2, 2013. Retrieved April 23, 2013.

[48] "Intel 80287 family". CPU-world. Archived from the original on August 9, 2016. Retrieved July 21, 2016.

[49] Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture (PDF). Intel. March 2013. Chapter 9. Archived (PDF) from the original on April 2, 2013. Retrieved April 23, 2013.

[50] Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture (PDF). Intel. March 2013. Chapter 10. Archived (PDF) from the original on April 2, 2013. Retrieved April 23, 2013.

[52] iAPX 286 Programmer's Reference (PDF). Intel. 1983. Section 1.2, "Modes of Operation". Archived (PDF) from the original on August 28, 2017. Retrieved January 27, 2014.

[54] iAPX 286 Programmer's Reference (PDF). Intel. 1983. Chapter 6, "Memory Management and Virtual Addressing". Archived (PDF) from the original on August 28, 2017. Retrieved January 27, 2014.

[56] "Intel's Yamhill Technology: x86-64 compatible |Geek.com". Archived from the original on September 5, 2012. Retrieved July 18, 2008.

[intel-57] "Programming With the Intel MMX™ Technology". Embedded Pentium® Processor Family Technical Information Center. Intel. Archived from the original on July 25, 2003. Retrieved June 5, 2022.

[58] ISSN 1937-4771.

[59] Sexton, Michael Justin Allen (April 21, 2017). "The History Of AMD CPUs". Tom's Hardware. Retrieved June 5, 2022.

[60] Shimpi, Anand Lal (October 29, 1998). "AMD's K6-2 350: Something to do..." AnandTech. Retrieved June 5, 2022.

[61] "Intel's MMX and AMD's 3DNow! SIMD Operations". web.mit.edu. Retrieved June 5, 2022.

[62] "3DNow!™ Technology Manual" (PDF). Advanced Micro Devices. Retrieved June 5, 2022.

[tomshardware-63] "Upgrading And Repairing PCs 21st Edition: Processor Features". Tom's Hardware. October 31, 2013. Retrieved June 5, 2022.

[Athlon_PAE-64] AMD, Inc. (February 2002). "Appendix E" (PDF). AMD Athlon™ Processor x86 Code Optimization Guide (Revision K ed.). p. 250. Archived (PDF) from the original on April 13, 2017. Retrieved April 13, 2017. A 2-bit index consisting of PCD and PWT bits of the page table entry is used to select one of four PAT register fields when PAE (page address extensions) is enabled, or when the PDE doesn't describe a large page.

[65] Techworld. Archived from the original on February 19, 2011. Retrieved December 19, 2010. Once touted by Intel as a replacement for the x86 product line, expectations for Itanium have been throttled well back.

[x86-compat-perf-66] "IBM WebSphere Application Server 64-bit Performance Demystified" (PDF). IBM Corporation. September 6, 2007. p. 14. Archived (PDF) from the original on January 25, 2022. Retrieved April 9, 2010. Figures 5, 6 and 7 also show the 32-bit version of WAS runs applications at full native hardware performance on the POWER and x86-64 platforms. Unlike some 64-bit processor architectures, the POWER and x86-64 hardware does not emulate 32-bit mode. Therefore applications that do not benefit from 64-bit features can run with full performance on the 32-bit version of WebSphere running on the above mentioned 64-bit platforms.

[amd-24593-67] "Volume 2: System Programming" (PDF). AMD64 Architecture Programmer's Manual. AMD Corporation. March 2024. Archived (PDF) from the original on April 4, 2024. Retrieved April 24, 2024.

[68] Charlie Demerjian (September 26, 2003). "Why Intel's Prescott will use AMD64 extensions". The Inquirer. Archived from the original on October 10, 2009. Retrieved October 7, 2009.

[69] Adams, Keith; Agesen, Ole (October 21–25, 2006). A Comparison of Software and Hardware Techniques for x86 Virtualization (PDF). Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, USA, 2006. ACM 1-59593-451-0/06/0010. Archived (PDF) from the original on August 20, 2010. Retrieved December 22, 2006.

[70] Winkel, Sebastian; Agron, Jason. "Advanced Performance Extensions (APX)". Intel. Retrieved October 22, 2023.

[71] Robinson, Dan. "Intel adds fresh x86 and vector instructions for future chips". The Register. Retrieved October 22, 2023.

[72] Bonshor, Gavin. "Intel Unveils AVX10 and APX Instruction Sets: Unifying AVX-512 For Hybrid Architectures". AnandTech. Retrieved October 22, 2023.

[73] Alcorn, Paul (July 24, 2023). "Intel's New AVX10 Brings AVX-512 Capabilities to E-Cores". Tom's Hardware. Retrieved October 22, 2023.

[74] Shah, Agam (August 9, 2023). "Intel's Generational On-Chip Change APX Will Make All the Apps Faster". The New Stack. Retrieved October 22, 2023.

[75] Byrne, Joseph. "APX is Biggest x86 Addition Since 64 Bits". Tech Insights.

[76] Larabel, Michael. "Intel APX Code Begins Landing Within The GCC Compiler". Phoronix. Retrieved October 22, 2023.

[77] "Intel® Advanced Performance Extensions (Intel® APX) Architecture Specification". Intel. July 21, 2023. Retrieved October 22, 2023.
