X86 architecture
x86 or 80x86 is the generic name of a microprocessor architecture first developed and manufactured by Intel. The x86 architecture currently dominates the desktop computer, portable computer, and small server markets.
The architecture is called x86 because the earliest processors in this family were identified by model numbers ending in the sequence "86": the 8086, the 80186, the 80286, the 386, and the 486. Because one cannot trademark numbers, Intel and most of its competitors began to use trademarkable names such as Pentium for subsequent generations of processors, but the earlier naming scheme has stuck as a term for the entire family.
The architecture has twice been extended to a larger word size. In 1985, Intel released the 32-bit 386 to replace the 16-bit 286. The 32-bit architecture is called x86-32 or IA-32 (an abbreviation for Intel Architecture, 32-bit). In 2003, AMD introduced the Athlon 64, which implemented a further extension to the architecture to 64 bits, variously called x86-64, AMD64 (AMD), EM64T or IA-32e (Intel), and x64 (Microsoft).
History
The x86 architecture first appeared in the Intel 8086 CPU in 1978. The 8086 was a development of the Intel 8080 processor (which itself followed the 4004 and 8008), and programs in 8080 assembly language could be mechanically translated into equivalent programs in 8086 assembly language. Three years later it was adopted (in the externally simpler 8088 version) as the standard CPU of the IBM PC. The ubiquity of the PC platform has made the x86 one of the most successful CPU architectures ever. (Another extremely successful CPU design, the Zilog Z80, is also based on the 8080 and is compatible with it at the machine-language binary level.)
Other companies also manufacture or have manufactured CPUs conforming to the x86 architecture: examples include Cyrix (now owned by VIA Technologies), NEC Corporation, IBM, IDT (now also owned by VIA), and Transmeta. The most successful of the clone manufacturers has been AMD, whose Athlon series, whilst not as popular as the Pentium series, has a significant market share. According to several market research companies, AMD topped Intel's retail desktop CPU sales in 2006.
Note that Intel also introduced a separate 64-bit architecture used in its Itanium processors which it calls IA-64 or more recently IPF (Itanium Processor Family). IA-64 is a completely new system that bears no resemblance whatsoever to the x86 architecture; it should not be confused with IA-32, which is essentially synonymous with the 32-bit version of x86.
Design
The x86 architecture is a CISC design with variable instruction length. Word-sized memory accesses to unaligned addresses are allowed. Words are stored in little-endian order. Backwards compatibility has always been a driving force behind the development of the x86 architecture (the design decisions this has required are often criticised, particularly by proponents of competing processors, who are frustrated by the continued success of an architecture widely perceived as inferior). Current x86 processors add a few extra decoding steps that, during execution, split most x86 instructions into smaller pieces (called "micro-ops"), which are then readily executed by a RISC-like micro-architecture.
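As a minimal illustration of little-endian word storage, the following Python sketch packs a 16-bit word the way an x86 CPU lays it out in memory. The value 1234h is an arbitrary example and does not come from the text above.

```python
import struct

# Pack the 16-bit word 0x1234 in little-endian order:
# "<" selects little-endian, "H" an unsigned 16-bit word.
word = 0x1234
stored = struct.pack("<H", word)

# The least significant byte sits at the lowest address.
low_byte, high_byte = stored[0], stored[1]
print(f"{low_byte:02X} {high_byte:02X}")  # prints "34 12"

# Reading the bytes back in the same order recovers the word.
restored = struct.unpack("<H", stored)[0]
```

The low byte (34h) comes first in memory, followed by the high byte (12h).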
The x86 assembly language is discussed in more detail in the x86 assembly language article.
Real mode
The Intel 8086 and 8088 had fourteen 16-bit registers. Four of them (AX, BX, CX, DX) were general purpose, although each also had an additional purpose; for example, only CX can be used as a counter with the loop instruction. Each could be accessed as two separate bytes (thus BX's high byte can be accessed as BH and its low byte as BL). In addition, there were four segment registers (CS, DS, SS and ES), which are used to form a memory address. There were two pointer registers: SP, which points to the top of the stack, and BP, which can be used to point at some other place in the stack or in memory. There were two index registers (SI and DI), which can be used to index into an array. Finally, there was the flag register (containing flags such as carry, overflow and zero) and the instruction pointer (IP), which points at the next instruction to be executed.
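The byte-wise access to the general-purpose registers can be sketched as follows. This is only an illustrative Python model; the helper names split_register and join_register are hypothetical, and the value 12ABh is an arbitrary example.

```python
def split_register(value16):
    """Return (high byte, low byte) of a 16-bit register value,
    e.g. BX -> (BH, BL)."""
    value16 &= 0xFFFF
    return (value16 >> 8) & 0xFF, value16 & 0xFF


def join_register(high, low):
    """Rebuild the 16-bit register value from its two byte halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


bx = 0x12AB
bh, bl = split_register(bx)   # bh == 0x12, bl == 0xAB

# Writing only the low half (as "mov bl, 0FFh" would) leaves BH untouched.
bx_after = join_register(bh, 0xFF)
```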
In real mode, memory access is segmented. This is done by shifting the segment address left by 4 bits and adding an offset to obtain a final 20-bit address. For example, if DS is A000h and SI is 5677h, DS:SI will point at the absolute address DS × 16 + SI = A5677h. Thus the total address space in real mode is 2^20 bytes, or 1 MiB, quite an impressive figure for 1978. All memory addresses consist of both a segment and an offset; every type of access (code, data, or stack) has a default segment register associated with it (for data the register is usually DS, for code it is CS, and for stack it is SS). For data accesses, the segment register can be explicitly specified (using a segment override prefix) to use any of the four segment registers.
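The address calculation can be written out as a short Python sketch. The function name real_mode_address is hypothetical; the final mask models the 20 address lines of the 8086, on which sums beyond FFFFFh wrap around.

```python
def real_mode_address(segment, offset):
    """Form a 20-bit real-mode address from a 16-bit segment and offset."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # Shift the segment left by 4 bits (multiply by 16) and add the offset.
    # The mask keeps 20 bits, as on the 8086's 20-line address bus.
    return ((segment << 4) + offset) & 0xFFFFF


# The example from the text: DS = A000h, SI = 5677h.
address = real_mode_address(0xA000, 0x5677)
print(f"{address:05X}h")  # prints "A5677h"
```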
In this scheme, two different segment/offset pairs can point at a single absolute location. Thus, if DS is A111h and SI is 4567h, DS:SI will point at the same A5677h as above. In addition to this aliasing, the scheme makes it impossible to use more than four segments at once. Moreover, CS and SS are vital for the correct functioning of the program, so only DS and ES can be used to point to data segments outside the program (or, more precisely, outside the currently executing segment of the program) or the stack. This scheme, which was intended as a compatibility measure with the Intel 8085, is often cited by programmers as a cause of much grief (though some programmers do not mind it so much, and the popularity of the x86 in the years before protected mode was introduced suggests that it is not an extremely serious flaw).
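To make the overlap concrete, the following Python sketch enumerates every segment/offset pair that resolves to a given absolute address. The generator name aliases is hypothetical, and address wraparound past 1 MiB is deliberately ignored to keep the example small.

```python
def aliases(address):
    """Yield every (segment, offset) pair with segment * 16 + offset == address.

    Simplification: pairs that only reach the address by wrapping
    around the top of the 1 MiB address space are not considered.
    """
    # The low 4 bits of the address must come from the offset,
    # so candidate offsets are 16 bytes apart.
    for offset in range(address & 0xF, 0x10000, 16):
        remainder = address - offset
        if remainder < 0:
            break
        segment = remainder >> 4
        if segment <= 0xFFFF:
            yield segment, offset


pairs = list(aliases(0xA5677))
print(len(pairs))                   # number of distinct pairs for A5677h
print((0xA000, 0x5677) in pairs)    # the first example from the text
print((0xA111, 0x4567) in pairs)    # the second example from the text
```

Both pairs mentioned in the text appear in the list, alongside several thousand others that name the same byte.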
In addition, the 8086 had 64 KB of 8-bit (or alternatively 32 K words of 16-bit) I/O space, and a 64 KB (one segment) stack in memory supported by hardware (using the aforementioned SS, SP, and BP registers). Only words (2 bytes) can be pushed onto the stack. The stack grows downwards (toward numerically lower addresses), with its top pointed to by SS:SP. There are 256 interrupts, which can be invoked by both hardware and software. Interrupts can nest, using the stack to store the return address.
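The downward-growing word stack can be modelled in a few lines of Python. This is a rough sketch, not a description of the hardware's internals: the class name RealModeStack and the starting values of SS and SP are hypothetical, and memory is simply a 1 MiB byte array.

```python
class RealModeStack:
    """Toy model of the 8086 stack: 16-bit words, growing toward lower addresses."""

    def __init__(self, ss, sp):
        self.memory = bytearray(1 << 20)   # 1 MiB real-mode address space
        self.ss = ss & 0xFFFF
        self.sp = sp & 0xFFFF

    def _address(self, offset):
        # Real-mode address formation: segment * 16 + offset, kept to 20 bits.
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, word):
        # SP is decremented by 2 first, then the word is stored little-endian.
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self._address(self.sp)] = word & 0xFF
        self.memory[self._address(self.sp + 1)] = (word >> 8) & 0xFF

    def pop(self):
        # The word at SS:SP is read, then SP is incremented by 2.
        low = self.memory[self._address(self.sp)]
        high = self.memory[self._address(self.sp + 1)]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low


stack = RealModeStack(ss=0x2000, sp=0x0100)
stack.push(0x1234)      # SP becomes 00FEh
stack.push(0xBEEF)      # SP becomes 00FCh
sp_after_pushes = stack.sp
first_popped = stack.pop()    # BEEFh, last in, first out
second_popped = stack.pop()   # 1234h
```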
Modern 32-bit x86 CPUs still support real mode, and in fact start up in real mode after reset. Real mode code running on these processors can take advantage of the 32-bit wide registers and additional segment registers (FS and GS) offered since the 80386.
16-bit protected mode
The Intel 80286 could run 8086 real-mode 16-bit operating systems without any changes, but it also supported another mode of operation called protected mode, which expanded addressable physical memory to 16 MB and addressable virtual memory to 1 GB. This was done by using the segment registers only to store an index into a segment table. There were two such tables, the GDT and the LDT, each holding up to 8192 segment descriptors, with each segment giving access to up to 64 KB of memory. The segment table provided a 24-bit base address, which was then added to the desired offset to create an absolute address. In addition, each segment could be given one of four privilege levels (called rings).
Although these additions were an improvement, they were not widely used because a protected-mode operating system could not run existing real-mode software as processes. In theory it could, but many DOS programs access hardware directly and some perform segment arithmetic, and therefore could not run directly in protected mode.
To address this, Intel introduced Virtual 8086 mode in the 386. Code running in this mode forms linear addresses the real-mode way but is still subject to paging, and the OS can trap I/O accesses and, through paging, memory accesses.
In the meantime, operating systems like OS/2 tried to ping-pong the processor between protected and real modes. This was both slow and unsafe, as in real mode a program could easily crash the computer. OS/2 also defined restrictive programming rules which allowed a Family API or bound program to run in either real mode or protected mode. This, however, concerned programs originally designed for protected mode, not vice versa. By design, protected-mode programs did not assume any relation between selector values and physical addresses. It is sometimes mistakenly believed that the problems with running real-mode code in 16-bit protected mode resulted from IBM having chosen to use Intel-reserved interrupts for BIOS calls. They actually stem from such programs using arbitrary selector values, performing the "segment arithmetic" described above on them, and accessing hardware directly.
This problem also appeared with Windows 3.0. Ideally, this release was meant to run programs in 16-bit protected mode, whereas previously they had run in real mode. Theoretically, if a Windows 1.x or 2.x program was written "properly" and avoided segment arithmetic, it would run equally well in real and protected modes. Windows programs generally avoided segment arithmetic because Windows implemented a software virtual memory scheme and moved program code and data in memory when programs were not running, so manipulating absolute addresses was dangerous; programs were supposed to keep only handles to memory blocks when not running, and such handles were already quite similar to protected-mode selectors. Starting an old program while Windows 3.0 was running in protected mode triggered a warning dialog, suggesting either to run Windows in real mode (it could presumably still use expanded memory, possibly emulated with EMM386 on 80386 machines, so it was not limited to 640 KB) or to obtain an updated version from the vendor. Well-behaved programs could be "blessed" using a special tool to avoid this dialog. It was not possible to have some GUI programs running in 16-bit protected mode and other GUI programs running in real mode, probably because this would have required two separate environments and (on the 80286) would have been subject to the previously mentioned ping-ponging of the processor between modes. In version 3.1, real mode was dropped.
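The 1 GB virtual figure follows from the numbers above, and the way a segment register value selects a descriptor can be sketched in Python. The bit layout used here (a 13-bit index, one bit choosing between GDT and LDT, and a 2-bit requested privilege level) is the standard protected-mode selector format but is not spelled out in the text, so treat it as an added assumption; the function names are hypothetical.

```python
DESCRIPTORS_PER_TABLE = 8192        # 2**13 entries in each of the GDT and LDT
TABLES = 2                          # GDT and LDT
MAX_SEGMENT_BYTES = 64 * 1024       # 16-bit offsets within a segment

# 2 tables x 8192 segments x 64 KB = 2**30 bytes = 1 GB of virtual memory.
virtual_bytes = TABLES * DESCRIPTORS_PER_TABLE * MAX_SEGMENT_BYTES

# A 24-bit base address reaches 2**24 bytes = 16 MB of physical memory.
physical_bytes = 1 << 24


def decode_selector(selector):
    """Split a 16-bit selector into its fields (assumed standard layout)."""
    return {
        "index": (selector >> 3) & 0x1FFF,            # bits 3-15: table index
        "table": "LDT" if selector & 0x4 else "GDT",  # bit 2: table indicator
        "rpl": selector & 0x3,                        # bits 0-1: privilege level
    }


def absolute_address(base24, offset16):
    """Add a 16-bit offset to the 24-bit base taken from a descriptor."""
    return ((base24 & 0xFFFFFF) + (offset16 & 0xFFFF)) & 0xFFFFFF


fields = decode_selector(0x002F)    # index 5, LDT, privilege level 3
```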
32-bit protected mode
The Intel 80386 introduced, perhaps, the greatest leap so far in the x86 architecture. With the notable exception of the Intel 80386SX, which was 32-bit yet only had 24-bit addressing (and a 16-bit data bus), it was all 32-bit - all the registers, instructions, I/O space and memory. To work with the latter, it used a 32-bit extension of Protected Mode. As it was in the 286, segment registers were used to index inside a segment table that described the division of memory. Unlike the 286, however, inside each segment one could use 32-bit offsets, which allowed every application to access up to 4 GB without segmentation and even more if segmentation was used. In addition, 32-bit protected mode supported paging, a mechanism which made it possible to use virtual memory.
No new general-purpose registers were added. All 16-bit registers except the segment registers were expanded to 32 bits. Intel represented this by adding "E" to the register mnemonics (thus the expanded AX became EAX, SI became ESI and so on). Since there was a greater number of registers, instructions and operands, the machine code format was expanded as well. To provide backwards compatibility, segments that contain executable code can be marked as containing either 16-bit or 32-bit instructions. In addition, special prefixes can be used to include 32-bit instructions in a 16-bit segment and vice versa.
Paging and segmented memory access were both required in order to support a modern multitasking operating system. Linux, 386BSD and Windows NT were all initially developed for the 386, because it was the first CPU that made it possible to reliably separate programs' memory (each into its own address space) and to preempt them when necessary (using rings). The basic architecture of the 386 became the basis of all further development in the x86 series.
The Intel 80387 math co-processor was integrated into the next CPU in the series, the Intel 80486. The new FPU could be used to make floating point calculations, important for scientific calculation and graphic design.
MMX and beyond
1996 saw the appearance of Intel's MMX technology (the name is variously expanded as Matrix Math Extensions or Multi-Media Extensions). While the new technology was advertised widely and vaguely, its essence is very simple: MMX added eight 64-bit SIMD registers, overlaid onto the FPU stack, to the Intel Pentium CPU design. Unfortunately, these instructions were not easily mappable to the code generated by ordinary C compilers, and Microsoft, the dominant compiler vendor, was slow to support them even as intrinsics. MMX is also limited to integer operations. These technical shortcomings caused MMX to have little impact in its early existence. Nowadays, MMX is typically used for some 2D video applications.
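The idea behind integer SIMD on a 64-bit register can be sketched in Python: one 64-bit value is treated as eight independent 8-bit lanes, and an addition is applied to every lane at once without carries crossing lane boundaries. This mimics the wraparound behaviour of a packed byte add such as the MMX PADDB instruction; the function name packed_add_bytes and the operand values are hypothetical.

```python
def packed_add_bytes(a, b):
    """Add two 64-bit values as eight independent 8-bit lanes (wraparound)."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        x = (a >> shift) & 0xFF
        y = (b >> shift) & 0xFF
        # Each lane wraps modulo 256; no carry leaks into the next lane.
        result |= ((x + y) & 0xFF) << shift
    return result


a = 0x01020304050607FF
b = 0x0101010101010101

packed = packed_add_bytes(a, b)           # lowest lane wraps FFh + 01h -> 00h
scalar = (a + b) & 0xFFFFFFFFFFFFFFFF     # ordinary 64-bit add, carry propagates

print(f"{packed:016X}")  # prints "0203040506070800"
print(f"{scalar:016X}")  # prints "0203040506070900"
```

The two results differ only in the second-lowest byte, where the ordinary addition carries over from the lowest byte and the packed addition does not.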
3DNow!
In 1998 AMD introduced 3DNow!, which consisted of SIMD floating-point instruction enhancements to MMX (targeting the same MMX registers). While this did not solve the compiler difficulties, the introduction of this technology coincided with the rise of 3D entertainment applications in the PC space. 3D video game developers and 3D graphics hardware vendors used 3DNow! to help enhance their performance on AMD's K6 and Athlon series of processors.
SSE
In 1999 Intel introduced the SSE instruction set, which added eight new 128-bit registers (not overlaid on other registers). These instructions were analogous to AMD's 3DNow! in that they primarily added floating-point SIMD.
SSE2
In 2000 Intel introduced the SSE2 instruction set, which added (1) a complete complement of integer instructions (analogous to MMX) operating on the original SSE registers and (2) 64-bit (double-precision) SIMD floating-point instructions operating on the same registers. The first addition made MMX almost obsolete, and the second allowed the instructions to be realistically targeted by conventional compilers.
SSE3
Introduced in 2004 along with the Prescott revision of the Pentium 4 processor, SSE3 added specific memory and thread-handling instructions to boost the performance of Intel's HyperThreading technology. AMD later licensed the SSE3 instruction set for its latest (E) revision Athlon 64 processors. The SSE3 implementation on the new Athlons lacks only a couple of the instructions that Intel designed for HyperThreading, since the Athlon 64 does not support HyperThreading; however, software still recognizes SSE3 as supported on the platform.
64-bit
By 2002, it was obvious that the 32-bit address space of the x86 architecture was limiting its performance in applications requiring large data sets. A 32-bit address space would allow the processor to directly address only 4 GB of data - a size frequently surpassed by applications such as video processing or database engines.
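The address-space limits quoted throughout this article all follow from the width of the address in bits. A small Python sketch (the function name addressable_bytes is hypothetical) makes the arithmetic explicit:

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 1 << address_bits


MIB = 1 << 20
GIB = 1 << 30

real_mode = addressable_bytes(20)        # 8086 real mode: 20-bit addresses
protected_286 = addressable_bytes(24)    # 80286: 24-bit physical addresses
flat_32 = addressable_bytes(32)          # 32-bit offsets on the 80386 and later

print(real_mode // MIB, "MiB")       # prints "1 MiB"
print(protected_286 // MIB, "MiB")   # prints "16 MiB"
print(flat_32 // GIB, "GiB")         # prints "4 GiB"
```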
Intel had originally decided not to extend x86 to 64 bits as it had to 32 bits, and instead introduced a new architecture called IA-64. IA-64 technology is the basis for its Itanium line of processors. IA-64 provides backward compatibility with older 32-bit x86 code; this mode of operation, however, is exceedingly slow.
AMD took the initiative of extending the 32-bit x86 (which Intel calls IA-32) to 64-bit. It came up with an architecture, called AMD64 (or x86-64, prior to rebranding), and based the Opteron and Athlon 64 family of processors on this technology. The success of the AMD64 line of processors coupled with the lukewarm reception of the IA-64 architecture prompted Intel to adopt the AMD64 instruction set, adding some new extensions of its own and branding it the EM64T architecture. In its literature and product version names, Microsoft refers to this processor architecture as x64.
This was the first time that a major upgrade of the x86 architecture was initiated and originated by a manufacturer other than Intel. Perhaps more importantly, it was the first time that Intel actually accepted technology of this nature from an outside source.
Virtualization
x86 virtualization is difficult because the architecture does not meet the Popek and Goldberg virtualization requirements. Nevertheless, there are several commercial x86 virtualization products, such as VMware and Microsoft Virtual PC. There is also an open source virtualization project, Xen. Intel and AMD have both announced that future x86 processors will have new enhancements to facilitate more efficient virtualization. Intel's code names for its virtualization features are "Vanderpool" and "Silvervale"; AMD uses the code name "Pacifica".
System-on-a-chip (SOC)
An x86 system-on-a-chip is a combination of an x86 CPU core with a northbridge (memory controller) and a southbridge (input/output (I/O) controller) in a single integrated circuit (IC).
Manufacturers
x86 and compatibles have been designed, manufactured and sold by a number of companies.
See also
- IA-32
- x86 assembly language
- x86 instruction listings
- Real mode — Unreal mode — Virtual 8086 mode — Protected mode — Long mode