SSE2
SSE2 is one of the IA-32 SIMD instruction sets, first introduced by Intel with the initial version of the Pentium 4 in 2000. It extends the earlier SSE instruction set and is intended to fully supplant MMX. SSE2 has itself been extended by SSE3, also known as "Prescott New Instructions", which Intel introduced to the Pentium 4 in early 2004.
Rival chip-maker AMD added support for SSE2 with the introduction of their Opteron and Athlon 64 ranges of 64-bit CPUs in 2003, and in 2005 added support for the SSE3 instruction set with an updated "E" revision of their processors.
Changes
SSE2 adds support for 64-bit double-precision floating point and for 64-, 32-, 16- and 8-bit integer operations on the eight 128-bit XMM registers first introduced with SSE. SSE2 adds no additional program state to that provided by SSE.
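As a rough illustration of the two kinds of operation described above, the following sketch adds arrays of doubles two at a time and arrays of 32-bit integers four at a time using the SSE2 compiler intrinsics from `<emmintrin.h>`. The function names `add_f64` and `add_i32` and the `HAVE_SSE2` macro are hypothetical, chosen only for this example, and a plain scalar loop is used when the compiler does not target SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1      /* hypothetical macro used only in this sketch */
#else
#define HAVE_SSE2 0
#endif

/* Adds two arrays of doubles; n must be even in this sketch. */
void add_f64(const double *a, const double *b, double *out, size_t n)
{
    assert(n % 2 == 0);
#if HAVE_SSE2
    for (size_t i = 0; i < n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);   /* two doubles per XMM register */
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
#else
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
#endif
}

/* Adds two arrays of 32-bit integers; n must be a multiple of 4.
   Inputs are assumed small enough that the sums do not overflow. */
void add_i32(const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
    assert(n % 4 == 0);
#if HAVE_SSE2
    for (size_t i = 0; i < n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* four 32-bit additions in one instruction */
        _mm_storeu_si128((__m128i *)(out + i), _mm_add_epi32(va, vb));
    }
#else
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

The unaligned load and store intrinsics are used so that callers need not align their arrays; aligned variants exist and may be faster on older processors.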
The addition of 128-bit integer SIMD operations allows the programmer to completely avoid the eight 64-bit MMX registers "aliased" on the original IA-32 floating point register stack. This permits mixing integer SIMD and scalar floating point operations without the mode switching required between MMX and x87 floating point operations. However, this is overshadowed by the value of being able to perform integer SIMD operations on the wider SSE registers.
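The point about mixing integer SIMD with scalar floating point can be sketched as follows. The hypothetical function `mean_u8` sums bytes with an SSE2 integer instruction on the XMM registers and then divides in ordinary scalar floating point, with no MMX state to clear in between. The name and the `HAVE_SSE2` macro are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1      /* hypothetical macro used only in this sketch */
#else
#define HAVE_SSE2 0
#endif

/* Returns the arithmetic mean of n bytes; n must be a nonzero multiple of 16. */
double mean_u8(const uint8_t *p, size_t n)
{
    assert(n > 0 && n % 16 == 0);
    uint64_t total = 0;
#if HAVE_SSE2
    const __m128i zero = _mm_setzero_si128();
    for (size_t i = 0; i < n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(p + i));
        /* Sum of absolute differences against zero yields two partial
           byte sums, one in each 64-bit half of the register. */
        __m128i s = _mm_sad_epu8(v, zero);
        total += (uint32_t)_mm_cvtsi128_si32(s);
        total += (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(s, 8));
    }
#else
    for (size_t i = 0; i < n; ++i)
        total += p[i];
#endif
    /* Scalar floating point follows directly after the integer SIMD code. */
    return (double)total / (double)n;
}
```

With MMX, equivalent code would need to leave MMX mode before the final division if that division were compiled to x87 instructions.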
Other SSE2 extensions include a set of cache-control instructions intended primarily to minimize cache pollution when processing indefinite streams of information, and a sophisticated complement of numeric format conversion instructions.
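A minimal sketch of both kinds of extension, using intrinsics from `<emmintrin.h>`: `fill_i32_stream` writes a buffer with non-temporal stores, which ask the processor not to keep the written data in cache, and `halve_pair` converts two integers to double precision and back. Both function names and the `HAVE_SSE2` macro are hypothetical, and the scalar fallback is an assumption added so the example also builds without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1      /* hypothetical macro used only in this sketch */
#else
#define HAVE_SSE2 0
#endif

/* Fills dst[0..n) with value, using streaming stores where possible. */
void fill_i32_stream(int32_t *dst, size_t n, int32_t value)
{
    assert(dst != NULL || n == 0);
    size_t i = 0;
#if HAVE_SSE2
    /* Streaming stores need a 16-byte aligned address, so write
       leading elements normally until that alignment is reached. */
    while (i < n && ((uintptr_t)(dst + i) % 16) != 0)
        dst[i++] = value;
    const __m128i v = _mm_set1_epi32(value);
    for (; i + 4 <= n; i += 4)
        _mm_stream_si128((__m128i *)(dst + i), v);  /* non-temporal store */
    _mm_sfence();  /* order the streaming stores before later stores */
#endif
    for (; i < n; ++i)   /* remaining tail, or everything without SSE2 */
        dst[i] = value;
}

/* Converts two int32 values to double, halves them, and converts back,
   truncating toward zero. */
void halve_pair(const int32_t in[2], int32_t out[2])
{
#if HAVE_SSE2
    __m128i vi = _mm_loadl_epi64((const __m128i *)in);  /* 64 bits = two int32 */
    __m128d vd = _mm_mul_pd(_mm_cvtepi32_pd(vi), _mm_set1_pd(0.5));
    __m128i vr = _mm_cvttpd_epi32(vd);                  /* truncating convert */
    _mm_storel_epi64((__m128i *)out, vr);
#else
    out[0] = (int32_t)(in[0] * 0.5);
    out[1] = (int32_t)(in[1] * 0.5);
#endif
}
```

Streaming stores tend to pay off only for buffers that will not be read again soon; for small or immediately reused data, ordinary stores are usually the better choice.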
AMD's implementation of SSE2 on the AMD64 platform includes an additional 8 registers, doubling the total number to 16 (XMM0 through XMM15). These additional registers are only visible when running in 64-bit mode. Intel adopted these additional registers as part of its support for the AMD64 architecture (which Intel named EM64T) in 2004.
Differences between x87 double-precision and SSE2
The FPU (x87) instructions hold intermediate results in 80-bit registers, normally with extended precision, whereas SSE2 double-precision operations round every result to 64 bits. When legacy FPU software algorithms are ported to SSE2, certain combinations of math operations or input datasets can therefore result in measurable numerical deviation. This is of critical importance to scientific computations if the results must be compared against results generated on a different machine architecture.
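The effect can be imitated in portable C, as in the following sketch. It assumes that `long double` is wider than `double`, which is typically the case with GCC and Clang on x86 (where it is the 80-bit x87 format) but is not guaranteed by the C standard. The function names `sum_double` and `sum_extended` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the array, rounding to 64-bit double after every addition,
   as SSE2 double-precision arithmetic does. */
double sum_double(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    volatile double acc = 0.0;   /* volatile forces a rounded 64-bit store */
    for (size_t i = 0; i < n; ++i)
        acc = acc + x[i];
    return acc;
}

/* Sums the array in long double and rounds once at the end, imitating
   intermediates kept in extended-precision registers. */
double sum_extended(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    long double acc = 0.0L;
    for (size_t i = 0; i < n; ++i)
        acc += x[i];
    return (double)acc;
}
```

For the input 9007199254740992.0, 1.0, -9007199254740992.0 (that is, 2^53, 1, -2^53), `sum_double` returns 0.0 because 2^53 + 1 is not representable in a 64-bit double and rounds back to 2^53. `sum_extended` returns 1.0 wherever `long double` has at least a 64-bit significand, and 0.0 on platforms where `long double` is the same as `double`.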
Compiler usage
When first introduced in 2000, SSE2 was not supported by software development tools. For example, to use SSE2 in a Microsoft Developer Studio project, the programmer had to either write inline assembly manually or import object code from an external source (such as code assembled with Microsoft MASM).
The Intel C Compiler can automatically generate SSE/SSE2 code without the use of hand-coded assembly, letting programmers focus on algorithmic development instead of assembly-level implementation. Since its introduction, the Intel C Compiler has greatly increased adoption of SSE2 in Windows application development.
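A loop of the following shape is the kind of code a vectorizing compiler may turn into SSE2 instructions without any assembly being written. This is only a sketch: the function name `daxpy` is chosen for illustration, and whether a given compiler actually vectorizes the loop depends on the compiler, its version and the optimization options in use (with GCC, for example, `-O3 -msse2`).

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] += a * x[i] for every element.
   restrict promises the compiler that x and y do not overlap,
   which removes one common obstacle to automatic vectorization. */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Inspecting the generated assembly for packed instructions such as `mulpd` and `addpd` is one way to confirm that vectorization took place.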
CPUs supporting SSE2
- AMD Athlon 64
- AMD Athlon 64 X2
- AMD Opteron
- AMD Sempron (Socket 754/939 versions only)
- AMD Turion 64
- Intel Pentium 4
- Intel Pentium D
- Intel Pentium EE
- Intel Pentium M
- Intel Celeron (Socket 478 versions only)
- Intel Celeron D
- Intel Celeron M
- Intel Core Solo/Duo
- Transmeta Efficeon
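Rather than relying on a list of processor models, a program can ask the processor at run time whether it supports SSE2: the CPUID instruction reports it in bit 26 of the EDX register for leaf 1 (SSE is bit 25). The sketch below uses the `__get_cpuid` helper from the `<cpuid.h>` header shipped with GCC and Clang; the function names `cpuid_leaf1_edx_bit` and `cpu_has_sse2` and the `HAVE_CPUID_H` macro are hypothetical. On other compilers or non-x86 targets it simply reports no support.

```c
#include <assert.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>        /* GCC/Clang wrapper around the CPUID instruction */
#define HAVE_CPUID_H 1    /* hypothetical macro used only in this sketch */
#else
#define HAVE_CPUID_H 0
#endif

/* Returns 1 if the given EDX feature bit of CPUID leaf 1 is set, else 0. */
int cpuid_leaf1_edx_bit(unsigned bit)
{
    assert(bit < 32);
#if HAVE_CPUID_H
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                      /* leaf 1 not available */
    return (int)((edx >> bit) & 1u);
#else
    (void)bit;
    return 0;                          /* cannot query on this target */
#endif
}

/* SSE2 support is reported in EDX bit 26 of CPUID leaf 1. */
int cpu_has_sse2(void)
{
    return cpuid_leaf1_edx_bit(26);
}
```

A program would typically call `cpu_has_sse2()` once at startup and select an SSE2 or a fallback code path accordingly.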