64‐bit computing

Moving from 32 to 64 bits while maintaining compatibility

Orlando Ricardo Nunes Rocha

Informatics Department, University of Minho 4710 Braga, Portugal

orocha@deb.uminho.pt

Abstract.

EM64T is a recent technology adopted by Intel that allows its new processors (dual‐core, Xeon) to run 64‐bit software. To compete with its rival AMD, Intel had to adapt its own x86 architecture, increasing the memory addressing capabilities of its 32‐bit processors. The advantages of 64‐bit computing are the ability to work with a larger range of integer values and support for more memory. To give the x86 architecture 64‐bit capabilities, Intel added eight 64‐bit general‐purpose registers and eight additional registers for streaming SIMD extensions. A 64‐bit instruction pointer called RIP was introduced, along with other new features such as a fast interrupt‐prioritization mechanism, uniform byte‐register addressing and a new instruction‐pointer relative‐addressing mode. Arithmetic and logical operations are now directly supported for 64‐bit integers, and pushes and pops on the stack are executed with eight‐byte strides. This new technology allows both 32‐bit and 64‐bit applications to run simultaneously on a system with a 64‐bit operating system.

The problems related to 32‐bit/64‐bit compatibility can be reduced or avoided, allowing software producers a gradual transition from 32‐bit to 64‐bit software.

Introduction

The idea of supporting 32‐bit and 64‐bit software on the same processor has emerged in the last few years. AMD was the first company to explore this concept. It was followed by Intel, which named its own technology EM64T (Extended Memory 64 Technology). The principal idea behind these new features is to allow applications to address larger amounts of memory and to support the coexistence of 32‐bit and 64‐bit applications on the same processor [7]. EM64T extends virtual and physical memory addressing beyond the 4 GB limit of current 32‐bit processors [14]. Some modifications had to be made to the instruction set to support these new features. The system registers had to be altered to manage the new 64‐bit extensions alongside the old 32‐bit instructions; new registers were added to support 64‐bit integers and to increase CPU performance by reducing the number of times the CPU has to access memory to load instructions or to load and save data. Intel also had to widen the memory access bus to 64 bits because a larger amount of data has to be transported.

A new generation of Intel processors includes the EM64T extensions, such as newer versions of the Pentium 4, Pentium D, Pentium Extreme Edition, Celeron D, Xeon, and Core 2 processors [4]. At present, the most widely used processor in servers and workstations is the Xeon processor, which offers customers reliable 64‐bit support. It can install and run all existing 32‐bit applications for which 32‐bit execution remains critical, with excellent performance. The 64‐bit scientific, engineering, and design applications that need larger memory support can be used without code recompilation [9].

The SEARCH cluster of the Department of Informatics of the University of Minho has 96 recent Xeon processors, so it is possible to take advantage of all the EM64T characteristics mentioned in this paper, maintaining compatibility between 32‐bit and 64‐bit applications with low‐cost 64‐bit support.

64‐bit computing

64‐bit computing can be classified into three areas: 64‐bit data, 64‐bit addressing, and the 64‐bit software environment. The processor architecture has to provide 64‐bit integer registers, 64‐bit floating‐point registers, and 64‐bit data paths between processor, memory, cache memory and registers [12]. A 64‐bit architecture has two main advantages: a larger range of integer values and support for more memory. The larger integer range lets scientific and simulation applications produce the same result with fewer calculations than a 32‐bit application. This matters most for applications that perform a large number of integer calculations, because with 32‐bit registers two registers were needed to represent one 64‐bit number, causing many more memory accesses. In applications that use floating‐point math, however, the speed increase is less relevant, because the floating‐point registers keep the same 80‐bit length. 64‐bit applications can address up to 16 exabytes of RAM, but at present most PCs have an artificial limit on the amount of memory they can recognize, due to physical constraints [3]. Table 1 shows how the number of memory addresses grows from 4‐bit to 64‐bit computing.
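The cost of representing a 64‐bit integer in two 32‐bit registers can be illustrated with a minimal sketch. It simulates a 64‐bit addition on a 32‐bit machine by adding the low halves, propagating the carry, and then adding the high halves. The function names (add64_with_32bit_halves, split64, join64) are hypothetical and chosen only for this illustration.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32-bit register


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Recombine (low, high) 32-bit halves into one 64-bit value."""
    return (hi << 32) | lo


def add64_with_32bit_halves(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values using only 32-bit sized pieces."""
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32                  # 1 if the low halves overflowed
    lo = lo_sum & MASK32
    hi = (a_hi + b_hi + carry) & MASK32   # wraps around like a 64-bit register
    return lo, hi


a, b = 0x00000001FFFFFFFF, 0x0000000000000001
lo, hi = add64_with_32bit_halves(*split64(a), *split64(b))
print(hex(join64(lo, hi)))
```

On a 64‐bit register file the same addition is a single operation on a single register, which is the saving the paragraph above describes.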

Table 1: Differences in scale between architectures; source [13]

Bits   Binary   Number of memory addresses
4      2⁴       16
8      2⁸       256
10     2¹⁰      1,024
16     2¹⁶      65,536
32     2³²      4,294,967,296
64     2⁶⁴      18,446,744,073,709,551,616
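The values in Table 1 follow from the fact that n address bits can select 2ⁿ distinct addresses. As a small check, the sketch below recomputes the table rows; the helper name address_count is an assumption made for this example.

```python
def address_count(bits):
    """Number of distinct addresses selectable with the given number of bits."""
    return 1 << bits  # same as 2 ** bits


# Recompute the rows of Table 1 with exact integer arithmetic.
for bits in (4, 8, 10, 16, 32, 64):
    print(f"{bits:>2} bits  {address_count(bits):,}")
```

Python integers have arbitrary precision, so the 64‐bit row is printed exactly rather than rounded.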

EM64T technology

EM64T is an extension of the Intel IA‐32 architecture that lets 32‐bit and 64‐bit computing coexist in a single processor. To take advantage of this technology, the motherboard chipset and BIOS must provide 64‐bit support, and a 64‐bit operating system is required [14].

Enhancement of the Intel IA‐32

Eight new 64‐bit general‐purpose registers were added, named R8 through R15. These new registers are true general‐purpose registers, because they have no specialized task, unlike the eight old 32‐bit registers EAX, EBX, etc. The EAX, EBX, ECX and EDX registers were extended to 64 bits and take an “R” prefix in place of the “E” (RAX, RBX, RCX and RDX). The remaining registers, namely the index registers RSI and RDI, the base pointer RBP and the stack pointer RSP, were extended in the same way [11]. All 64‐bit registers keep the division scheme of the old 32‐bit registers, so they can still be used for 32‐bit, 16‐bit and 8‐bit operations. As in the 32‐bit scheme, the least significant bits of a register such as RAX are used for 8‐bit or 16‐bit operations; in 64‐bit mode the low byte of every general‐purpose register can be addressed in the same way, which is designated “uniform byte‐register addressing” [5]. The old EIP was widened into a new 64‐bit instruction pointer (RIP), so that it can hold any address in the 64‐bit address space. Figure 1 shows the new structure of the registers; alterations are displayed in purple.

Figure 1: New structure of the General purpose registers and XMM registers; Source [1]
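The register division scheme can be sketched with bit masks: the 32‐bit, 16‐bit and 8‐bit names refer to the least significant bits of the same 64‐bit register. The function name register_views is hypothetical, and the sketch only models how the views overlap, not how the hardware writes to them.

```python
def register_views(rax):
    """Return the sub-register views that alias the low bits of RAX."""
    rax &= 0xFFFFFFFFFFFFFFFF        # a register holds exactly 64 bits
    return {
        "RAX": rax,                  # all 64 bits
        "EAX": rax & 0xFFFFFFFF,     # low 32 bits
        "AX": rax & 0xFFFF,          # low 16 bits
        "AL": rax & 0xFF,            # low 8 bits
    }


views = register_views(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```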

Eight new XMM registers were also added for SIMD instructions (SSE, SSE2 and SSE3), giving a total of 16 XMM registers; they remain 128 bits wide. Each can store two 64‐bit floating‐point numbers, and they are used mostly by multimedia instructions that perform many calculations on real numbers.
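To make the register width concrete, the following sketch packs two 64‐bit floating‐point numbers into 16 bytes, which is the size of one 128‐bit XMM register. It uses Python's standard struct module; the names pack_xmm and unpack_xmm are assumptions for this illustration, and the little‐endian layout is chosen because x86 is a little‐endian architecture.

```python
import struct


def pack_xmm(low, high):
    """Pack two 64-bit doubles into a 16-byte (128-bit) little-endian value."""
    return struct.pack("<2d", low, high)


def unpack_xmm(raw):
    """Recover the two doubles from a 16-byte value."""
    return struct.unpack("<2d", raw)


raw = pack_xmm(1.5, -2.25)
print(len(raw) * 8, "bits")
print(unpack_xmm(raw))
```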

A model‐specific register (MSR), the extended feature enable register IA32_EFER, was added to allow the EM64T extended features to be enabled and disabled. It contains control bits, notably bit 8 and bit 10. Figure 2 shows the function of each bit in this 64‐bit register. The mechanism for enabling and disabling IA32e mode is explained in the section Controlling IA32e mode.

Figure 2: Extended feature enable MSR. Source [8]
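The two control bits mentioned above can be read from an IA32_EFER value with simple masks. The sketch below assumes the bit names used in Intel's manuals, which the text does not spell out: bit 8 is LME (IA32e mode enable) and bit 10 is LMA (IA32e mode active). The function name decode_efer is hypothetical, and the value is passed in as a plain integer because reading the real MSR requires privileged code.

```python
EFER_LME = 1 << 8    # bit 8: IA32e mode enable (name assumed from Intel manuals)
EFER_LMA = 1 << 10   # bit 10: IA32e mode active (name assumed from Intel manuals)


def decode_efer(value):
    """Report the state of the two IA32e-related bits of an IA32_EFER value."""
    return {
        "LME": bool(value & EFER_LME),
        "LMA": bool(value & EFER_LMA),
    }


print(decode_efer(0x500))   # both bit 8 and bit 10 set
print(decode_efer(0x100))   # only bit 8 set
```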

In 64‐bit mode the default address size is 64 bits and the default operand size is 32 bits. These defaults are overridden per instruction, only when necessary. A new instruction prefix called REX selects a 64‐bit operand size and gives access to the new registers. Not all instructions need a REX prefix; it is used only when an instruction uses a 64‐bit operand or one of the new registers.
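As an illustration of the REX prefix, the sketch below builds and decodes the prefix byte. The bit layout is taken from Intel's published instruction encoding rather than from the text above: the high nibble is 0100 and the low nibble holds the W, R, X and B bits, where W selects a 64‐bit operand size and R, X and B extend register fields so that R8 through R15 can be reached. The function names rex_prefix and decode_rex are hypothetical.

```python
def rex_prefix(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte with the layout 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def decode_rex(byte):
    """Return the W, R, X, B bits, or None if the byte is not a REX prefix."""
    if (byte & 0xF0) != 0x40:
        return None
    return {
        "W": (byte >> 3) & 1,   # 1 = 64-bit operand size
        "R": (byte >> 2) & 1,   # extends the ModRM reg field
        "X": (byte >> 1) & 1,   # extends the SIB index field
        "B": byte & 1,          # extends the ModRM r/m or SIB base field
    }


print(hex(rex_prefix(w=1)))
print(decode_rex(0x4C))
```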

A RIP‐relative (instruction‐pointer relative) addressing mode was added, which forms a memory address relative to the current value of the instruction pointer. That is, the address of an operand is computed by adding a displacement to the instruction pointer. This addressing mode uses a signed 32‐bit displacement, which allows an offset range of ±2 GB from the instruction pointer address [8].
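The address calculation described above can be sketched as follows. The function name rip_relative_target is hypothetical; the sketch assumes, as in Intel's manuals, that the displacement is added to the address of the next instruction, and it rejects displacements that do not fit in a signed 32‐bit field.

```python
DISP32_MIN = -(2 ** 31)      # lowest signed 32-bit displacement (-2 GB)
DISP32_MAX = 2 ** 31 - 1     # highest signed 32-bit displacement (just under +2 GB)


def rip_relative_target(next_rip, disp32):
    """Compute the effective address for RIP-relative addressing."""
    if not DISP32_MIN <= disp32 <= DISP32_MAX:
        raise ValueError("displacement does not fit in a signed 32-bit field")
    return (next_rip + disp32) & 0xFFFFFFFFFFFFFFFF   # wrap to 64 bits


print(hex(rip_relative_target(0x400000, -0x10)))
print(hex(rip_relative_target(0x400000, 0x1000)))
```

Because the target is expressed as an offset from the instruction pointer, code addressed this way does not depend on the absolute address at which it is loaded.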

Operating Modes

EM64T has two distinct operating modes, legacy mode and IA32e mode. The latter includes two sub‐modes, 64‐bit mode and compatibility mode.

Legacy mode: In this mode the processor works just like a normal IA‐32 processor, running only 32‐bit applications. The processor may operate in three different sub‐modes: protected mode, real‐address mode and system management mode [10].

IA32e mode: This is the new mode added by Intel; it makes it possible to run a 64‐bit operating system while still running unmodified 32‐bit applications. The 32‐bit applications continue to use an address space of at most 4 GB.

Compatibility mode: maintains binary compatibility with 16‐bit and 32‐bit applications, allowing them to coexist under a 64‐bit operating system. To execute an existing 16‐bit or 32‐bit application, the operating system sets the CS.L bit of its code‐segment descriptor to 0 [8].

64‐bit mode: the default address size is 64 bits and the default operand size is 32 bits. This mode gives the full memory advantages of a 64‐bit solution, but only 64‐bit applications will work. The mode is enabled by the operating system, which switches all registers and the instruction pointer to 64 bits, and applications gain access to the full physical memory range. Arithmetic and logical operations, memory‐to‐register and register‐to‐memory operations are directly supported for 64‐bit integers. A larger virtual address space and a larger physical address space can be used; in principle a total of 2⁶⁴ bytes, or 16 exabytes, can be addressed. Current EM64T processors are more limited because they have only 36 address lines, which means that 2³⁶ bytes, or 64 GB, of RAM can be addressed. The Xeon DP is slightly different because it has 40 address lines, allowing 2⁴⁰ bytes of address space [5‐7;9].
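The memory limits quoted above can be checked by converting a number of address lines into a capacity. This is a minimal sketch; the function name addressable and the unit list are assumptions, and the units are binary multiples (1 GB taken as 2³⁰ bytes), matching the way the text uses them.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB"]


def addressable(lines):
    """Express the memory reachable with the given address lines in binary units."""
    size = 1 << lines                 # 2 ** lines bytes
    unit = 0
    while size >= 1024 and unit < len(UNITS) - 1:
        size >>= 10                   # move up one binary unit
        unit += 1
    return f"{size} {UNITS[unit]}"


for lines in (32, 36, 40, 64):
    print(f"{lines} address lines: {addressable(lines)}")
```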

Controlling IA32e mode

As already mentioned, 64‐bit mode and compatibility mode are governed by control bits in the Extended Feature Enable Register (IA32_EFER) MSR and in the CS descriptor [8]. The IA32_EFER.LMA bit distinguishes legacy mode from IA32e mode, and the code‐segment descriptor bits (CS.L and CS.D) select the sub‐mode, 64‐bit mode or compatibility mode. Figure 3 shows the combinations of CS.L and CS.D that control the IA32e modes. If CS.L=1 and CS.D=0, the processor is running in 64‐bit mode, with a default operand size of 32 bits and an address size of 64 bits. Compatibility mode is active when CS.L=0, and CS.D then sets the operand and address sizes to 32 bits or 16 bits [8].

Figure 3: EM64T Processor modes; Source [8].
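The mode selection described in this section can be summarised as a small decision function. The name operating_mode is hypothetical. Two details are assumptions taken from Intel's manuals rather than from the text: in compatibility mode CS.D=1 selects 32‐bit defaults and CS.D=0 selects 16‐bit defaults, and the combination CS.L=1 with CS.D=1 is reserved.

```python
def operating_mode(lma, cs_l, cs_d):
    """Map IA32_EFER.LMA, CS.L and CS.D to the processor operating mode."""
    if not lma:
        return "legacy mode"
    if cs_l:
        # CS.L=1 with CS.D=1 is treated as reserved (assumption, see lead-in).
        return "reserved" if cs_d else "64-bit mode"
    # CS.L=0: compatibility mode, with CS.D choosing the default sizes.
    return "compatibility mode (32-bit)" if cs_d else "compatibility mode (16-bit)"


for bits in ((0, 0, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0)):
    print(bits, "->", operating_mode(*bits))
```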

Benchmarks

Figure 4 presents a test of a Pentium 4 processor with and without active 64‐bit extensions. This benchmark is published online on the Hardware.Fr website [2]. It covers different encoding, decoding, visual‐effects and calculation tasks to compare EM64T with the old IA‐32 legacy mode. Performance is measured as the time, in seconds, taken to complete a task (lower values correspond to better performance).

Figure 4: A benchmark test on a Pentium 4 660 with EM64T extensions activated and deactivated. Source [2].

As can be seen in the figure, the performance gain is not significant, and in some tasks the IA‐32 legacy mode outperforms EM64T. The increase in performance with EM64T is higher in image and video applications, which perform a larger number of floating‐point and integer calculations, so the larger number of general‐purpose registers and the additional XMM registers help to increase multimedia performance.

Conclusions

EM64T seems to be the ideal technology for a progressive transition from 32‐bit to 64‐bit applications. Since the majority of applications are still 32‐bit, it gives software developers a longer period of time to port their applications to 64 bits. The 64‐bit applications can run simultaneously with 32‐bit applications without recompilation.

The benchmark test does not exhibit a significant increase in performance with the EM64T extensions active, but other factors also influence the performance of a processor, such as the front‐side bus, pipeline depth, cache, etc. As the tests took place on the same processor, these factors should be taken into account when explaining the small performance increase of EM64T.

Perhaps this technology can be more efficient when it is used with database systems, because the larger address space allows a larger amount of data to be managed. In my personal opinion, the main advantage of using a processor with the EM64T extension is the possibility of running 32‐bit and 64‐bit applications at the same time, avoiding the need to acquire two different processors, one supporting 32 bits and one supporting 64 bits.

EM64T allows Intel to take advantage of the existing IA‐32 architecture, avoiding the cost of developing a new architecture to support both 32‐bit and 64‐bit applications.

As the SEARCH cluster has only Xeon processors, most of the problems associated with 32‐bit/64‐bit compatibility can be solved. Moreover, as the cluster is accessible to the scientific community of the University of Minho, it is valuable to have a system that supports both 32‐bit and 64‐bit applications.

References

[1] http://www.xbitlabs.com/articles/cpu/print/core2duo‐64bit.html. 24‐7‐2006.

[2] http://www.hardware.fr/news/lire/25‐02‐2005/#7320. 2007

[3] http://en.wikipedia.org/wiki/64‐bit. 2007

[4] http://en.wikipedia.org/wiki/EM64T. 2007

[5] http://www.hardwaresecrets.com/article/262. 2007

[6] David Watts and Robert Moon. IBM Eserver xSeries 366 Technical. 2005. IBM Redbooks Paper

[7] Garima Kochhar, Kalyana Chadalavada, Amina Saify and Rizwan Ali. (2004) BLAST on Intel EM64T Architecture. Dell.

[8] Intel. (2007) Intel® Extended Memory 64 Technology Software Developer's Guide Volume 1.

[9] Intel. (2004) The 64‐bit Tipping Point. Intel® Solutions.

[10] Intel Corporation. (2006) Intel® 64 and IA‐32 Architectures Software Developer's Manual.

[11] James Leiterman. (2005) 32/64‐BIT 80 x 86 Assembly Language Architecture.

[12] Jerry Haigh. 64‐Bit Computing Solves the World's Most Complex Problems. 1996.

[13] John Coombs and John Fruehe. (2004) Planning Considerations for Intel Extended Memory 64 Technology on Servers and Workstations. Dell.

[14] Ramesh Radhakrishnan, Jimmy Pike and Skipper Smith. (2004) INTEL® EXTENDED MEMORY 64 TECHNOLOGY (EM64T). Dell.
