|
Benchmarks and Benchmarking Systems:
As the saying goes: there are lies, damn lies and then
there are benchmarks.
Benchmarks can be useful and meaningful, if and only if we understand what they are measuring.
Not all processors are created equal! Some have very little
cache. Some cost too much. Some expend way too much power as heat instead of processing your
application. Some processors provide no upgrade path or backward compatibility, requiring that programs
be re-compiled or needing new versions of software to take advantage of the promised performance
increase.
Work completed per clock cycle is more important than the
Megahertz rating!
At Integrated Solutions and Systems we test every new processor,
motherboard and memory architecture to see if there really is a performance increase in newer products.
We also test for system stability running commonly used applications and benchmarks to make sure that
the systems we provide are stable.
There are very compelling reasons to
consider using AMD based workstations and servers versus Intel based
workstations and servers. They both run 32 bit and 64 bit operating
systems and applications but:
- AMD has integrated the memory
controller into the processor and therefore can run the memory
controller at the same speed as the processor. Faster processor,
Faster memory controller - NOT FSB limited.
- AMD has implemented a dedicated
interface (one of three on-chip HyperTransport links) just for inter-processor
communications. The memory controller on each processor is
independent of the inter-processor HyperTransport link - NOT FSB limited.
- AMD's architecture implements a
separate memory bank for each processor. Two processors equals two
memory banks which equals twice the memory bandwidth compared to
Intel's dual processor architecture.
- AMD's single processor, dual core
architecture on a workstation is almost identical to Intel's dual
processor Xeon architecture on a server - except that AMD's single
processor, dual core architecture still allows the memory controller
to run at the same speed as the processor. This single processor,
dual core architecture with integrated memory controller allows for
faster memory access as you increase the speed of the processor - it
is NOT FSB limited. Plus, all the system I/O uses a separate
HyperTransport bus, NOT the common FSB limited external memory
controller's bus.
- Another factor that is becoming more
important is processor power dissipation. You do the math:
  - AMD Single Core Athlon 64 Processors dissipate 89 Watts Maximum (All Parts)
  - Intel Single Core Pentium 4 Processors dissipate 115 Watts (3.4 GHz, 3.6 GHz and 3.8 GHz Parts)
  - Intel Single Core Xeon Processors dissipate 111 to 120 Watts (Not clear on data sheets)
  - AMD Dual Core Athlon 64 X2 Processors dissipate 110 Watts Maximum (All Parts)
  - AMD Dual Core Opteron Processors dissipate 95 Watts Maximum (Socket 940)
  - AMD Dual Core Opteron Processors dissipate 110 Watts Maximum (Socket 939 - New 1XX Series)
  - Intel Dual Core Pentium D Processors dissipate 130 Watts (3.0 GHz and 3.2 GHz Parts)
  - Intel Dual Core Xeon 2.8 GHz Processors dissipate 150 Watts Maximum
An argument can be made that an AMD
single processor, dual core architecture based system will give the
same or better performance than an Intel Dual Xeon single core based
system but the AMD based system will dissipate somewhere between 112
and 121 watts less, running cooler, doing the same job, in the same
case, with the same peripherals. The cost savings due to reduced
electrical usage and reduced cooling requirements could be
significant even if you are only running 8 machines.
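The arithmetic behind this comparison can be sketched as follows, using the maximum dissipation figures listed above. The pairing of parts, the hours of operation and the electricity price are assumptions chosen for illustration, not figures from our testing, and the exact difference depends on which parts are compared. Cooling costs are not included.

```python
# Maximum dissipation figures (watts) taken from the list above.
XEON_SINGLE_CORE_LOW_W = 111   # low end of the range on Intel's data sheets
AMD_DUAL_CORE_W = 110          # Athlon 64 X2 / Opteron 1XX maximum


def processor_savings_w(xeon_w, amd_w=AMD_DUAL_CORE_W, xeon_count=2):
    """Watts saved by one AMD dual core part versus `xeon_count` Xeons."""
    return xeon_count * xeon_w - amd_w


def yearly_cost_savings(watts_saved, machines,
                        hours_per_year=24 * 365,   # assumed: always on
                        price_per_kwh=0.10):       # assumed electricity price
    """Yearly electricity savings for a group of machines."""
    kwh = watts_saved * machines * hours_per_year / 1000.0
    return kwh * price_per_kwh


low_end = processor_savings_w(XEON_SINGLE_CORE_LOW_W)   # 2 * 111 - 110
print(f"Savings per machine at the low end: {low_end} W")
print(f"8 machines for one year: ${yearly_cost_savings(low_end, 8):.2f}")
```

Substitute your own electricity price and duty cycle to see what the difference means for your installation.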
Below are benchmark comparisons for today's higher performance
processors:
AMD Athlon 64 X2 versus Intel Pentium D
SPECint_rate2000 Benchmark:
Opteron 175 = Athlon 64 X2 4400+ (2.2 GHz, 2 x 1M L2 Cache - Dual Core)
Opteron 180 = Athlon 64 X2 4800+ (2.4 GHz, 2 x 1M L2 Cache - Dual Core)
Pentium D 840 (3.2 GHz, 2 x 1M L2 Cache, 800 MHz FSB)

To understand a little more about why these two differing "x86"
architectures provide different levels of performance see our
performance web page. It explains in
detail how a less expensive system with a "slower" processor can provide
higher overall performance.
AMD Opteron 2XX versus Intel Xeon DP Single and Dual Core Comparisons
SPECint_rate2000 and SPECfp_rate2000:
A processor that can execute
more instructions per clock cycle can run "slower" and provide the same
or better performance as a processor that executes fewer instructions per clock
cycle running "faster".
In simple terms, a system with processors and motherboard that implement a
better architecture, such as:
- 4x to 6x larger level 1 caches
- Faster CPU to memory controller bus or a processor integrated FSB
- Wider FSB and memory bus - 128 bits
- An Exclusive NON-shared FSB on each Processor
- An Exclusive NON-shared memory bank on each processor
can easily execute many more instructions per clock cycle.
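The relationship described above can be written as: instructions per second = instructions per clock cycle x clock frequency. The sketch below uses hypothetical instructions-per-clock (IPC) values purely for illustration; they are not measurements of any AMD or Intel processor.

```python
def instructions_per_second(ipc, clock_ghz):
    """Throughput in billions of instructions per second."""
    return ipc * clock_ghz


# Hypothetical processors: the IPC values are assumed, not measured.
slower_clock = instructions_per_second(ipc=1.5, clock_ghz=2.8)
faster_clock = instructions_per_second(ipc=1.0, clock_ghz=3.6)

print(f"2.8 GHz at 1.5 IPC: {slower_clock:.1f} billion instructions/s")
print(f"3.6 GHz at 1.0 IPC: {faster_clock:.1f} billion instructions/s")
```

With these assumed values the processor with the lower clock rate completes more work per second, which is the point of the paragraph above.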
We compared two systems
recently. One was a Dual Opteron 254, 2.8 GHz, 128K Level 1 Cache, 1 MB
Level 2 Cache, 1 GB (4 x 256MB) DDR-400 Registered ECC Memory, Windows
XP Pro with SP2 and the other was a Dual Xeon 3.6 GHz Nocona with 800
MHz FSB, 28K Level 1 Cache, 1 MB Level 2 Cache, 2 GB (2 x 1GB) DDR2-400
Registered ECC Memory, Windows XP Pro with SP2.
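As a rough guide to what each configuration can deliver on paper, peak bandwidth is the transfer rate multiplied by the bus width. The sketch below assumes each Opteron drives its own 128-bit DDR-400 memory bus, and that both Xeons share one 64-bit, 800 MHz FSB to the external memory controller. The bus widths and the sharing model are assumptions for illustration, and measured results will be lower than these theoretical peaks.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s, bus_width_bits):
    """Theoretical peak in MB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * bus_width_bits // 8


# Dual Opteron: one 128-bit DDR-400 memory bus per processor (assumed).
opteron_per_cpu = peak_bandwidth_mb_s(400, 128)
opteron_total = 2 * opteron_per_cpu

# Dual Xeon: both processors share a single 64-bit, 800 MHz FSB (assumed).
xeon_total = peak_bandwidth_mb_s(800, 64)

print(f"Dual Opteron peak: {opteron_total} MB/s ({opteron_per_cpu} MB/s per processor)")
print(f"Dual Xeon peak:    {xeon_total} MB/s shared by both processors")
```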
We ran the same version of the free benchmark program Sandra from
SiSoftware on both machines. Here are the results:
Intel Dual Xeon 3.6 GHz System CPU Benchmark:

AMD Dual Opteron 254, 2.8 GHz system CPU Benchmark:

Intel Dual Xeon 3.6 GHz System Memory Benchmark:

AMD Dual Opteron 254, 2.8 GHz system Memory Benchmark:

Sandra Benchmark Results:
CPU Benchmark (MIPS / MFLOPS):
- Intel Dual Xeon 3.6 GHz: 20818 Integer, 8780 Floating Point, 14445 SSE2 Floating Point
- AMD Dual Opteron 254, 2.8 GHz: 23611 Integer, 8862 Floating Point, 11478 SSE2 Floating Point
Memory Benchmark (Mbytes / Second):
- Intel Dual Xeon 3.6 GHz: 2701 Integer, 2689 Floating Point
- AMD Dual Opteron 254, 2.8 GHz: 11393 Integer, 11433 Floating Point
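To put the Sandra figures above on a common footing, the short sketch below divides the integer result by the clock speed and compares the memory results directly. The dictionary layout and function names are ours for illustration; the numbers are the results listed above.

```python
# Sandra results from the two test systems listed above.
results = {
    "xeon": {"clock_ghz": 3.6, "integer_mips": 20818,
             "sse2_mflops": 14445, "memory_int_mb_s": 2701},
    "opteron": {"clock_ghz": 2.8, "integer_mips": 23611,
                "sse2_mflops": 11478, "memory_int_mb_s": 11393},
}


def integer_mips_per_ghz(system):
    """Integer work completed per GHz of clock speed."""
    return system["integer_mips"] / system["clock_ghz"]


def memory_ratio(a, b):
    """How many times more integer memory bandwidth system `a` has than `b`."""
    return a["memory_int_mb_s"] / b["memory_int_mb_s"]


for name, system in results.items():
    print(f"{name}: {integer_mips_per_ghz(system):.0f} integer MIPS per GHz")

ratio = memory_ratio(results["opteron"], results["xeon"])
print(f"Opteron / Xeon memory bandwidth: {ratio:.2f}x")
```

Not every figure favors the same system: the Xeon posts the higher SSE2 floating point result, so which numbers matter depends on what your application does.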
The difference between routing both processors through one shared external memory controller
and giving each processor a dedicated memory controller with its own memory bank quickly becomes
self-evident. Whether your application will be affected depends
on how your application utilizes system resources. See the
Performance
page to understand how
system architecture trade-offs affect application performance. If you have questions about the relative
performance of different processor / system combinations and would like to discuss how this will affect
your application, please call us.
 |