Dual-Core Intel Xeon Chips Now Available
Original Article Date: 2006-07-05
The long-awaited release of the Intel Dempsey/Woodcrest Dual-Core
Xeon processors took place just a couple of weeks ago.
At the recent Intel Channel Conference, I learned all about these new
chips, the architecture behind them, how they handled the high-bandwidth memory
requirements for servers and workstations, the new chipsets and platforms that
will support them and their new lower power consumption.
I'm always a little skeptical about Big & Exciting New Releases from
manufacturers, believing that half the time they're just an excuse to sell new
stuff for more money, simply repackaged. But with the launch of the dual-core
Xeon, I was genuinely impressed. This is perhaps the single biggest
overhaul of Intel architecture in a very long time. In order to
understand why I think this is a revolution in server chips, let us first
address the shortcomings of the previous Intel Xeon generation.
Xeons From Pentium III to Irwindale - the RAM Bottleneck
The previous-generation single-core Nocona/Irwindale Intel Xeons were the
last in a long line of Intel server CPUs going back to the original "Pentium
III" Xeons of the late 1990s. The primary evolution of these chips was improved
chip clock speed, RAM front-side-bus (FSB) speed and increased L2 cache size.
The chip and platform architecture remained essentially unchanged - all the
chips were single core, and in a dual-socket system both CPUs accessed RAM
through a single memory controller hub (MCH) via the common front-side bus.
This architecture worked reasonably well until dual core came around. When you
double the number of CPU cores on a chip, you basically double the RAM
bandwidth required by those cores, and in a dual-socket system, this amounts to
effectively four CPUs placing demands on the system
memory through a single common front-side bus. The result is
data saturation of the bus - a bottleneck.
Dempsey has Dual-Independent Bus
The new dual-core Xeon platform features Dual-Independent-Bus (DIB),
which means that each CPU socket has its own direct and independent line to the
MCH and RAM. Add this to the fact that the new MCHs will support a maximum
FSB speed as high as 1333MHz (up from 800MHz) and you can see that each
dual-core socket has plenty of headroom on memory bandwidth.
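
To put rough numbers on that headroom, here is a minimal sketch of the peak
theoretical bandwidth per socket. It assumes a 64-bit (8-byte) wide FSB data
path and treats the quoted "MHz" figures as effective transfer rates; both are
my assumptions rather than figures from the conference, and the function names
are purely illustrative.

```python
# Peak theoretical FSB bandwidth per socket, old shared bus vs. new DIB.
# Assumptions (not from the article): the FSB data path is 64 bits (8 bytes)
# wide, and the quoted "MHz" figures are effective transfer rates.
FSB_WIDTH_BYTES = 8


def fsb_bandwidth_gb_s(transfer_rate_mhz: float) -> float:
    """Peak bandwidth of a single bus in GB/s (1 GB = 10**9 bytes)."""
    return transfer_rate_mhz * 1e6 * FSB_WIDTH_BYTES / 1e9


def per_socket_bandwidth_gb_s(transfer_rate_mhz: float, buses: int,
                              sockets: int = 2) -> float:
    """Bandwidth each socket gets when `sockets` share `buses` buses evenly."""
    return fsb_bandwidth_gb_s(transfer_rate_mhz) * buses / sockets


# Old platform: two sockets share one 800MHz bus.
old_per_socket = per_socket_bandwidth_gb_s(800, buses=1)
# New platform: each of the two sockets has its own 1333MHz bus.
new_per_socket = per_socket_bandwidth_gb_s(1333, buses=2)

print(f"shared bus: {old_per_socket:.1f} GB/s per socket")
print(f"DIB:        {new_per_socket:.1f} GB/s per socket")
```

These are peak figures only; real-world throughput will be lower, but the
ratio between the two platforms is the point.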
One could still argue that the AMD Opteron model of placing the MCH onto the
CPU die is advantageous, in that it provides a very short and
direct physical path to the MCH, but there is a downside to this. Keeping
the MCH separate from the chip allows upgrades to the MCH without having to
release a new chip version. An on-die MCH has, I believe, held AMD back from
moving from DDR to DDR-2 as soon as they would have liked.
But I digress. In short, Intel have addressed the bottleneck problem by
providing two FSBs to the MCH, whilst still leaving the MCH off-die, for better
future upgradeability.
FB-DIMMs bring ultimate flexibility and scaling to Server RAM Populations
Another RAM problem that plagued the previous generation of Xeon CPUs was
populating the available DIMM slots of a motherboard. In the late 1990s,
overall system RAM requirements were small, and DDR was designed back then,
when the average DIMM size was perhaps 32MB at most. Since then, of course, the
average DIMM has grown to 512MB or 1GB, with 2GB modules not uncommon. The
problem with the increase in size, whilst still using the old DDR model, has
been the proliferation of ranks.
A rank is a term that applies to a registered DIMM, and affects
Xeon and Opteron based systems. It refers to a certain number of DRAMs
(the actual RAM chips that sit on the DIMM) that can be accessed simultaneously
by the register chip on the DIMM. A rank is made up of
either x4 (meaning four bits wide) or x8 (eight bits wide) DRAMs, and the
combined data width of the DRAMs in a rank cannot exceed 64 bits. With x4,
this usually means that a double-sided DIMM will be made up of 16 x4 DRAMs -
giving 64 bits of data width in total, so in this case the DIMM would be
single-rank. As pressure for larger and larger DIMM sizes came from server
users, RAM manufacturers began using cheaper x8 DRAMs, but there was still a
64-bit limit per rank. So DIMMs became "dual-rank", with 8 x8 DRAMs (one
64-bit rank) on each side of the DIMM.
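
The rank arithmetic can be sketched in a few lines. This is a simplified
illustration only: it uses the 64-bit data width per rank described above and
ignores any extra DRAMs a DIMM may carry for ECC. The function name is my own.

```python
# Number of ranks on a DIMM, from the DRAM count and DRAM width.
# Simplification: only the 64 data bits per rank are counted; extra DRAMs
# used for ECC are ignored.
RANK_WIDTH_BITS = 64


def ranks_on_dimm(dram_count: int, dram_width_bits: int) -> int:
    """Return how many 64-bit ranks the DRAMs on one DIMM form."""
    total_bits = dram_count * dram_width_bits
    if total_bits % RANK_WIDTH_BITS != 0:
        raise ValueError("DRAMs do not add up to a whole number of ranks")
    return total_bits // RANK_WIDTH_BITS


print(ranks_on_dimm(16, 4))  # 16 x4 DRAMs: single-rank
print(ranks_on_dimm(16, 8))  # 16 x8 DRAMs (8 per side): dual-rank
```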
This was fine, until the user wanted to start adding LOTS of DIMMs to their
server, as the mainboard MCHs had limits to the maximum number of ranks
that could be addressed server-wide. So when the user wanted lots of RAM
cheaply, he would use x8 dual-rank modules, but would hit the MCH's rank
limitation twice as quickly. The only alternative was to go to the more
expensive x4 single-rank modules.
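
As a rough sketch of how the rank limit bites, the snippet below works out how
many DIMMs can actually be used for a given rank limit. The limit of 8 ranks
and the 8 DIMM slots are hypothetical values chosen for illustration, not the
specification of any particular MCH or mainboard.

```python
# How many DIMMs fit under an MCH rank limit.
# The rank limit and slot count used below are hypothetical examples.
def max_usable_dimms(mch_rank_limit: int, ranks_per_dimm: int,
                     dimm_slots: int) -> int:
    """DIMMs that can be populated before slots or ranks run out."""
    return min(dimm_slots, mch_rank_limit // ranks_per_dimm)


HYPOTHETICAL_RANK_LIMIT = 8
HYPOTHETICAL_SLOTS = 8

single_rank = max_usable_dimms(HYPOTHETICAL_RANK_LIMIT, 1, HYPOTHETICAL_SLOTS)
dual_rank = max_usable_dimms(HYPOTHETICAL_RANK_LIMIT, 2, HYPOTHETICAL_SLOTS)

print(f"x4 single-rank DIMMs usable: {single_rank}")
print(f"x8 dual-rank DIMMs usable:   {dual_rank}")
```

With these example numbers, the dual-rank modules exhaust the rank budget with
half the slots still empty.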
FB-DIMMs (Fully Buffered DIMMs) completely address this
problem, by freeing the MCH from the hassle of dealing with individual DRAMs
and DIMM register chips. Most of the nitty-gritty on an FB-DIMM is addressed by
the AMB (Advanced Memory Buffer) that sits on the DIMM. In
"outsourcing" the DRAM management functions to the DIMM, the MCH is no longer
restricted to a maximum number of ranks system-wide, allowing the user to fill
his or her server with more numerous x8 DIMMs of smaller individual capacities.
One big difference that you will immediately notice on the new Dual-Core Xeon
mainboards is the larger number of DIMM slots available (sometimes as many as
16!). With rank limitations gone, a user can make up 16GB from 16 cheap 1GB
modules instead of 8 pricier 2GB modules, and so overall cost to
the user is reduced.
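
To illustrate the extra flexibility, here is a small sketch that lists the ways
to reach a target capacity with identical modules, given the number of DIMM
slots. The module sizes considered (1GB and 2GB) are the ones mentioned above;
the function itself is just an illustration.

```python
# Ways to reach a target capacity with identical DIMMs in a given slot count.
def memory_configs(target_gb: int, slots: int,
                   module_sizes_gb: tuple = (1, 2)) -> list:
    """Return (module_count, module_size_gb) pairs that hit target_gb exactly."""
    configs = []
    for size in module_sizes_gb:
        count, remainder = divmod(target_gb, size)
        if remainder == 0 and 0 < count <= slots:
            configs.append((count, size))
    return configs


print(memory_configs(16, slots=16))  # 16-slot board: both options available
print(memory_configs(16, slots=8))   # 8-slot board: only the 2GB modules fit
```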
The other great thing about FB-DIMMs is that they use standard DDR-2 DRAMs.
The only change is the replacement of the old register with the AMB. This
means that there is no significant price premium, since commodity DDR-2 DRAMs
are used.
Power for the People
Intel have finally addressed the power issue. For the last few years, AMD
have been exploiting their lead over Intel in performance per watt.
They even ran a billboard campaign recently, advertising how a certain Bay Area
city could power all their PDAs and cell phones on the power saved from the
world's servers running Opteron over Xeon.
But with the new dual-core Xeons, it seems as though Intel has at last turned
the tables. The new dual-core chips actually use less power than the
previous single-core chips. When you consider that the new chips
have two cores, and twice the performance, then you're looking at a power
reduction per core of over 50%.
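
The per-core arithmetic is easy to check. In the sketch below the wattages are
placeholder values I have picked purely for illustration, not Intel's published
figures; the point is that any dual-core chip drawing less than the single-core
chip it replaces gives a per-core reduction of more than 50%.

```python
# Per-core power reduction when moving from a single-core to a dual-core chip.
# The wattages used below are hypothetical placeholders, not measured figures.
def per_core_reduction(old_chip_watts: float, old_cores: int,
                       new_chip_watts: float, new_cores: int) -> float:
    """Fractional drop in watts per core (0.5 means a 50% reduction)."""
    old_per_core = old_chip_watts / old_cores
    new_per_core = new_chip_watts / new_cores
    return 1 - new_per_core / old_per_core


# Same chip power, twice the cores: exactly a 50% reduction per core.
print(per_core_reduction(100, 1, 100, 2))
# Slightly lower chip power, twice the cores: more than 50%.
print(per_core_reduction(100, 1, 95, 2))
```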
They've done a good job of addressing this issue, and rightly so, since for many
years data center owners have been complaining to Intel about their power
bills. Remember that a server is on 24/7, 365 days a year, so every watt saved
significantly reduces the server's TCO. And in a data center, every extra watt
burned means you need at least an extra watt again in air-conditioning.
So the savings on reduced power profiles multiply.
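
A quick sketch of how this adds up over a year of 24/7 operation. The cooling
overhead of one extra watt per watt saved follows the rule of thumb above; the
function name and default are my own, and no electricity price is assumed.

```python
# Energy saved per year for each watt shaved off a server that runs 24/7.
HOURS_PER_YEAR = 24 * 365


def annual_kwh_saved(watts_saved: float,
                     cooling_watts_per_watt: float = 1.0) -> float:
    """kWh saved per year, including the air-conditioning no longer needed."""
    total_watts = watts_saved * (1 + cooling_watts_per_watt)
    return total_watts * HOURS_PER_YEAR / 1000


print(annual_kwh_saved(1, cooling_watts_per_watt=0))  # server power only
print(annual_kwh_saved(1))                            # with cooling included
```

Multiply the result by your own electricity rate and by the number of servers
in the rack to get a feel for the effect on TCO.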
Our New Dual Core Systems - Double Performance, Same Price!
I'm almost forgetting the best part. The new Xeons have twice the computing
power of the old ones, but at the same price! If you compare
the price points of the old single-core and the new dual-core Xeons at any
given clock speed, you'll find they're about the same.
So we can now build dual socket servers with effectively four CPUs for the same
price as two! Your servers can carry 100% more users, or your workstations
compute twice as much data, for free!
So I hope you're as excited as I am about the new dual core Xeons. I think that
Intel have taken the lead once more in the server market.
An Intel server now means twice the processing power of before, with no extra
power overhead, no more memory bottlenecks, and a highly flexible DIMM system
that allows the user to configure memory almost limitlessly. And all this for
the same price as the previous generation.
Sounds like something for nothing? You bet!
Best regards,
Ben Ranson
Chief Systems Engineer