Saturday November 21, 2009 8:48 PM AEST

Overclocking and the latency game


Analysis: Not all Cores are the same, it appears...

Our recent story on possible superdesktops using the upcoming Nehalem-EX and Nehalem-EP processors also hinted at their expected overclocking abilities, but what kinds of speeds might we realistically be able to achieve with these monsters?

After all, the 32nm Gulftown Nehalem-EP dualie should start at 3.6GHz for its six cores per die, and will most probably run well above 4.5GHz in most cases when properly cooled on a good mainboard. As for the biggie eight-core Nehalem-EX, even its 2.66GHz top default clock should be coaxable to some 3.2GHz under the right conditions. So, these multiprocessing brethren of the otherwise similar Core i7 should share its overclocking margin too.

Sounds easy, right? However, what would we really be overclocking there? Not everything, it seems. Even on the current Core i7, as you know, the default clock you see applies only to the four cores and their L1 and L2 caches. The shared 8MB L3 cache, the memory controller and the QPI interface, collectively known as the "uncore", have their own clock, and an asynchronous one at that - read: more latency in between. This arrangement allows, among other things, better overclocking of the "core" portion, but latency and, to a certain extent, access bandwidth to the uncore are sacrificed a bit. And, before you start bias-ranting, AMD was equally guilty of this with its Barcelona and Shanghai processors, and its desktop Phenom CPUs too.

So, your glorious overclocking achievement may show 4.00GHz on screen, but the L3 cache and memory controller inside the CPU might only be working at 2.66GHz if it's using DDR3-1333 DRAM, since the uncore runs at double the memory data rate or more. Now, this may be necessary in the case of the humongous Nehalem-EX die, where the 24MB L3 cache, four memory channels and four QPI links obviously can't run at a very high frequency, but either way your bandwidth and latency benchmarks will be affected by both the "core" and the "uncore" clock rates.
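To put a number on that relationship, here is a minimal Python sketch. It assumes nothing beyond the rule of thumb stated above (uncore clock at twice the DRAM data rate, or more); the function name and the list of nominal DDR3 speed grades are illustrative, not taken from any Intel documentation.

```python
def min_uncore_mhz(dram_data_rate_mts: int) -> int:
    """Lowest uncore clock in MHz, ASSUMING the 2x rule described in the text."""
    return 2 * dram_data_rate_mts


# Nominal (rounded) DDR3 speed grades, chosen for illustration only.
for rate in (1066, 1333, 1600):
    print(f"DDR3-{rate}: uncore >= {min_uncore_mhz(rate)} MHz")
```

With the nominal DDR3-1333 label this gives 2666 MHz, which is the 2.66GHz figure quoted above.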

On the desktop Core i7 running at 3.33GHz, the Sandra 2009 latency test may show 4-cycle latency for the L1 cache, compared to 3 cycles for the same-sized Core 2 cache, while the small 256KB L2 cache will show 10 cycles and the big shared 8MB L3 cache anywhere from 37 to 46 cycles depending on the, yes, "uncore" clock - as you can see on this SiSoft Sandra 2009 shot. Now, the large 12MB L2 cache of the Core 2 - two times 6MB, on two dies of course - shows just 16 or 18 cycles of latency when staying within one dual-core die of the two-die MCM.
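Cycle counts are easier to compare across differently clocked chips once converted to time. The sketch below does that for the Core i7 figures above; it assumes, and this is only an assumption, that the cycle counts are expressed in core clock cycles at 3.33GHz. The helper name is made up for illustration.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds at the given clock."""
    return cycles / clock_ghz


CORE_GHZ = 3.33  # ASSUMPTION: cycles are counted against the core clock

# (best case, worst case) cycle counts as quoted in the text
core_i7_cycles = {"L1": (4, 4), "L2": (10, 10), "L3": (37, 46)}

for level, (best, worst) in core_i7_cycles.items():
    best_ns = cycles_to_ns(best, CORE_GHZ)
    worst_ns = cycles_to_ns(worst, CORE_GHZ)
    print(f"{level}: {best_ns:.2f} to {worst_ns:.2f} ns")
```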

As reported elsewhere on the web, due to process and design improvements the mainstream version of the upcoming Sandy Bridge 32nm CPU should have somewhat improved latencies for the same cache structure as the current Core i7. The 32KB L1 cache will be back to 3 cycles, the 256KB L2 cache down to 9 cycles, and the 8MB L3 cache at 25 cycles - not bad for a cache shared between four CPU cores at the same time! This is a Core i5 follow-on; the higher-end CPUs will have more cores, larger caches and possibly slightly higher latencies.
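For a rough sense of scale, the following sketch works out the relative cycle savings implied by those numbers. The Sandy Bridge values are the reported, unconfirmed figures from the paragraph above, and the L3 comparison uses the best-case 37 cycles of the current Core i7; the names are illustrative.

```python
def reduction_pct(old_cycles: int, new_cycles: int) -> float:
    """Percentage drop in latency, measured in cycles."""
    return 100.0 * (old_cycles - new_cycles) / old_cycles


core_i7 = {"L1": 4, "L2": 10, "L3": 37}       # L3: best case from the text
sandy_bridge = {"L1": 3, "L2": 9, "L3": 25}   # reported, not confirmed

for level, old in core_i7.items():
    new = sandy_bridge[level]
    print(f"{level}: {old} -> {new} cycles ({reduction_pct(old, new):.1f}% fewer)")
```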

In summary, there's more to it than the clock numbers alone. Even within the same product family, subsequent steppings may have different design compromises to achieve the desired goals, some of them not widely known. And, as the CPUs become more complex, not just with differently-clocked async parts but also in various generations of "turbo" auto-overclock settings, one clock frequency number won't be sufficient to describe the speed anyway. How about, say, Core i5 XXX, core 3333 MHz, uncore 2667 MHz, turbo 3600 MHz, for instance?
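As a sketch of what such a multi-clock label could look like in practice, the snippet below bundles the three figures into one record and prints them in the format suggested above. The class and field names are hypothetical, and the values are simply the example numbers from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockSpec:
    """Hypothetical record of the clocks that together describe one CPU."""
    model: str
    core_mhz: int
    uncore_mhz: int
    turbo_mhz: int

    def describe(self) -> str:
        return (f"{self.model}, core {self.core_mhz} MHz, "
                f"uncore {self.uncore_mhz} MHz, turbo {self.turbo_mhz} MHz")


spec = ClockSpec("Core i5 XXX", core_mhz=3333, uncore_mhz=2667, turbo_mhz=3600)
print(spec.describe())
```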

 

theinquirer.net (c) 2009 Incisive Media

 