
Inside Cache

By Ashton Mills
15:37 Jul 28, 2008
Tags: cache
L1
The Level 1 cache, or L1, is built into your CPU and so runs at the same speed as the CPU. This is expensive to do, and as a result L1 caches aren’t very large. However, they are the fastest cache in your system, and your CPU loves them. More than Vegemite loves toast, even.

In modern CPUs the L1 is frequently split into two dedicated caches: the instruction cache and the data cache.

The instruction cache, as the name suggests, caches instructions before they enter the fetch/decode/execute cycle. The data cache stores the data being worked on, including the results of calculations before they are written back to the L2 cache and on to main memory. Together they can reduce latency for calculations that require instructions already in the instruction cache, or data already in the data cache, saving the need to source these from the L2 cache. Note that this is a simplified view of cache operation: branch prediction allows the instruction prefetcher to load instructions into the L1 instruction cache in advance, while the L1 data cache can also be filled pre-emptively from the L2 cache with data that is expected to be needed. It’s all part of the magic, and complexity, of modern CPU design.
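
To make the hit/miss idea concrete, here is a minimal sketch of a toy cache in Python. It is not how any real L1 works: the 64-byte line size, the eight-line capacity, the least-recently-used eviction and the naive "fetch the next line too" prefetcher are all assumptions chosen for illustration, and names such as TinyCache are hypothetical.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (an assumed, illustrative value)


class TinyCache:
    """Toy cache holding a fixed number of lines with LRU eviction."""

    def __init__(self, num_lines, prefetch_next=False):
        self.num_lines = num_lines
        self.prefetch_next = prefetch_next  # naive next-line prefetcher
        self.lines = OrderedDict()          # line number -> None, oldest first
        self.hits = 0
        self.misses = 0

    def _fill(self, line):
        # Bring a line into the cache, evicting the least recently used one.
        if line in self.lines:
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)
        self.lines[line] = None

    def access(self, address):
        line = address // LINE_SIZE
        if line in self.lines:
            self.hits += 1
            self.lines.move_to_end(line)
        else:
            self.misses += 1
            self._fill(line)
        if self.prefetch_next:
            self._fill(line + 1)  # guess that the next line is needed soon


def run_sequential(cache, n_bytes=4096, stride=4):
    """Walk through memory in order, as a simple loop over an array would."""
    for address in range(0, n_bytes, stride):
        cache.access(address)
    return cache.hits, cache.misses


print(run_sequential(TinyCache(8)))                      # (960, 64)
print(run_sequential(TinyCache(8, prefetch_next=True)))  # (1023, 1)
```

Without prefetching, the first touch of each 64-byte line misses and the next 15 accesses hit. With the toy prefetcher, only the very first access misses, which is the effect the pre-emptive filling described above is aiming for.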

Finally, the L1 cache is always dedicated solely to a particular processor, or to a particular core on a multi-core processor. It’s also tailored to the architecture, and size isn’t necessarily an indicator of performance. Commonly, modern CPUs like Intel’s Core2 use 32KB each for the L1 instruction and L1 data caches, for a total of 64KB of L1 cache. AMD’s Athlon64 and Phenom lines use 64KB each for the L1 instruction and L1 data caches, for a total of 128KB.


L2 and L3
If instructions and data can’t be found in the L1 cache, the next step is the Level 2 cache. A loose rule for caches in general is that the larger they are, the higher the hit rate they provide, but at the cost of higher latency. This description is apt for both L2 and L3 caches: both are substantially larger than the L1 cache, but incur a higher latency.
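
The trade-off between hit rate and latency can be put into numbers with the usual average access time calculation. The sketch below is only an illustration: every hit rate and cycle count in it is made up for the example, not measured on any real CPU, and the function name is hypothetical.

```python
def average_access_time(levels, memory_latency):
    """Expected cycles per access for a cache hierarchy.

    levels: list of (hit_rate, latency_cycles) tuples ordered L1, L2, ...
    memory_latency: cycles to reach main memory when every level misses.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for hit_rate, latency in levels:
        total += reach * latency    # everything reaching this level pays its latency
        reach *= 1.0 - hit_rate     # only the misses carry on to the next level
    return total + reach * memory_latency


# Illustrative, assumed figures only.
L1 = (0.90, 3)
small_fast_l2 = (0.80, 10)
large_slow_l2 = (0.95, 15)
MEMORY = 200

print(f"{average_access_time([L1, small_fast_l2], MEMORY):.1f}")  # 8.0
print(f"{average_access_time([L1, large_slow_l2], MEMORY):.1f}")  # 5.5
```

With these assumed numbers the larger, slower L2 still wins, because each trip to main memory is so expensive that the extra hits more than pay for the extra lookup latency. Different figures can tip the balance the other way, which is why bigger isn't automatically better.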

The L2 is also where competing processors from Intel and AMD start to differ more. By way of example, processors like Intel’s Core2 come with whopping great big L2 caches (4MB for the 65nm brethren, 6MB for the 45nm) per dual-core die, and the cache is shared between the two cores. So although a 45nm quad-core Core2 comes with 12MB of L2 cache, you have to remember a Core2 quad is literally two dual-cores slapped together, and the 12MB actually comprises two 6MB L2 caches that can’t be shared between the two dual-cores. By comparison, AMD’s current Athlon64 and Phenom lines use dedicated L2 caches specific to each core, with 1MB of cache per core on the FX-62 and 512KB per core on the Phenom.



Additionally, the Phenom incorporates a 2MB shared Level 3 cache. This is really just a larger, shared version of the L2, but it allows the Phenom to benefit from super-fast L1 and L2 caches dedicated to the processing tasks of each core while also gaining a fast cache shared between cores in the L3. It is, ultimately, a good design, and it’s not surprising that Intel appears to be learning from AMD here: Nehalem looks set to provide smaller dedicated L2 caches per core and add a large shared L3 cache when it arrives.

Shared caches have a number of benefits: they allow one core to use more than its share of cache if the other cores are using less, and they facilitate sharing of data between cores, avoiding the need for, and penalty of, a trip to main memory.
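
The first of those benefits can be shown with a toy model. The sketch below compares two cores that each get a dedicated four-line cache against the same two cores sharing one eight-line cache. The sizes, the least-recently-used eviction policy and the looping access pattern are all assumptions picked to make the effect obvious, and it does not model data sharing between cores at all.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache that evicts the least recently used line when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, key):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if key in self.lines:
            self.lines.move_to_end(key)
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[key] = None
        return False


def count_misses(cache_for_core, working_sets, rounds):
    """Each core loops over its own working set, one access per round."""
    misses = {core: 0 for core in working_sets}
    for step in range(rounds):
        for core, size in working_sets.items():
            key = (core, step % size)  # tag lines by core so they never collide
            if not cache_for_core[core].access(key):
                misses[core] += 1
    return misses


# Core A loops over six lines, core B over just two (assumed workloads).
working_sets = {"A": 6, "B": 2}

dedicated = {"A": LRUCache(4), "B": LRUCache(4)}
shared_cache = LRUCache(8)
shared = {"A": shared_cache, "B": shared_cache}

print(count_misses(dedicated, working_sets, 600))  # {'A': 600, 'B': 2}
print(count_misses(shared, working_sets, 600))     # {'A': 6, 'B': 2}
```

Total capacity is the same in both cases. With dedicated caches, core A's six lines never fit in its four-line cache and every access misses, while half of core B's cache sits idle. Sharing lets core A borrow the space core B isn't using, so after the initial fills everything hits.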

 
 
This article appeared in the July, 2008 issue of Atomic.
