|August 5th, 2007, 11:59 PM||#1|
TTR's Guide to Determining Optimum Overclocked System Performance Points
thetechrepository's Guide to Determining Optimum Overclocked System Performance Points -
Properly Balancing Processor Frequency, Front Side Bus (FSB) Transfer Rates, and Memory Speeds & Timings
Date: 12 August, 2007
Author: Kristopher Boughton
Motherboard designers, memory manufacturers, and proprietors of performance hardware have been working hard to make modern system overclocking easy. Intel - central to all of this and largely responsible for what we here at The Tech Repository would call the "Golden Era of Overclocking" - is moving faster than ever in an effort to openly and publicly embrace the online enthusiast community. Even though this rush of progress is truly satisfying, the flash of recent success has left a rather large void of comprehension. As with so many things in life, there must be a distinction between mere accomplishment and fundamental understanding. If your overclocking goals include nothing more than the desire for the absolute highest CPU clock speed possible, then this article isn't for you. But if you want to know more about what really makes a great performance system, then we invite you to read on. The satisfaction of true accomplishment awaits.
When we last talked about memory we ended our investigation on a rather incomplete note. As you may recall, we lightly explored memory frequency and timings - focusing mainly on one of the more domineering specifications, CAS - and noted how several settings must co-exist in establishing what can prove to be a rather significant factor in overall system memory access latency. If you're not familiar with the concept of converting a relative memory timing rating to an actual time delay then you should review our earlier post (linked above) before continuing. Much of this article will build upon this seemingly innocuous abstraction.
The Effect of Memory Frequency on CAS Latency
Figure 1 provides a graphical representation of how CAS latency, expressed in nanoseconds (ns), changes as memory frequency is varied. Of course, the location of the jump discontinuities (vertical step increases in CAS latency) is entirely dependent on the minimum CAS latency of interest. Calculating CAS latency is in fact simple - the formula is shown below - but be sure to use DDR memory speed (MHz) and not base memory speed. Study has shown that, regardless of memory speed and CAS latency specification, Value Series memory is usually binned for 10.5 - 12.0 ns of CAS latency, Mainstream memory is often about 8.0 - 10.5 ns, and Performance Series memory can be found as low as 7.0 - 8.0 ns. Knowing this, it can be surprising to realize that what may appear to be an insignificant decrease of as little as 2.5 ns of Column Address Strobe (CAS) access latency can end up costing the consumer as much as an 80% premium per GB!
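Mathematically, calculating CAS latency from CAS selection and memory frequency (remember, use the DDR specification frequency, not the base memory speed) is easy. This simple expression, Equation 1, is all that's needed to form graphs like those of Figure 1 and Figure 2.
Equation 1: CAS latency (ns) = CAS selection x 2000 / DDR memory speed (MHz)
For example, DDR-857 at CAS 3 works out to 3 x 2000 / 857, or about 7.0 ns.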
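The equation image did not survive in this copy of the post, so the following is a minimal sketch that assumes the standard relation: one memory clock lasts 2000 / (DDR rating in MHz) nanoseconds, and CAS latency is the CAS setting times that period. The function name `cas_latency_ns` is illustrative only. The relation reproduces the figures quoted in this article (DDR-857 at CAS 3 is about 7.0 ns).

```python
def cas_latency_ns(ddr_mhz: float, cas: int) -> float:
    """True CAS latency in ns for a DDR rating (e.g. 1066 for DDR-1066)."""
    # The DDR rating is twice the memory clock, so one clock period is
    # 2000 / ddr_mhz nanoseconds; CAS counts whole clock periods.
    return cas * 2000.0 / ddr_mhz


if __name__ == "__main__":
    # Operating points mentioned in the article.
    for ddr, cas in [(857, 3), (1066, 4), (1333, 5), (1600, 6)]:
        print(f"DDR-{ddr} CAS {cas}: {cas_latency_ns(ddr, cas):.2f} ns")
```

Feeding in the base memory clock instead of the DDR rating would double every result, which is the mistake the text warns against.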
In Figure 1 shown below, the minimum CAS latency of interest is established at 7.5 ns (with a maximum of 8.5 ns) - meaning that whenever rising memory frequency pushes CAS latency down to this minimum, the CAS setting is increased by one (instantaneously and dramatically increasing the true CAS latency). We then investigate values from 7.5 to 8.5 ns - this frames our target memory overclock window (shown with green arrows). The red arrows draw our attention to windows of relatively lower memory performance (with respect to access latency), or areas of memory operation where, given the memory frequency and CAS data pair, we would be unduly limiting the memory's performance potential. The tighter we make our "CAS Access Latency Window of Interest" (CALWI), the smaller each "Target Operation Band" becomes (and the larger each "Reduced Memory Performance Band" becomes).
What Does All of This Mean?
1). Points located along a particular horizontal latency line have equal CAS latency regardless of frequency and CAS setting. For example, in this particular model, DDR-1066 at CAS 4 has the same exact CAS latency as DDR-1333 at CAS 5 and DDR-1600 at CAS 6, etc.
2). If increasing memory frequency forces CAS latency too low (which can result in an overclock failure or which may require an unacceptably high memory operating voltage), the next higher CAS setting must be selected. This can result in a higher than necessary true CAS access latency. For example, looking at Figure 1 above we can see CAS latency at DDR-1150 at CAS 5 is inferior to that of DDR-1000 at CAS 4.
3). Selecting a larger "CAS Access Latency Window of Interest" (CALWI) now will give the user more flexibility later when setting CPU frequency (multiplier x FSB), at the possible expense of memory performance. This is just one part of the balance that must be struck - more on this later.
4). All operating points lying along a particular CAS latency line should require nearly the same voltage. Remember, though, that eventually all memory is frequency limited.
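The stepping behaviour described in the points above can be sketched as follows. This is an illustration under stated assumptions, not the authors' tool: it assumes CAS latency (ns) = CAS x 2000 / DDR rating, and the names `lowest_usable_cas` and `classify` are hypothetical. For a given DDR rating it picks the smallest CAS setting that keeps latency at or above the window floor, then reports whether the result falls in a target band or a reduced-performance band.

```python
import math


def cas_latency_ns(ddr_mhz, cas):
    # Assumed relation: CAS setting times the clock period (2000 / DDR rating).
    return cas * 2000.0 / ddr_mhz


def lowest_usable_cas(ddr_mhz, calwi_min_ns):
    """Smallest CAS setting whose latency does not drop below the window floor."""
    # cas * 2000 / ddr >= floor  <=>  cas >= floor * ddr / 2000
    # The tiny epsilon guards against floating-point noise at exact boundaries.
    return math.ceil(calwi_min_ns * ddr_mhz / 2000.0 - 1e-9)


def classify(ddr_mhz, calwi_min_ns, calwi_max_ns):
    """Return (cas, latency_ns, band) for one DDR rating and one CALWI."""
    cas = lowest_usable_cas(ddr_mhz, calwi_min_ns)
    latency = cas_latency_ns(ddr_mhz, cas)
    band = "target" if latency <= calwi_max_ns else "reduced"
    return cas, latency, band


if __name__ == "__main__":
    for ddr in (1000, 1066, 1150, 1333):
        cas, latency, band = classify(ddr, 7.5, 8.5)
        print(f"DDR-{ddr}: CAS {cas}, {latency:.2f} ns, {band} band")
```

With a 7.5 - 8.5 ns window this puts DDR-1000 at CAS 4 (8.0 ns) inside the target band and DDR-1150 at CAS 5 (about 8.7 ns) in a reduced-performance band, matching point 2 above.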
Choosing an Acceptable CAS Access Latency Window of Interest (CALWI)
Everyone's memory is different; even the two sticks included in a dual channel set are different. That's because the memory cell manufacturing process allows individually produced memory integrated circuits (ICs) to vary somewhat as long as they meet the binning standard. For example, if 15 of the 16 memory ICs will perform DDR-800 at CAS 3 with only 1.8v and the 16th IC needs 2.0v, then the dual inline memory module (DIMM) will undoubtedly require 2.0v for stable, error-free operation. The best memory distributors will bin memory on the IC level, which results in premium kits, but this discussion is better left for another article.
We suggest you start by deciding on the maximum applied memory voltage (vDDR) that you are comfortable with for consistent 24/7 operation. Again, this is a personal choice - based on experience and available cooling, some users may be willing to apply more voltage than others. Pick a value that makes sense for you. If you later decide you are willing to go higher, you can always load up the tool we have provided and re-evaluate.
For example, let's say your memory is warrantied for 2.25 +/- 0.05v and you refuse to go any higher (a wise decision). Next, through trial and error, find the lowest CAS latency possible with your maximum memory voltage (for the purposes of this discussion let's assume DDR-857, CAS 3 at said 2.3v). Using the formula provided above we quickly determine that this correlates to a CAS latency of 7.0 ns, which becomes the lower limit of your CALWI. The upper limit is easy - a smaller CALWI will keep your final system operating point closer to the memory maximum, while a larger CALWI will allow you the freedom to sacrifice memory performance for possible CPU speed increases later. We recommend a CALWI no wider than 1.0 ns; for those serious about memory performance, 0.5 ns.
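As a rough sketch of the procedure just described, the window can be derived from the best operating point found by trial and error. The helper name `calwi_from_best_point` and the rounding to 0.1 ns are assumptions made for illustration, and the latency relation (CAS x 2000 / DDR rating) is the same assumed form used throughout.

```python
def calwi_from_best_point(ddr_mhz, cas, width_ns=0.5):
    """Return (floor_ns, ceiling_ns) of a CALWI from the best stable point.

    ddr_mhz and cas describe the lowest-latency point that was stable at
    the user's chosen maximum memory voltage; width_ns is the window width
    (the article suggests 0.5 to 1.0 ns).
    """
    # Assumption: round the measured latency to 0.1 ns for a readable floor.
    floor_ns = round(cas * 2000.0 / ddr_mhz, 1)
    return floor_ns, round(floor_ns + width_ns, 1)


if __name__ == "__main__":
    print(calwi_from_best_point(857, 3))        # tight window
    print(calwi_from_best_point(857, 3, 1.0))   # widest recommended window
```

For the article's example point (DDR-857, CAS 3) the default width gives a 7.0 - 7.5 ns window.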
In Figure 2 the CALWI has been both shrunk and lowered to 7.0 - 7.5 ns. This in a sense creates new ground for improved memory performance as some DDR frequencies that were previously "achievable" at CAS+1 are now possible at CAS. For example, DDR-1100 at CAS 5 is now possible at CAS 4 (as long as the target latency window is based in reality). Later on in this article we'll introduce you to an easy-to-use tool that does this work for you.
A Better Method for Rating True Memory Performance
Memory 'X' is rated as PC-8500 (DDR-1066), CAS 5 at 2.3v. Memory 'Y' is rated as PC-6400 (DDR-800), CAS 4 at 2.1v. Which one is better? There are two ways to directly compare this memory on a performance basis: Method A - based on rated speeds, timings and voltages, or Method B - based on empirically collected data. While the first method can give a good initial reference point for comparison, a more detailed analysis must be made for accurate ranking.
For this purpose we've developed the Universal Memory Performance Rating (UMPR). Shown below, the formula is simple and straightforward. More importantly, UMPR depends only on CAS latency, given in nanoseconds, and the memory voltage (vDDR) required for operation at said CAS latency. Because of this, we find that this new method for memory rating removes the frustration of trying to compare memory based on dissimilar PC-ratings, CAS settings and voltages.
Combining this formula with that for CAS latency results in the final form (Equation 2), with memory frequency, CAS selection and memory voltage as variables for calculation. This may be more convenient as there is no need to first calculate CAS latency. The coefficient of 10,000 was initially selected in order to ensure UMPR was a whole number greater than one (1), but not unduly large.
Let's now revisit our initial question: Which memory is better? Method A, which relies only on manufacturer specifications, would yield the following results: Memory 'X' - about 463 / Memory 'Y' - about 476. Although similar, there is a clear winner (476 > 463) and the results may be surprising. We must strongly caution against relying on Method A, though, as memory specifications are rarely based on maximum performance potential.
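The UMPR equation images are missing from this copy of the post. The sketch below therefore assumes the form UMPR = 10,000 / (CAS latency in ns x vDDR), inferred from the description (latency, voltage and a coefficient of 10,000) and from the fact that it reproduces the two Method A results quoted above. Treat it as a reconstruction rather than the authors' exact expression; the function name `umpr` is illustrative.

```python
def umpr(ddr_mhz, cas, vddr):
    """Universal Memory Performance Rating under the assumed formula.

    Assumption: UMPR = 10000 / (latency_ns * vddr), with
    latency_ns = cas * 2000 / ddr_mhz. Higher is better.
    """
    latency_ns = cas * 2000.0 / ddr_mhz
    return 10000.0 / (latency_ns * vddr)


if __name__ == "__main__":
    # Method A: rated specifications only.
    print(f"Memory X (DDR-1066, CAS 5, 2.3v): {umpr(1066, 5, 2.3):.0f}")
    print(f"Memory Y (DDR-800,  CAS 4, 2.1v): {umpr(800, 4, 2.1):.0f}")
```

For Method B, the same function would be called with the lowest stable latency point and the voltage actually measured for the memory in question instead of the rated values.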
Method B is exactly the same in theory; the only difference is that a true CAS latency/voltage limit is determined for the memory being examined. You should find that using real data always yields more accurate results.
Memory vs. Processor Performance - Balancing the Common Factors
The first page of thetechrepository Overclocking Guide provides the user with a pair of tools intended to help introduce a few concepts used throughout the remainder of the guide. The first of the tools is used when maximizing processor frequency is the main goal. The second of the tools is used when minimizing CAS latency (and as we will learn later, overall system latency) is the user's priority.
Pay attention to the direction of the arrows in the screenshots below - these indicate which fields influence other fields directly. Red lines are primary goal relationships while blue lines are secondary goal relationships. Dark gray lines are informational only. Additionally, red dashed outlines indicate user-changeable fields; all others are read-only.
Maximum Processor Frequency Goal Seek Tool
When maximizing Processor Frequency the primary considerations are simple: Front Side Bus (FSB) and CPU Multiplier. The red lines illustrate this relationship: FSB x CPU Multiplier = CPU Frequency. In this model DDR Specification depends solely on Front Side Bus (FSB) and Memory Ratio (Divider). Picking a CAS Selection then results in a final computed CAS Memory Access Latency. (The formula for determining CAS latency was shown previously.) Using this tool it should become clear that CAS memory access latency is artificially bound when dialing in a target CPU speed. At this point choosing the best memory ratio (divider) and CAS selection is all that's left to do - you'll find that there may be a best memory divider/CAS selection combination that puts you as low in your CALWI as possible without going under it.
Note: If you do find that CAS memory access latency absolutely must be forced lower than you were prepared to go, you have only one choice left - increase memory voltage above your previously established comfortable maximum. Beware though, as higher than rated memory voltage not only voids the manufacturer's warranty but can also result in decreased memory life and premature component failure.
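The goal seek described in this section can be approximated in a few lines. This is a sketch under assumptions, not a copy of the spreadsheet tool: it assumes DDR rating = 2 x FSB x memory ratio (with the ratio written as memory:FSB, e.g. 5:4), assumes CAS latency = CAS x 2000 / DDR rating, and ignores any upper frequency limit of the memory. The function name and the sample FSB, multiplier and ratio list are hypothetical.

```python
import math
from fractions import Fraction


def max_cpu_goal_seek(fsb_mhz, multiplier, ratios, calwi_min_ns):
    """Fix FSB and multiplier, then rank memory ratio/CAS pairs by latency."""
    cpu_mhz = fsb_mhz * multiplier  # FSB x CPU Multiplier = CPU Frequency
    options = []
    for ratio in ratios:
        # Assumption: DDR rating = 2 x FSB x (memory:FSB ratio).
        ddr = 2 * fsb_mhz * float(ratio)
        # Smallest CAS that keeps latency at or above the CALWI floor.
        cas = math.ceil(calwi_min_ns * ddr / 2000.0 - 1e-9)
        latency = cas * 2000.0 / ddr
        options.append((latency, str(ratio), ddr, cas))
    # Lowest latency first, i.e. as low in the CALWI as possible.
    return cpu_mhz, sorted(options)


if __name__ == "__main__":
    ratios = [Fraction(1, 1), Fraction(5, 4), Fraction(4, 3), Fraction(3, 2)]
    cpu, options = max_cpu_goal_seek(333, 9, ratios, calwi_min_ns=7.0)
    print(f"CPU frequency: {cpu} MHz")
    for latency, ratio, ddr, cas in options:
        print(f"ratio {ratio}: DDR-{ddr:.0f} at CAS {cas} -> {latency:.2f} ns")
```

Because the CPU target pins the FSB, only a handful of discrete DDR ratings are reachable, which is why latency is "artificially bound" in this mode.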
Minimum CAS Latency Goal Seek Tool
There are multiple solutions to a minimum CAS Memory Access Latency in the selected CALWI - one for each CAS Selection. A range of DDR Specification memory frequencies for each CAS selection will in fact bound the CALWI. Optimize for CAS latency by selecting a DDR specification/CAS selection pair that puts CAS memory access latency as low in your CALWI as possible. Changing the Memory Ratio (Divider) now modifies Front Side Bus (FSB) while keeping DDR specification constant. Processor Frequency is then indirectly altered making step changes in CPU speed possible only by moving the CPU Multiplier up or down. This tool demonstrates how tuning memory frequency for maximum performance can result in less than optimal CPU performance.
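The reverse goal seek can be sketched the same way. Here the DDR rating and CAS setting are fixed first, each memory ratio then implies an FSB, and the CPU multiplier is the only remaining knob. As before, the relation DDR rating = 2 x FSB x memory ratio is an assumption, and `max_cpu_mhz` (the highest CPU clock the user considers stable), the multiplier range and the function name are hypothetical inputs introduced for illustration.

```python
from fractions import Fraction


def min_latency_goal_seek(ddr_mhz, cas, ratios, multipliers, max_cpu_mhz):
    """Fix the memory operating point, then list the reachable CPU speeds."""
    latency = cas * 2000.0 / ddr_mhz  # assumed CAS latency relation
    rows = []
    for ratio in ratios:
        # Assumption: DDR rating = 2 x FSB x ratio, so FSB follows from DDR.
        fsb = ddr_mhz / (2 * float(ratio))
        usable = [m for m in multipliers if fsb * m <= max_cpu_mhz]
        if usable:
            best = max(usable)
            rows.append((fsb * best, fsb, best, str(ratio)))
    # Highest achievable CPU frequency first.
    return latency, sorted(rows, reverse=True)


if __name__ == "__main__":
    ratios = [Fraction(1, 1), Fraction(5, 4), Fraction(4, 3)]
    latency, rows = min_latency_goal_seek(1066, 4, ratios, range(6, 10), 3600)
    print(f"CAS latency held at {latency:.2f} ns")
    for cpu, fsb, mult, ratio in rows:
        print(f"ratio {ratio}: FSB {fsb:.1f} x {mult} = {cpu:.0f} MHz")
```

The gaps between the listed CPU speeds are the step changes the text refers to: with the memory point locked, the CPU clock can only move in whole multiplier steps for each ratio.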
thetechrepository Overclocking Guide for Dual- and Quad-Core CPUs
As with the first tool, user-changeable boxes are indicated using red dashed cell borders; all other cells are read-only. Bold fields are static - they never change. When you first start the tool in MS Excel you'll find default values are set for a CALWI of 7.0 - 7.5 ns.
General Instructions for Tool Usage (Figure 5):
Operating thetechrepository Overclocking Guide - A First Look
Shown in Figure 6:
Shown in Figure 7:
Questions? Comments? Want to share what you know? Talk to us here.