Wednesday, October 26, 2005
The Right Desktop Processor: CPU Price/Performance
Trying to determine the sweet spot where price and performance cross over for CPUs is a more daunting task. For one thing, the applications mix is far more complicated. People use office applications, create content for web sites and games, encode video for building DVDs, and play games. Each application mix may have different issues when it comes to determining the optimum price/performance ratio.
Take PC games, for example. The cheapest CPU available may have the best frame rate per dollar ratio. But you still need an adequate frame rate for an optimum gaming experience, and the cheapest CPU may not deliver that. On the other hand, office applications are generally not as sensitive to raw performance, and the lower cost processor may be better. It's all in what you do.
In our testing, we looked at a mix of applications, including video encoding, 3D modeling and rendering, PC gaming, office applications and web content creation. We used a mix of synthetic, applications-based and pure applications benchmarks to generate our results. The end goal is to help you determine which CPU might be the right one for your needs.
Picking Intel processors turned out to be a thornier issue. Intel's mainstream CPUs are divided into three different product mixes today: the 500 series, the 600 series, and the 800 series dual-core CPUs. The differences between the 500 series and 600 series are somewhat subtle. The 500s have 1MB of L2 cache, while the 600s have 2MB of L2 cache. It used to be that the 600 series were 64-bit capable and the 500s weren't, but even that's somewhat muddied now. Any 500 series CPU ending in "1" (the 561, for example) also has Intel's EM64T extensions.
In the end we decided to drop the 500 series entirely. There were several reasons:
- Performance differences between 500s and 600s were minimal at the same clock rate, despite the differences in cache size.
- There was substantial price overlap. In some cases, a 500 series processor running at the same clock rate as the equivalent 600 series processor actually cost more.
- The naming conventions for the 500 series are incredibly confusing. While a 570, 570J and 571 are essentially identical in 32-bit performance, the 570J has hardware NX (no-execute) bit support. The 571 adds EM64T support. None of the 500 series CPUs offers Intel's enhanced SpeedStep, which offers improved power management and reduced thermal output.
We settled on using the most current generation of Intel desktop CPUs, the 600 series Pentium 4s and the 800 series dual-core CPUs. Also, the Pentium M line is still mainly focused on mobile applications, and can't yet be considered a mainstream desktop CPU.
AMD CPU nomenclature can also be confusing. For example, the Athlon 64 3800+ runs at 2.4GHz and has 512KB of L2 cache, while the Socket 939 version of the 3700+ runs at 2.2GHz, but has 1MB of L2 cache. There's also still a Socket 754 3700+, which runs at 2.4GHz and has 1MB of L2 cache. So it appears that AMD and Intel are racing to see who can offer the most confusing product line.
On the AMD side, though, the choices were a bit easier: We focused only on mainstream, Socket 939 CPUs, leaving out the Socket 754 choices entirely, since Socket 754 is now relegated to either AMD's budget line of CPUs or its mobile Turion processors. Let's take a look at the complete lineup. Note also that we only tested the latest Rev E 90nm cores. It's unlikely that using an earlier variant will result in much performance difference, though the 130nm versions will run hotter.
Intel CPUs
Processor | Clock rate | L2 cache size | Notes | Price |
Pentium 4 630 | 3.0GHz | 2MB | 800MHz FSB | $167 |
Pentium 4 640 | 3.2GHz | 2MB | 800MHz FSB | $214 |
Pentium 4 650 | 3.4GHz | 2MB | 800MHz FSB | $273 |
Pentium 4 660 | 3.6GHz | 2MB | 800MHz FSB | $395 |
Pentium 4 670 | 3.8GHz | 2MB | 800MHz FSB | $627 |
Pentium 4 Extreme Edition | 3.73GHz | 2MB | 1066MHz FSB | $1,025 |
Pentium D 820 | 2.8GHz | 2 x 1MB | 800MHz FSB; no support for Enhanced SpeedStep or Hyper-Threading. | $247 |
Pentium D 830 | 3.0GHz | 2 x 1MB | 800MHz FSB; no support for Hyper-Threading. | $326 |
Pentium D 840 | 3.2GHz | 2 x 1MB | 800MHz FSB; no support for Hyper-Threading. | $506 |
Pentium Extreme Edition | 3.2GHz | 2 x 1MB | 800MHz FSB; Hyper-Threading enabled; multiplier unlocked. | $995 |
Note that front-side bus clocks are effective clocks; the actual clock rates are 1/4 the effective rate (e.g., 800MHz FSBs are actually clocked at 200MHz).
AMD CPUs
Processor | Clock rate | L2 cache | Notes | Price |
Athlon 64 3000+ | 1.8GHz | 512KB | | $146 |
Athlon 64 3200+ | 2.0GHz | 512KB | | $189 |
Athlon 64 3500+ | 2.2GHz | 512KB | | $215 |
Athlon 64 3700+ | 2.2GHz | 1MB | | $260 |
Athlon 64 3800+ | 2.4GHz | 512KB | | $248 |
Athlon 64 4000+ | 2.4GHz | 1MB | | $368 |
Athlon 64 FX-55 | 2.6GHz | 1MB | Multiplier unlocked | $805 |
Athlon 64 FX-57 | 2.8GHz | 1MB | Multiplier unlocked | $990 |
Athlon 64 X2 3800+ | 2.0GHz | 2 x 512KB | | $345 |
Athlon 64 X2 4200+ | 2.2GHz | 2 x 512KB | | $470 |
Athlon 64 X2 4400+ | 2.2GHz | 2 x 1MB | | $520 |
Athlon 64 X2 4600+ | 2.4GHz | 2 x 512KB | | $688 |
Athlon 64 X2 4800+ | 2.4GHz | 2 x 1MB | | $880 |
CPU pricing used in our study was determined by using the lowest price from Pricewatch. Other pricing engines exist, but Pricewatch offers a convenient way to see all the current CPU prices on one page. Note that these are likely to be OEM CPUs, but for our purposes, it's a level playing field. Now that we've seen the list of contenders, let's discuss testing methodology.
Component | AMD system | Intel system |
Motherboard | ASUS A8N-SLI Deluxe | ASUS P5WD2 Premium |
Chipset | Nvidia Nforce4 SLI (rel 6.66 chipset drivers) | Intel 955X (latest Intel drivers) |
Memory | 2 x 512MB Corsair XMS 3200XL (CAS 2-2-2-5) | 2 x 512MB Corsair XMS2 Pro (CAS 3-3-3-8) |
Graphics card | Nvidia GeForce 6800GT Reference (78.01 drivers) | Nvidia GeForce 6800GT Reference (78.01 drivers) |
Hard drive | Seagate 7200.8 160GB | Seagate 7200.8 160GB |
Optical storage | Plextor 16x DVD +/-RW | Plextor 16x DVD +/-RW |
Sound card | Sound Blaster Audigy 2 | Sound Blaster Audigy 2 |
Display | Dell 2001FP (LCD flat panel, 1600x1200) | Dell 2001FP (LCD flat panel, 1600x1200) |
Operating system | Windows XP Professional SP2, all updates installed | Windows XP Professional SP2, all updates installed |
Each system had a clean install of Windows XP Professional with Service Pack 2 and all updates installed. The hard drives were defragged prior to each benchmark session. Before we ran each benchmark, we executed the command rundll32.exe advapi32.dll,ProcessIdleTasks. This immediately executes all background idle tasks to completion, including tasks such as the Windows prefetcher.
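If you want to script that pre-benchmark step, a minimal sketch might look like the following. The command string is the one given above; the wrapper function, its name and the platform check are our own illustrative assumptions, not part of the original test harness.

```python
import subprocess
import sys

# The idle-task command quoted in the article, split into argv form.
IDLE_TASKS_CMD = ["rundll32.exe", "advapi32.dll,ProcessIdleTasks"]


def flush_idle_tasks() -> bool:
    """Run the idle-task command on Windows (hypothetical helper).

    Returns True if the command was launched and exited cleanly,
    False if we are not on Windows and therefore skipped it.
    """
    if sys.platform != "win32":
        return False
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(IDLE_TASKS_CMD, check=True)
    return True


if __name__ == "__main__":
    ran = flush_idle_tasks()
    print("idle tasks flushed" if ran else "skipped: not a Windows host")
```

Calling this once before each benchmark run would mirror the manual step; on any other operating system it simply reports that it skipped the command.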
The Benchmarks
We ran a mix of purely synthetic, applications-based synthetic, and actual applications tests in order to gauge performance. We used the following benchmarks and applications in our testing:
- BAPCo's SYSmark 2004SE. Based on real applications, this benchmark gives us a taste of how real applications might run on the system. Note that rather than reporting an overall score, we break it down into the individual Internet Content Creation and Office Productivity scores.
- POV-RAY 3.70 beta 0.9 (http://www.povray.org). The POV-RAY freeware software rendering application has been around since the days of MS-DOS. The current 3.70 version for Windows is still a work in progress, but it does support multithreading.
- 3ds Max 7 (http://usa.autodesk.com/adsk/servlet/index?id=5659302&siteID=123112). 3ds Max 7 is Autodesk's popular 3D modeling and animation tool. Note that we used both the SPECapc 3ds Max test (http://www.spec.org/gpc/apc.static/max7info.html) and a couple of pure rendering tests. The SPECapc test for 3ds Max 7 is much enhanced over the older test supporting version 6, and takes longer.
- Newtek LightWave 8 (http://www.newtek.com). This is a popular 3D modeling and animation tool used primarily for special effects in a number of television shows, plus a variety of other applications. We used the current "Radiosity-Box" rendering benchmark.
- Adobe After Effects 6.0, which ran a variety of filters using a fixed script on identical content to generate a final animation.
- Windows Media Encoder. We also installed Windows Media Player 10 so that we could encode a fixed 330MB video clip using Windows Media Encoder Advanced Profile. The encoder was configured to encode the video as streaming media.
- Cyberlink's PowerEncoder 1.0, which we used to encode the same 330MB clip into H.264, the format likely to be most common in future DVD applications.
- 3DMark 2005 and PCMark05. While these are synthetic benchmarks, they can yield insights into individual subsystems within the PC.
- Five 32-bit games, including Far Cry, Flight Simulator 2004, Painkiller, Doom 3 and Splinter Cell: Chaos Theory.
One note about our use of SYSmark 2004SE. We are not reporting the overall SYSmark score, nor are we using the SYSmark 2004SE Internet Content Creation score in our results. That's because the overall score and the ICC score tend to be heavily weighted by the 3ds Max 5.1 rendering performance. We do use the Office Productivity Score, which is determined by a balanced mix of scripts running on standard office applications.
If you want to see the actual performance results, we'll place all those charts in an appendix, which follows the conclusion section of this article.
Office Applications
Here, we simply used the result from the SYSmark 2004SE Office Productivity score. This test executes a set of scripts using commonly available office applications, including Microsoft Office (Word, Excel, PowerPoint, Access), Adobe Photoshop, McAfee VirusScan and others.
The final score was simply the SYSmark 2004SE Office Productivity Score divided by the CPU price. Larger numbers are better; we believe that a simple "bigger is better" reading gets the nod here.
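The score-per-dollar calculation can be sketched in a few lines of Python. The CPU names, scores and prices below are hypothetical placeholders chosen only to show the arithmetic; they are not our measured results, and the function name is our own.

```python
def score_per_dollar(score: float, price_usd: float) -> float:
    """Return benchmark points per dollar; larger is better."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return score / price_usd


# Hypothetical (score, price in USD) pairs, NOT measured results.
candidates = {
    "cpu_a": (180.0, 150.0),
    "cpu_b": (200.0, 400.0),
    "cpu_c": (220.0, 1000.0),
}

# Rank from best to worst price/performance.
ranked = sorted(
    candidates,
    key=lambda name: score_per_dollar(*candidates[name]),
    reverse=True,
)

for name in ranked:
    ratio = score_per_dollar(*candidates[name])
    print(f"{name}: {ratio:.3f} points per dollar")
```

With these placeholder numbers, the cheapest part ranks first even though it has the lowest raw score, which is the pattern to watch for when reading any per-dollar chart.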
3D Modeling and Rendering
We used two different methodologies here. The first was to run the SPECapc 3ds Max 7 benchmark. This test was designed by actual 3ds Max users, and generates a result based on both interactive performance and rendering performance, but tends to be weighted more towards interactivity. Note that we set up 3ds Max 7 to run using the Direct3D driver, not software.
We also ran a set of rendering tests using 3ds Max 7, LightWave 8.0 and POV-Ray 3.70 beta 0.9. We used the late beta of POV-Ray because of its multithreading support. In the future, we'll use the full release version. The 3ds Max 7 rendering tests were completed with SSE enabled, while LightWave 8.0 rendering was run with two threads (single-core CPUs) or four threads (dual-core CPUs).
We generated two separate price/performance scores—one based on the SPECapc 3ds Max 7 benchmark result and the other using the geometric mean of the different rendering times. Since the smaller number from the raw benchmark result indicates better performance, we divided dollars by time to get the end result. The final result is that larger numbers indicate better price/performance ratios.
Media Encoding and Filters
The Adobe After Effects, WME 9 and H.264 encoding tests are all reported in time units (seconds). So we took the geometric mean of the results and divided dollars by that mean; larger is better.
3DMark05 CPU Test and PCMark05 CPU Test
While these tests are purely synthetic, they are forward-looking benchmarks, and the result generated is based on the frames per second of two separate tests (3DMark05) or the CPU test score (PCMark05). We divide the final result by dollars. The larger number is better in this result. Here, we pick the biggest number—since this is a purely synthetic test.
Games Benchmarks
We wanted to isolate the CPU component of the game tests, so we focused on the low-resolution scores. (We do report the raw results of the benchmarks at higher resolutions in the appendix.) Running game tests at low resolutions and low detail minimizes the impact of the graphics card. At higher resolutions, the graphics hardware can mask the impact of the CPU. For example, at 1280x1024 resolution and high detail settings, all the Splinter Cell: Chaos Theory results were identical, because the game is bound by the pixel shader performance of the graphics card we used.
We then took the geometric mean of the five low-resolution scores for each CPU. (The geometric mean tends to minimize wide variances in score.) We then divided the geometric mean of the frame rate by the price of the processor to get a frame-rate-per-dollar score.
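To illustrate why the geometric mean damps an outlier, here is a small Python sketch. The five frame rates and the price are hypothetical values picked to make the effect obvious, not benchmark data, and the helper name is our own; it relies on `statistics.geometric_mean`, available in Python 3.8 and later.

```python
from statistics import geometric_mean, mean


def fps_per_dollar(frame_rates, price_usd):
    """Geometric mean of the frame rates divided by CPU price."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return geometric_mean(frame_rates) / price_usd


# Hypothetical low-resolution results for five games (frames per second).
# Four titles run at 60 fps; one runs unusually fast at 240 fps.
low_res_fps = [60.0, 60.0, 60.0, 60.0, 240.0]
price_usd = 200.0  # hypothetical price

print(f"arithmetic mean: {mean(low_res_fps):.1f} fps")
print(f"geometric mean:  {geometric_mean(low_res_fps):.1f} fps")
print(f"fps per dollar:  {fps_per_dollar(low_res_fps, price_usd):.4f}")
```

In this made-up case the single fast title drags the arithmetic mean up to 96 fps, while the geometric mean stays near 79 fps, closer to what most of the games actually deliver.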
Now let's take a look at our findings.
What's interesting here is that a slightly more expensive CPU, the Intel Pentium 4 630, offers a somewhat better price/performance ratio than the lower-price Athlon 64 3000+ (101 SYSmarks per dollar for the P4 630 versus 97 SYSmarks per dollar for the Athlon 64 3000+). The same thing holds true on the dual-core front, with the Pentium D 820 generating 67.2 SYSmarks per dollar versus 48.7 SYSmarks per dollar for the Athlon 64 X2 3800+.
It's also no surprise that the most expensive processors yielded the worst performance per dollar, with the "champion" underperformer being the Athlon 64 FX-57.
Remember, bigger is better here. Differences in SPECapc 3ds Max scores were relatively minimal, yet prices varied a lot over the product lines. Perhaps most interesting is the Athlon 64 3700+ and 3800+ stacked next to each other, generating similar price/performance scores. Of the dual-core processors, Intel seems to fare better here.
In an absolute sense, productivity counts, and there may be valid reasons for going with higher-cost CPUs. But it's likely that the best overall solution lies somewhere in the middle.
Even more so than interactive performance, time is money when it comes to rendering. On the other hand, most software rendering engines can take advantage of
We actually expected the dual-core processors to perform better here, since all these rendering engines are multithreaded. Certainly the Pentium D 820 and 830 did okay, but not as well in dollar per unit of rendering time as some of the lower-cost units.
Again, the cheapest processors yield the best ratios, but time is money, and spending a bit more may yield better overall productivity, depending on your needs.
What's interesting about both of these tests is how odd the Athlon 64 3700+ results appear. This is similar to the results we've seen in other, more real-world tests.
The real sweet spot here looks to be the Athlon 64 3800+. While lower-cost processors will give you a better frame-rate-per-dollar ratio, some titles tend to get a bit chunky on them; Splinter Cell: Chaos Theory is one example.
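One way to express that trade-off is to rule out any CPU whose slowest game drops below a playability floor, then pick the best per-dollar ratio among what is left. The sketch below is only an illustration of that idea: the 30 fps floor, the CPU labels, the frame rates and the prices are all assumptions of ours, not figures from our testing.

```python
from statistics import geometric_mean


def pick_sweet_spot(candidates, min_fps):
    """Return the name of the best fps-per-dollar CPU whose slowest
    game still meets min_fps, or None if no CPU qualifies."""
    playable = [c for c in candidates if min(c["fps"]) >= min_fps]
    if not playable:
        return None
    best = max(playable, key=lambda c: geometric_mean(c["fps"]) / c["price"])
    return best["name"]


# Hypothetical data: five frame rates per CPU and a price in USD.
cpus = [
    {"name": "budget", "price": 150.0, "fps": [70, 65, 80, 75, 28]},
    {"name": "midrange", "price": 250.0, "fps": [85, 80, 95, 90, 40]},
    {"name": "flagship", "price": 900.0, "fps": [100, 95, 110, 105, 48]},
]

print("no floor:    ", pick_sweet_spot(cpus, min_fps=0))
print("30 fps floor:", pick_sweet_spot(cpus, min_fps=30))
```

With no floor, the cheapest part wins on ratio alone; once the floor is applied, the choice moves up to the midrange part because the budget CPU falls short in one title.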
On the Intel front, the Pentium 4 660 does seem to have a slightly better position on most curves than the other Intel CPUs. At under $400 for 3.60GHz, it's certainly better positioned than the pricier 670. Note that the weakest link in Intel's lineup is the very pricey 3.73GHz Pentium 4 Extreme Edition.
Dual-core processors didn't always fare well. To be fair, these tests tend to focus on single applications running at a time (even though some are multi-threaded). We're not attempting to measure perceived responsiveness, and that may be the dual-core processor's real strength. Of course, part of the reason is that the dual-core CPUs tend to be priced higher than their single-core clock-rate equivalents. Certainly the die sizes are as much to blame as any premium pricing by the CPU companies: A bigger die means higher costs, and so they need to be priced higher. If you compare them with the cost of two single cores, then the picture looks a bit brighter. But it's also clear that application development for multicore is still relatively immature, and we'll need to wait awhile to see more significant benefits.