Hacker News Comments on
Computer Architecture: A Quantitative Approach, 3rd Edition
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this book.
This is BS in terms of engineering, but it says a lot about the problem Intel has to solve to stay in business.
According to Hennessy & Patterson, 90% of performance gains come from better architecture. That's the tock. The other 10% come from clock speed, which was a side effect of better fab, the tick. But you mostly want better fab to get more transistors so you can build a better architecture.
So here's Intel's problem. They sink a huge amount of money (costs also follow Moore's Law) to upgrade their fabs before they release a new chip. Then they have nothing new to sell for a year or more while they fit it to the fab.
In the nineties they smoothed out demand by selling up-clocked versions of old chips, training consumers to think that more MHz = more faster. That's less effective now that clock speeds have stabilized, so they've been pitching lower power consumption instead. It's not really the same, because "faster CPU" in the 1990s really meant "it can run new software."
It will be interesting to see how this model will fare in the cloud era. If most CPUs live in data centers, most purchasers will choose based on power consumption and pay less attention to architecture.
Until clock speed stopped changing. Clock speeds may even drop for a while to make parallel engineering easier. cf. http://www.amazon.com/Computer-Architecture-Quantitative-App...
⬐ tophercyll
I'm not sure why you say it's "BS in terms of engineering."
The tick/tock is done for both business and engineering reasons. Debugging new silicon is hard enough without landing a new microarchitecture and a new process simultaneously.
While the ticks (new process) aren't quite as exciting as the tocks (new architecture), they bring real benefits in cost, power, and increased perf, not to mention usually a few microarchitectural enhancements. These chips are the best of their uarch, and they pave the way for the next tock, as you point out.
Disclaimer: I used to work for Intel's Oregon CPU Architecture Team in the performance group, but that was more than three years ago, so take what I say with a grain of salt. And, of course, I don't speak for Intel in any way.
⬐ mnemonicsloth
It's good engineering. Obviously right.
The name is BS. Making one change at a time is universal engineering practice. Who names that?
Marketers. Or maybe "Tick Tock" originated as a slogan for management. Which would've been a pretty cool hack, come to think about it. Keep the MBAs away from your functioning process by giving it a cool name and telling them it's a company secret.
⬐ skmurphy
It's a reasonable strategy in that it avoids the simultaneous debugging of a new architecture on a new process. This allows them to isolate process-related issues and get a stable definition using a known-good architecture. They are in effect selling their test chip or "pipe cleaner" and preparing for a jump to a new architecture.
⬐ newfolder09
As a general rule of thumb, performance for a given program on any CPU is: frequency × instructions completed per cycle ÷ number of instructions. The number of instructions is more or less fixed by your ISA (i.e. whether the machine is CISC like the x86 or RISC-like like MIPS/ARM). What happened in the Pentium 4 era was that the focus was almost completely on the frequency part of the equation rather than on instructions completed per cycle.
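That rule of thumb (the "iron law" of CPU performance) can be sketched in a few lines of Python. The function name and all the numbers below are hypothetical, chosen only to illustrate the Pentium 4 tradeoff the commenter describes:

```python
# Iron law of CPU performance (Hennessy & Patterson):
#   execution time = instructions * cycles-per-instruction / frequency
# so performance = frequency * (instructions per cycle) / instruction count.

def exec_time(instructions, cpi, freq_hz):
    """Seconds to run a program: instructions * CPI / frequency."""
    return instructions * cpi / freq_hz

# Illustrative numbers only: a deeper pipeline buys a higher clock but
# worsens effective CPI, and the program can end up SLOWER overall.
baseline = exec_time(1e9, cpi=1.0, freq_hz=1.0e9)  # 1.0 s
deep_pipe = exec_time(1e9, cpi=1.5, freq_hz=1.4e9) # ~1.07 s

assert deep_pipe > baseline  # 40% more MHz, yet a slower program
```

The point of the sketch: frequency is only one of three terms, so scaling it up while CPI regresses can be a net loss.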
Intel's focus on new fabs is not just for a higher clock speed; that's a useful side benefit. The real reason is significantly lower cost per die. The same wafer can now produce many more CPU dies (that are slightly faster), increasing their profit per unit.
With regard to data centers, we are already seeing a move to power-efficient architectures (with the Core family of CPUs) versus pure performance. However, especially in the data center model, performance is still a critical metric that probably is not going away any time soon.
⬐ mnemonicsloth
Datacenter operators pay attention to cost of ownership per increment of performance. They'll be very interested in faster chips but only when they can run an extra virtual instance on each server without raising power and cooling costs, for example.
On clock speeds you're mistaken. Yes, Intel went from 20 to 1000 MHz during the nineties, and their marketing was all about clocks. But during that period, they also added 20-stage pipelines, out-of-order execution, 3 levels of caching for instructions and data, branch prediction, and hyperthreading. That's the substance of the Hennessy and Patterson claim: during the 10-year up-clocking binge, 90% of performance gains still came from architecture.
I'm not sure I understand what you mean about cost per die. Can you elaborate?
⬐ skmurphy
I think cost per die relates to wafer size: moving to 300mm wafers gave them not only many more total die but many more good die per wafer. Given that they were going to completely change out the fab equipment to support the new wafer size, it made sense to move to a new process at the same time.
⬐ newfolder09
So it wasn't clear to me initially which phase of CPU development you meant. I agree with your statement regarding CPU development in the 90s. Looking at the performance equation I gave earlier, the second portion has gone from several tens of cycles/instruction in the 486 timeframe to 2-3 INSTRUCTIONS/cycle for the Pentium III.
My point was that your statement about 90% of the gains being from instructions/clock (i.e. architecture) was not always true. The Pentium 4 is a prime example, where the number of pipe stages was dramatically scaled up (reducing instructions/clock) in order to increase frequency.
WRT cost/die: CPUs are fabricated on circular silicon wafers. http://arstechnica.com/hardware/news/2008/09/moore.ars/2 Every move to a lower process node reduces the area of each CPU die. For a given CPU, this means that more of them can fit on each wafer, driving down the cost of each unit.
Of course, as you mentioned, by keeping die size constant they can instead get more transistors per die, allowing them to cram more features onto a chip. Lowering costs vs. adding features is a tradeoff that every CPU design team has to make.
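The cost-per-die arithmetic above can be made concrete with a back-of-the-envelope sketch. The die sizes and the "full node shrink halves die area" scaling below are hypothetical illustration values, not figures from the thread:

```python
import math

# Rough dies-per-wafer estimate, ignoring edge loss and yield:
#   dies ~= wafer_area / die_area

def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Upper-bound count of dies cut from one circular wafer."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    return int(wafer_area // die_area_mm2)

# Illustrative: a 140 mm^2 die on a 300 mm wafer, then the same design
# after a full node shrink (~0.7x linear, so ~0.5x area).
before = dies_per_wafer(300, die_area_mm2=140)
after = dies_per_wafer(300, die_area_mm2=140 * 0.5)

assert after > 1.9 * before  # roughly twice the dies per wafer
```

Since wafer processing cost is roughly fixed, doubling the dies per wafer roughly halves the cost per unit, which is the profit lever the comment describes.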