HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
Computer Architecture: A Quantitative Approach, 3rd Edition

John L. Hennessy, David A. Patterson · 2 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "Computer Architecture: A Quantitative Approach, 3rd Edition" by John L. Hennessy, David A. Patterson.
Amazon Summary
This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high performance desktop machine design, but also to the design of embedded and server systems. They have illustrated their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and web technologies, and high performance computing. The book retains its highly rated features: Fallacies and Pitfalls, which share the hard-won lessons of real designers; Historical Perspectives, which provide a deeper look at computer design history; Putting It All Together, which presents a design example that illustrates the principles of the chapter; Worked Examples, which challenge the reader to apply the concepts, theories and methods in smaller scale problems; and Cross-Cutting Issues, which show how the ideas covered in one chapter interact with those presented in others. In addition, a new feature, Another View, presents brief design examples in one of the three domains other than the one chosen for Putting It All Together. The authors present a new organization of the material as well, reducing the overlap with their other text, Computer Organization and Design: A Hardware/Software Approach 2/e, and offering more in-depth treatment of advanced topics in multithreading, instruction level parallelism, VLIW architectures, memory hierarchies, storage devices and network technologies. Also new to this edition is the adoption of the MIPS 64 as the instruction set architecture. In addition to several online appendixes, two new appendixes will be printed in the book: one contains a complete review of the basic concepts of pipelining, the other provides solutions to a selection of the exercises.
Both will be invaluable to the student or professional learning on her own or in the classroom. Hennessy and Patterson continue to focus on fundamental techniques for designing real machines and for maximizing their cost/performance.

* Presents state-of-the-art design examples including:
  * IA-64 architecture and its first implementation, the Itanium
  * Pipeline designs for Pentium III and Pentium IV
  * The cluster that runs the Google search engine
  * EMC storage systems and their performance
  * Sony Playstation 2
  * Infiniband, a new storage area and system area network
  * SunFire 6800 multiprocessor server and its processor the UltraSPARC III
  * Trimedia TM32 media processor and the Transmeta Crusoe processor
* Examines quantitative performance analysis in the commercial server market and the embedded market, as well as the traditional desktop market. Updates all the examples and figures with the most recent benchmarks, such as SPEC 2000.
* Expands coverage of instruction sets to include descriptions of digital signal processors, media processors, and multimedia extensions to desktop processors.
* Analyzes capacity, cost, and performance of disks over two decades. Surveys the role of clusters in scientific computing and commercial computing.
* Presents a survey, taxonomy, and the benchmarks of errors and failures in computer systems.
* Presents detailed descriptions of the design of storage systems and of clusters.
* Surveys memory hierarchies in modern microprocessors and the key parameters of modern disks.
* Presents a glossary of networking terms.
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
This is BS in terms of engineering, but it says a lot about the problem Intel has to solve to stay in business.

According to Hennessy & Patterson, 90% of performance gains come from better architecture. That's the tock. The other 10% came from clock speed [1], which was a side effect of better fab, the tick. But you mostly want better fab to get more transistors so you can build a better architecture.

So here's Intel's problem. They sink a huge amount of money (costs also follow Moore's Law) to upgrade their fabs before they release a new chip. Then they have nothing new to sell for a year or more while they fit it to the fab.

In the nineties they smoothed out demand by selling up-clocked versions of old chips, training consumers to think that more MHz = more faster. That's less effective now that clock speeds have stabilized, so they've been pitching lower power consumption instead. It's not really the same, because "faster CPU" in the 1990s really meant "it can run new software."

It will be interesting to see how this model will fare in the cloud era. If most CPUs live in data centers, most purchasers will choose based on power consumption and pay less attention to architecture.

[1] Until clock speed stopped changing. Clock speeds may even drop for a while to make parallel engineering easier. cf.

I'm not sure why you say it's "BS in terms of engineering."

The tick/tock is done for both business and engineering reasons. Debugging new silicon is hard enough without landing a new microarchitecture and a new process simultaneously.

While the ticks (new process) aren't quite as exciting as the tocks (new architecture), they bring real benefits in cost, power, and increased perf, not to mention usually a few microarchitectural enhancements. These chips are the best of their uarch, and they pave the way for the next tock, as you point out.

Disclaimer: I used to work for Intel's Oregon CPU Architecture Team in the performance group, but that was more than three years ago, so take what I say with a grain of salt. And, of course, I don't speak for Intel in any way.

It's good engineering. Obviously right.

The name is BS. Making one change at a time is universal engineering practice. Who names that?

Marketers. Or maybe "Tick Tock" originated as a slogan for management. Which would've been a pretty cool hack, come to think of it. Keep the MBAs away from your functioning process by giving it a cool name and telling them it's a company secret.

It's a reasonable strategy in that it avoids the simultaneous debugging of a new architecture on a new process. This allows them to isolate process related issues and get a stable definition using a known good architecture. They are in effect selling their test chip or "pipe cleaner" and preparing for a jump to a new architecture.

As a general rule of thumb, performance on a given program for any CPU is proportional to frequency times instructions completed per cycle, divided by the number of instructions executed. The number of instructions is more or less fixed by your ISA (i.e., whether the machine is CISC like the x86 or RISC like MIPS/ARM). What happened in the Pentium 4 era was that the focus was almost entirely on the frequency part of the equation rather than on instructions completed per cycle.
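
The rule of thumb in this comment is usually written as execution time = instruction count / (instructions per cycle × frequency). Here is a minimal sketch of that relation in Python. The function names and every number in it are hypothetical, picked only to show how a higher clock can lose to a lower-clocked design if instructions per cycle drop far enough; they are not measurements of any real chip.

```python
def execution_time(instructions: float, ipc: float, frequency_hz: float) -> float:
    """Seconds to run a program: instructions / (IPC * frequency)."""
    if ipc <= 0 or frequency_hz <= 0:
        raise ValueError("ipc and frequency_hz must be positive")
    return instructions / (ipc * frequency_hz)


def speedup(old_time: float, new_time: float) -> float:
    """How many times faster the new design is (values below 1 mean slower)."""
    return old_time / new_time


# Hypothetical workload and designs (assumptions, not real CPU data).
INSTRUCTIONS = 1e9

# Design A: moderate clock, higher instructions per cycle.
baseline = execution_time(INSTRUCTIONS, ipc=2.0, frequency_hz=1.0e9)

# Design B: 50% higher clock from a deeper pipeline, but half the IPC.
deep_pipeline = execution_time(INSTRUCTIONS, ipc=1.0, frequency_hz=1.5e9)

print(f"baseline:        {baseline:.3f} s")
print(f"deeper pipeline: {deep_pipeline:.3f} s")
print(f"speedup of deeper pipeline over baseline: "
      f"{speedup(baseline, deep_pipeline):.2f}x")
```

With these made-up inputs the deeper pipeline comes out slower despite the faster clock, which is the kind of trade-off being described for the Pentium 4 era.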

Intel's focus on new fabs is not just about higher clock speed; that's a useful side benefit. The real reason is significantly lower cost per die. The same wafer can now produce many more CPU dies (that are slightly faster), increasing their profit per unit.

With regard to data centers, we are already seeing a move to power-efficient architectures (with the Core family of CPUs) versus pure performance. However, especially in the data center model, performance is still a critical metric that probably is not going away any time soon.

Datacenter operators pay attention to cost of ownership per increment of performance. They'll be very interested in faster chips but only when they can run an extra virtual instance on each server without raising power and cooling costs, for example.

On clock speeds you're mistaken. Yes, Intel went from 20 to 1000 MHz during the nineties, and their marketing was all about clocks. But during that period, they also added 20-stage pipelines, out-of-order execution, 3 levels of caching for instructions and data, branch prediction, and hyperthreading. That's the substance of the Hennessy and Patterson claim: during the 10-year up-clocking binge, 90% of performance gains still came from architecture.

I'm not sure I understand what you mean about cost per die. Can you elaborate?

I think cost per die relates to wafer size. Moving to 300mm wafers gave them not only many more total die but many more good die per wafer. Given that they were going to completely change out the fab equipment to support the new wafer size, it made sense to move to a new process at the same time.

So it wasn't clear to me initially which phase of CPU development you meant. I agree with your statement regarding CPU development in the 90s. Looking at the performance equation I gave earlier, the second portion has gone from several tens of cycles/instruction in the 486 timeframe to 2-3 INSTRUCTIONS/cycle for the Pentium III.

My point was that your statement about 90% of the gains being from instructions/clock (i.e., architecture) was not always true. The Pentium 4 is a prime example: the number of pipeline stages was dramatically scaled up (reducing instructions/clock) to increase frequency.

WRT cost/die: CPUs are created on circular silicon wafers. Every move to a smaller process node reduces the area of each CPU die. For a given CPU design, this means that more dies fit on each wafer, driving down the cost of each unit.
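
To put rough numbers on the cost/die argument, here is a small Python sketch using a common textbook approximation for dies per wafer: wafer area divided by die area, minus a correction for partial dies lost along the round edge. The wafer cost, die areas, and yield below are hypothetical placeholders. The sketch also assumes wafer cost and yield stay the same across the shrink, which is a simplification; in practice both change with a new process.

```python
import math


def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Approximate count of whole dies on a circular wafer.

    First term: wafer area / die area.
    Second term: estimated loss of partial dies along the circular edge.
    """
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return max(0, int(gross - edge_loss))


def cost_per_good_die(wafer_cost: float, wafer_diameter_mm: float,
                      die_area_mm2: float, yield_fraction: float) -> float:
    """Wafer cost spread over the dies that actually work."""
    good_dies = dies_per_wafer(wafer_diameter_mm, die_area_mm2) * yield_fraction
    if good_dies <= 0:
        raise ValueError("no good dies with these parameters")
    return wafer_cost / good_dies


# Hypothetical inputs (assumptions for illustration only).
WAFER_COST = 5000.0     # currency units per processed wafer
YIELD = 0.8             # fraction of dies that work, held constant here
OLD_DIE_MM2 = 200.0     # die area on the older process
NEW_DIE_MM2 = 100.0     # same design after an idealized shrink to half the area

for label, diameter, area in [
    ("200 mm wafer, old die", 200, OLD_DIE_MM2),
    ("300 mm wafer, old die", 300, OLD_DIE_MM2),
    ("300 mm wafer, shrunk die", 300, NEW_DIE_MM2),
]:
    count = dies_per_wafer(diameter, area)
    cost = cost_per_good_die(WAFER_COST, diameter, area, YIELD)
    print(f"{label}: {count} dies, {cost:.2f} per good die")
```

Under these assumptions, both the larger wafer and the smaller die raise the die count by more than the plain area ratio, because proportionally fewer dies are lost at the edge.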

Of course, as you mentioned, by keeping die size constant, they get more transistors per die, allowing them to cram more features onto a chip. Lowering costs versus adding features is a tradeoff that every CPU design team has to make.

HN Books is an independent project and is not operated by Y Combinator or