Hacker News Comments on
Inner Loops: A Sourcebook for Fast 32-Bit Software Design
3 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this book.
Old skool book recommendation. Not necessarily C, more ASM, but still a good book on early-nineties optimization, with a mentality that's still valid. I believe that people today would be more concerned about how fast their code runs if it was visible, that is, if they profiled it with good tools!
https://www.amazon.com/Inner-Loops-Sourcebook-Software-Devel...
People interested in this might also enjoy 'Inner Loops' (1997), an excellent book that covers optimizing x86 at that time but is instructive in general and the concepts are still applicable today.
https://www.amazon.com/Inner-Loops-Sourcebook-Software-Devel...
If you love this kinda stuff, Inner Loops [0] is an excellent book that covers low level performance optimization. And the author Rick Booth handles it in a way that allows the reader to transfer the techniques to new platforms.
I like it so much I buy used copies and give them as gifts.
[0] http://www.amazon.com/Inner-Loops-Sourcebook-Software-Develo...
⬐ agumonkey
Thanks for the suggestion. In the same domain, I liked articles about https://www.google.fr/search?q=mechanical+sympathy
⬐ nkurz
I've appreciated your comments elsewhere, but I'm really dubious that a book that only covers the Pentium II as an addendum can offer any specific advice that is still useful. You're sure? I'm intrigued enough to try to get a copy from the library.
I did just read a great explanation of how the P6 generation, to which the Pentium II belongs, differed from the previous one in that it was the first Intel design to have out-of-order execution: http://people.cs.clemson.edu/~mark/330/colwell/p6des.pdf
I'd love to find a more up to date book on such topics. Currently I'm struggling to understand how speculative execution works (or doesn't work) in cases that are bound by the front end's constraint of issuing only 4 instructions per cycle. Do you give up all preloading when you are front end bound? It would seem like the PC and speculative PC would be running together, and you'd always bear the full brunt of latency.
⬐ sitkack
I would be surprised if you weren't pleasantly surprised.
I can't address your SE question, I am not caught up on current tech. The TSX[0] stuff looks really fun, but it is off for now.
You might enjoy realworldtech [1] for hardware info.
[0] http://en.wikipedia.org/wiki/Transactional_Synchronization_E...
[1] http://www.realworldtech.com/
PS Just ran across this while weeding for multiple issue architectures, http://mcg.cs.tau.ac.il/papers/