HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
High Performance MySQL: Optimization, Backups, and Replication

Baron Schwartz, Peter Zaitsev, Vadim Tkachenko · 4 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "High Performance MySQL: Optimization, Backups, and Replication" by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko.
View on Amazon [↗]
HN Books may receive an affiliate commission when you make purchases on sites after clicking through links on this page.
Amazon Summary
How can you bring out MySQL’s full power? With High Performance MySQL, you’ll learn advanced techniques for everything from designing schemas, indexes, and queries to tuning your MySQL server, operating system, and hardware to their fullest potential. This guide also teaches you safe and practical ways to scale applications through replication, load balancing, high availability, and failover. Updated to reflect recent advances in MySQL and InnoDB performance, features, and tools, this third edition not only offers specific examples of how MySQL works, it also teaches you why this system works as it does, with illustrative stories and case studies that demonstrate MySQL’s principles in action. With this book, you’ll learn how to think in MySQL. Learn the effects of new features in MySQL 5.5, including stored procedures, partitioned databases, triggers, and views Implement improvements in replication, high availability, and clustering Achieve high performance when running MySQL in the cloud Optimize advanced querying features, such as full-text searches Take advantage of modern multi-core CPUs and solid-state disks Explore backup and recovery strategies—including new tools for hot online backups
HN Books Rankings

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
I don't have any one complete book that I can recommend, and I don't even really have a great reading list for this. But I'll make an attempt to share what I think is useful as a starting point.

1. Systems Operations is first and foremost about understanding systems, in all of their complexity, which means understanding the internals of your OS primarily.

2. Performance and networking, in particular, are super important areas to focus on understanding when it comes to learning the topic to help with software development.

3. A lot of it is about understanding concepts in abstract and being able to extrapolate to other situations and apply these concepts, so there's actually quite a lot of useful information that can be learned on one OS and still applied to another OS (or on one game engine and applied to another, et al).

Here's a few books I think are worth reading, not in any particular order of prevalence, but loosely categorized

Databases:

High Performance MySQL: https://www.amazon.com/gp/product/1449314287/

SQL Queries for Mere Mortals: https://www.amazon.com/gp/product/0321992474/

The Art of SQL: https://www.amazon.com/gp/product/0596008945/

Networking:

TCP/IP Illustrated: https://www.amazon.com/exec/obidos/ISBN=0201633469/wrichards... (updates on author's site at http://www.kohala.com/start/tcpipiv1.html)

The TCP/IP Guide: https://www.amazon.com/TCP-Guide-Comprehensive-Illustrated-P...

UNIX Network Programming: https://www.amazon.com/dp/0131411551

Beej's Guide to Network Programming: http://beej.us/guide/bgnet/

Operating Systems:

Operating Systems Concepts: https://www.amazon.com/Operating-System-Concepts-Abraham-Sil... (various editions, I have the 7th edition... I recommend you find the latest)

Modern Operating Systems: https://www.amazon.com/Modern-Operating-Systems-Andrew-Tanen... (the "Tanenbaum Book")

Operating Systems Design and Implementation: https://www.amazon.com/Operating-Systems-Design-Implementat-... (the other one, the "MINIX Book")

Windows Internals:

Part 1: https://www.amazon.com/Windows-Internals-Part-architecture-m...

Part 2: https://www.amazon.com/Windows-Internals-Part-2-7th/dp/01354... (I had the pleasure of being taught from this book by Mark Russinovich and David Solomon at a previous employer, was an amazing class and these books are incredible resources even applied outside of Windows, we used 5th edition, I linked 7th, which has the 2nd part pending publication).

MacOS Internals:

Part 1: https://www.amazon.com/MacOS-iOS-Internals-User-Mode/dp/0991...

Part 2: https://www.amazon.com/MacOS-iOS-Internals-II-Kernel/dp/0991...

Part 3: https://www.amazon.com/MacOS-iOS-Internals-III-Insecurity/dp...

Linux Kernel Programming:

Part 1: https://www.amazon.com/Linux-Kernel-Development-Cookbook-pro...

Part 2: https://www.amazon.com/Linux-Kernel-Programming-Part-Synchro...

The Linux Programming Interface: https://www.amazon.com/Linux-Programming-Interface-System-Ha...

General Systems Administration:

Essential Systems Administration: https://www.amazon.com/gp/product/0596003439/

UNIX and Linux Systems Administration Handbook: https://www.amazon.com/UNIX-Linux-System-Administration-Hand...

The Linux Command Line and Shell Scripting Bible: https://www.amazon.com/Linux-Command-Shell-Scripting-Bible/d...

UNIX Shell Programming: https://www.amazon.com/Unix-Shell-Programming-Stephen-Kochan...

BASH Hackers Wiki: https://wiki.bash-hackers.org/

TLDP Advanced BASH Scripting Guide: https://tldp.org/LDP/abs/html/

The Debian Administrator's Handbook: https://debian-handbook.info/browse/stable/

TLDP Linux System Administrator's Guide: https://tldp.org/LDP/sag/html/index.html

Performance & Benchmarking:

Systems Performance: https://www.amazon.com/Systems-Performance-Brendan-Gregg-dp-... (this is Brendan Gregg's book where you learn about the magic of dtrace)

BPF Performance Tools: https://www.amazon.com/Performance-Tools-Addison-Wesley-Prof... (the newer Brendan Gregg book about BPF, stellar)

The Art of Computer Systems Performance Analysis: https://www.cse.wustl.edu/~jain/books/perfbook.htm (no longer available from Amazon, but is available direct from publisher. This is basically the one book you should read about creating and structuring benchmarks or performance tests)

I guess that's a "reading list", but this is just a small part of what you need to know to excel in systems operations.

I would say for the typical software developer writing web applications, the most important thing to know is how databases work and how networking works, since these are going to be the primary items affecting your application performance. But there's obviously topics not included in this list that are also worth understanding, such as browser/DOM internals, how caching and CDNs work, and web-specific optimizations that can be achievable with HTTP/2 or QUIC.

For the average software developer writing desktop applications, I'd say make sure you /really/ understand OS internals... at the base everything you do on a computer system is based on what the OS provides to you. Even though you are abstracted (possibly many layers) away from this, being able to peel back the layers and understand what's /really/ happening is essential to writing high-quality application code that is performant and secure, as well as making you a champ at debugging issues.

If you're trying to get into systems operations as a field, this is just a brush over the top surface and there's a lot deeper diving required.

planet-and-halo
Appreciate the recs. Thanks very much for taking the time to write a detailed response.
You should try to understand how databases in general work, it will help you with your query writing.

One thing you have to realize is that once you get a little advanced, you have to get to the details of the single SQL implementations, it's not about SQL but about Postgres.

I've found these books really valuable

# SQL Performance Explained Everything Developers Need to Know about SQL Performance

https://www.amazon.com/Performance-Explained-Everything-Deve...

This book fundamentally talks about how to effectively use and leverage the SQL indices. Talks about all the important implementations (Postgres, MySQL, Oracle, SQL Server).

# Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

https://www.amazon.com/Designing-Data-Intensive-Applications...

This book gets mentioned a bunch around here and for a good reason. There aren't too many concrete resources on making your systems "webscale" and this one is really good.

# PostgreSQL 9.0 High Performance

https://www.amazon.com/PostgreSQL-High-Performance-Gregory-S...

Discusses all the different settings and tweaks you can do in Postgres. It's crazy how much of a perf gain you can get just by twiddling the parameters of the database, i.e. all the tricks you can do when the single instances are bottle necks.

There's a similar book for MySQL https://www.amazon.com/High-Performance-MySQL-Optimization-R...

# PostgreSQL 9 High Availability Cookbook

https://www.amazon.com/PostgreSQL-9-High-Availability-Cookbo...

Discusses how do you go from 1 Postgres instance to 1+ instance. Talks about replication, monitoring, cluster management, avoiding downtime etc i.e. all the tricks you can do to manage multiple instances. Again there's a similar book for MySQL https://www.amazon.com/MySQL-High-Availability-Building-Cent...

Last but not least check out the postgres documentation, people consider it a standard of what good documentation looks like https://www.postgresql.org/docs/9.6/static/index.html

Also last but not least, read up on relational algebra (the foundation of SQL) https://en.wikipedia.org/wiki/Relational_algebra. I've always found SQL to be extremely verbose (the syntax reminds me of idk COBOL or smth) but there's another query language called Datalog, that's for our purposes similar to SQL but the syntax is much more legible.

E.g. check out these snippets from these slides (page 29) (and check out the whole class too)

https://pages.iai.uni-bonn.de/manthey_rainer/IIS_1617/IIS201...

Datalog:

s(X) <- p(X,Y).

s(X) <- r(Y,X).

t(X,Y,Z) <- p(X,Y), r(Y,Z).

w(X) <- s(X), not q(X).

SQL:

CREATE VIEW s AS (SELECT a FROM p)

UNION

(SELECT b FROM r);

CREATE VIEW t AS

SELECT a, b, c

FROM p, r

WHERE p.b = r.a,

CREATE VIEW w AS (TABLE s)

MINUS (TABLE q);

If your really serious about enhancing your SQL skills the following should take you from beginner to expert with time. Udemy and other online resources are nice but they will not take you to your maximum potential, only the books and official documentation will as that is where the experts get there information. The videos and tutorials are good for overviews and helping with certain scenarios but not the best resources if you want a professional high end learning path to the top.

If you want working with SQL to become extremely easy to you, you will need to start creating websites, bring traffic to those sites and implement the scaling procedures you have learned from the books below.

As if you go to a job interview and want to blow it out of the water it is to your best advantage to know for example MySQL in and out so well you make the interviewer smile inside and go yes we have found the one. You will probably see a little smirk as the interviewer is trying to hold in their smile when this happens but you will know when this occurs very easily. Upside to this is you will be very comfortable getting started on day one and the only training you will need is the current architecture and backup or non existent backup plans in place so you can get to work.

I recommend the following hardcopy books in order:

MySQL: http://www.amazon.com/MySQL-Developers-Library-Paul-DuBois-e...

www.amazon.com/High-Performance-MySQL-Optimization-Replication/dp/1449314287/

http://www.amazon.com/MySQL-High-Availability-Building-Cente...

If your wanting to do Microsoft SQL Server, I recommend the following: http://www.amazon.com/Microsoft-Server-2012-Step-Developer/d...

http://www.amazon.com/Training-70-461-Querying-Microsoft-Ser...

http://www.amazon.com/Training-70-462-Administering-Microsof...

comatory
cool thanks. should I start with the first one? i just need to emphasize that I'm still fairly beginner so I want to be sure that I can understand it and go gradually from easy to hard.
techjuice
Yes the first MySQL book will take you a long way and lay most of it out for you very nicely. The other books add on to the foundation that is given to you in the first book.

Part One gives a really good overview of writing queries for mysql, optimizing them, understanding all the data types etc.

Part Two gives you the practical hand on that you need in multiple programming languages C, Perl, and PHP.

Part Three teaches you administration and security which helps complete the circle of knowledge. As you don't want to be a developer that does not know how to talk the Database administrator language. As the more you know and learn the better you and others can help optimize and secure the different development, staging and production environments applications and hardware.

Sure - here's a list of what's good to know for MySQL. Other DBs are going to have different needs, though the indexing data is good to know regardless.

Start with the book High Performance MySQL. [1]

Follow up with the whitepaper "Causes of Downtime". [2]

Then find a copy of the IMDB dataset, put that in a database, and write an app against it. Make that app perform well, then simulate load against the app (pretend it hit the top of Reddit and Hacker News simultaneously), and keep it performing well.

After that, it's a matter of practical practice.

[1] http://www.amazon.com/High-Performance-MySQL-Optimization-Re...

[2] http://www.percona.com/redir/files/white-papers/causes-of-do...

HN Books is an independent project and is not operated by Y Combinator or Amazon.com.
~ yaj@
;laksdfhjdhksalkfj more things
yahnd.com ~ Privacy Policy ~
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.