HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
The Visual Display of Quantitative Information

Tufte, Edward R. · 12 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "The Visual Display of Quantitative Information" by Tufte, Edward R.
Amazon Summary
The classic book on statistical graphics, charts, tables. Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays. This is the second edition of The Visual Display of Quantitative Information. Recently published, this new edition provides excellent color reproductions of the many graphics of William Playfair, adds color to other images, and includes all the changes and corrections accumulated during 17 printings of the first edition.
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
Since nobody has mentioned it yet, for charts and graphs there's The Visual Display of Quantitative Information by Edward R. Tufte: https://amazon.com/Visual-Display-Quantitative-Information/d...
Dec 31, 2020 · doomlaser on My Favorite Books 2020
Books I bought that I've been enjoying this year:

Favorite Folktales from Around the World, by Jane Yolen [0]: Excellent short folk stories from many different regions. Great for short bursts of story, sometimes with interesting wisdom. As a game developer, some of them have been inspiring for hooks to maybe use in future projects

The Visual Display of Quantitative Information, by Tufte [1]: Very fun to peruse the graphs and charts. The visualization of Napoleon's army marching into Russian winter and back is a classic. I bought this because I was figuring out a novel UI for a video game about birds

Graphic Design: A New History, by Eskilson [2]: Goes over the history of graphic design and printing technology from pre-Gutenberg to the present day. I bought this because it was written by my college Art History professor. His class was my favorite in all of undergrad, and I wanted to experience more history delivered in his style.

[0] https://www.amazon.com/gp/product/0394751884/

[1] https://www.amazon.com/gp/product/0961392142/

[2] https://www.amazon.com/gp/product/0300233280/

gen220
All of Tufte's books are fantastic. They are not very practical if you are learning InfoViz (beyond being pretty pictures!). Many of the visualizations are form-fitted to the information they're displaying.

If you're familiar with basic theory (the visual dimensions, visual psychology, etc), his books are even more mind-blowing. He's uniquely gifted at curating and critiquing. And the exemplary pieces he's curated are shockingly creative.

Sep 24, 2017 · hsitz on Ask HN: Coffee table books
I think any of the Edward Tufte books qualify, starting with _The Visual Display of Quantitative Information_:

https://www.amazon.com/Visual-Display-Quantitative-Informati...

If you look at

https://www.amazon.com/Visual-Display-Quantitative-Informati...

you see that world class infographics require a "data artist", that is a lot of curation. You have to set the parameters of the chart quite carefully to produce a graphic which appears insightful. It is very easy to change those parameters a little and wind up in a bad place, where you might overload the tools with too much data, etc.

Data viz tools that are easy for a non "data artist" to use are an active area for CS research, new software products, etc.

Another problem is that data is less interesting than it seems to be at first. What action are you going to take on it? Data analysis delivers value when it informs actions.

The history of international development efforts is that aid agencies often have little understanding about conditions on the ground. For instance, they would send big tractors to Chile and truck them at great trouble and expense to get them into farming villages that could not use them, fuel them, fix them, etc.

Thus reducing a country down to a few numbers could cause the illusion that you know something when you really don't.

The numbers themselves are suspect. In a place like Rwanda, few people file tax returns, much of the economy is farmers selling corn to their neighbors, so national income numbers are often a wild-assed guess.

Also, the life expectancy numbers are not based on a rigorous probability analysis but on a simple model that takes the death rates of 15-year-olds in 2015 and 45-year-olds in 2015 and treats them like a Markov chain: you pretend that the 15-year-old of 2015 will, at age 45 in 2045, die at the same rate that 45-year-olds died in 2015. That is just not true, particularly if you consider exceptional events such as wars and famines.
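A minimal sketch of that kind of calculation, assuming a standard period life table (whoever produced the numbers discussed above may use a more refined method). The function name and the death probabilities are hypothetical, chosen only to show how much the result depends on the assumption that today's rates stay fixed:

```python
def period_life_expectancy(death_prob_by_age):
    """Expected years lived by a hypothetical newborn who experiences one
    calendar year's age-specific death probabilities at every age.

    death_prob_by_age[a] is the probability that someone alive at exact
    age a dies before reaching age a + 1, as observed in a single year.
    Anyone still alive after the last listed age is ignored, so the last
    entry should normally be 1.0.
    """
    alive = 1.0            # fraction of the hypothetical cohort still alive
    expected_years = 0.0
    for q in death_prob_by_age:
        deaths = alive * q
        # Survivors live the full year; those who die are assumed to
        # live half of it on average.
        expected_years += (alive - deaths) + 0.5 * deaths
        alive -= deaths
    return expected_years


# Hypothetical rates: 1% chance of dying at every age below 80, certain death at 80.
observed_2015 = [0.01] * 80 + [1.0]
# Same ages, but assuming death rates halve by the time today's teenagers get there.
improved = [q / 2 for q in observed_2015[:-1]] + [1.0]

print(round(period_life_expectancy(observed_2015), 1))
print(round(period_life_expectancy(improved), 1))
```

The two printed values differ only because of the assumption about future death rates, which is exactly the part the single-year snapshot cannot know.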

Thus those graphics are great for a talk, but they don't have the real depth of knowledge you'd need if you want to sell something to those people, plan a business, run an effective aid program, etc.

Here's a list of all design books that I give to new hires:

Branded Interactions: Creating the Digital Experience - (https://www.amazon.com/Branded-Interactions-Creating-Digital...)

The Visual Display of Quantitative Information - https://www.amazon.com/Visual-Display-Quantitative-Informati...

Universal Principles of Design - https://www.amazon.com/Universal-Principles-Design-Revised-U...

The Interface: IBM and the Transformation of Corporate Design, 1945-1976 - https://www.amazon.com/Interface-Transformation-Corporate-19...

Multiple Signatures: On Designers, Authors, Readers and Users - https://www.amazon.com/Multiple-Signatures-Designers-Authors...

Change by Design: How Design Thinking Transforms Organizations and Inspires Innovation - https://www.amazon.com/Change-Design-Transforms-Organization...

Thoughts on Design - https://www.amazon.com/Thoughts-Design-Paul-Rand/dp/08118754...

Notes on the Synthesis of Form - https://www.amazon.com/Notes-Synthesis-Form-Harvard-Paperbac...

..and a list of ones I'm considering adding:

Unflattening - https://www.amazon.com/gp/product/0674744438/ref=oh_aui_deta...

Creative Confidence: Unleashing the Creative Potential Within Us All - https://www.amazon.com/gp/product/038534936X/ref=oh_aui_deta...

The Design Method - https://www.amazon.com/gp/product/0321928849/ref=oh_aui_deta...

Product Design for the Web: Principles of Designing and Releasing Web Products- https://www.amazon.com/gp/product/0321929039/ref=oh_aui_deta...

I very much like the aim and contents of this book. There are loads of data visualization basics that this tutorial gets right, like: "In bar charts, however, it’s almost always best to make zero the y-axis minimum."

Or, regarding pie charts, "humans are not particularly good at judging the relative size of areas".

Or about bubble charts, "We should never use the extra bubble chart dimensions to convey critical data or precise quantities, therefore. Rather, they work best in examples such as this example—neither the exact wind speed nor the specific classification need be as precise as the location."
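To see why the zero baseline matters for bar charts, here is a small sketch using Tufte's "lie factor" (size of the effect shown in the graphic divided by the size of the effect in the data). The tutorial under review is not stated to use this measure; the function names and the numbers are hypothetical.

```python
def relative_change(first, second):
    """Size of an effect as a fraction of the starting value."""
    return abs(second - first) / abs(first)


def lie_factor(first, second, axis_min=0.0):
    """Lie factor of a two-bar chart whose y-axis starts at axis_min.

    Bar heights on the page are proportional to (value - axis_min), so a
    truncated axis inflates the visible difference between the bars.
    """
    if first == 0 or first == axis_min:
        raise ValueError("first value must be non-zero and above the axis minimum")
    shown = relative_change(first - axis_min, second - axis_min)
    actual = relative_change(first, second)
    return shown / actual


# Hypothetical values: a 10% increase from 50 to 55.
print(lie_factor(50, 55, axis_min=0))    # honest baseline
print(lie_factor(50, 55, axis_min=45))   # truncated baseline
```

With the axis starting at zero the factor is 1; starting the axis at 45 makes the second bar look twice as tall as the first even though the underlying change is only 10%.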

The author does a great job of covering many different types of charts and graphs. There's also a good focus on getting rid of chartjunk, though the author doesn't go far enough, leaving in extraneous shadows and other elements (and I have to question the use of a library that requires so much effort to remove the default chartjunk).

Unfortunately there are a number of wider issues that obscure these valuable insights. Many of the data visualization concerns are buried beneath a mound of introductory web development information that vacillates between explaining the nature of foundational technologies like AJAX, SVG and CDNs and assuming that the reader is an avid jQuery user. The last chapter is a treatise on MVC application development. I'm not quite sure who the target audience really is.

But the worst offense committed by this tutorial is the emphasis on overly simplistic charts. From the introduction: "Effective visualizations clarify; they transform collections of abstract artifacts (otherwise known as numbers) into shapes and forms that viewers quickly grasp and understand. The best visualizations, in fact, impart this understanding subconsciously." And later (in defense of pie charts), the author advocates for a graph that epitomizes the idea of chartjunk: a 400x400 pixel circle that gives no more information than the number 22.4%.

This, I think, is where the true challenge in data visualization comes: not in producing a pretty chart to display whatever information's at hand, but to DROP the chart if it doesn't truly add value. Waste no ink producing a chart that is better left as a table of numbers (e.g. a reference), and certainly do not waste the viewer's time with a chart showing something that's more effectively communicated in prose. The answer to the question of how much of the world lives on $1.25 a day is 22.4%. A chart simply cannot illuminate a single data point.

Where data visualization gets really interesting is when you maximize the amount of information conveyed. Don't waste the reader's time producing these USA Today (or The Onion) style bar charts. A handful of numbers is best presented as numbers. Seven data points is TINY, not a moderate size appropriate for a bar chart, as the author would have you believe.

Effective data visualizations demand the viewer's careful study. If a reader can completely understand a chart subconsciously, there's probably no need for a visualization at all. Effective data visualizations are information-rich: they have a data density far exceeding that of prose. Good charts are exceedingly multivariate - small multiples are probably the best example. If your charts don't meet this standard, you're wasting your time producing them, and you're wasting your reader's time forcing them to parse a visual representation of what should be simpler prose explanations. If your headline contains as much information as your chart, drop the chart.
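As a rough illustration of "data density", here is a sketch using Tufte's definition (number of entries in the data matrix divided by the area of the graphic). The 400x400 pixel size comes from the pie chart mentioned above; the small-multiples grid and its point counts are hypothetical.

```python
def data_density(num_values, width, height):
    """Data values shown per unit of graphic area (here, per square pixel)."""
    return num_values / (width * height)


# The pie chart discussed above: one number in a 400x400 pixel graphic.
pie = data_density(1, 400, 400)

# Hypothetical small multiples in the same space: 12 panels of 30 points each.
small_multiples = data_density(12 * 30, 400, 400)

print(pie)
print(small_multiples / pie)  # how many times denser the second graphic is
```

The absolute numbers mean little on their own; the ratio between two graphics occupying the same space is what makes the comparison useful.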

A great example of this issue comes in the section on scatter plots. After charting the relationship between health care spending and life expectancy, the author glibly declares: "In this example, we can see how life expectancy relates to health care spending. In aggregate, more spending yields longer life." However, that's only the least interesting factoid (as in, plausible-sounding inaccuracy) you could glean from this chart. There are many interesting questions that could be asked, but of the three the author poses, only one is answered: who the heck is that outlier that spends 50% more on health care than the pack yet has a lower than average life expectancy (spoiler alert: it's the United States). After demonstrating how to highlight this one data point, the author doesn't bother to explain why it's worth plotting all the others, or how this graph explains the situation uniquely. So once again we have produced a chart with basically one piece of information: that the US healthcare system sucks. This somewhat obvious insight simply isn't worth the ink spilled.

This emphasis on overly simplistic visualizations is entrenched in the choices of libraries. This is not to pick on Flotr et al.; they're good for what they do, but what they don't do is far more important. Like essentially every dedicated charting library, you are restricted to just a handful of options that the developer allows you to have. You can choose from a few preselected chart types and customize them in a few preselected ways. If you have a novel dataset that requires anything more customized or unique, you are up a creek without a paddle. Such charting libraries require that you convolute your data until it fits their own assumptions. Nowhere is this better illustrated than this tutorial series, where the differences between iterations are frequently simply a reorganization of the data structure (i.e. a waste of developer time).

Far better, then, to use a generalized data library that allows you to manipulate your visual tools with endless freedom. D3 is a great example of such a library. If you're writing a data visualization tutorial in JavaScript that uses anything but D3 you have a burden of proof to demonstrate why. This is not because D3 is the end-all-be-all of charting libraries, but rather because any tutorial based on the limited selection of possibilities afforded by anything else is simply not a data visualization tutorial. It's a charting options tutorial (also valuable, but far narrower in scope).

The author finally gets around to talking about D3 near the end of the book, but misses the opportunity to demonstrate how simple it actually is to replicate the early examples with D3. It's also worth noting how understanding these composable techniques from the start gives you significantly more power. I really wish this book started with the tutorial on D3; everything prior just seems anachronistic.

For more information on information-rich data visualization, check out the work of Edward Tufte, in particular his book The Visual Display of Quantitative Information [0] (which the author cites but seems not to have read). For more information about JavaScript data visualization with D3, read everything you can find by Mike Bostock [1].

[0]: http://www.amazon.com/The-Visual-Display-Quantitative-Inform...

[1]: http://bost.ocks.org/mike/

p.s. please don't make a map assuming latitude/longitude == x/y, even a really small one. A decent geographical charting library exists and is free, so just don't hack it.

p.p.s. please don't use radar charts.

p.p.p.s. d3.extent.

p.p.p.p.s. Bower isn't a part of Yeoman.
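On the first p.s. above (latitude/longitude is not x/y), a minimal sketch of why the shortcut distorts maps. It assumes a spherical Earth with a mean radius of 6371 km, which is an approximation, and the function names are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical-Earth approximation


def km_per_degree_latitude():
    """Ground distance covered by one degree of latitude (same everywhere on a sphere)."""
    return math.radians(1.0) * EARTH_RADIUS_KM


def km_per_degree_longitude(latitude_deg):
    """Ground distance covered by one degree of longitude at a given latitude.

    Meridians converge toward the poles, so this shrinks with cos(latitude).
    """
    return km_per_degree_latitude() * math.cos(math.radians(latitude_deg))


for lat in (0, 30, 60):
    ratio = km_per_degree_longitude(lat) / km_per_degree_latitude()
    print(lat, round(ratio, 3))
```

Plotting raw degrees on a square grid treats that ratio as 1 everywhere, so at 60 degrees north everything comes out stretched to about twice its true east-west width. A proper projection from a geographic library accounts for this.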

craigching
I think this is a great comment on data visualization. And there is one part that I want to highlight as relevant to me right now:

> Far better, then, to use a generalized data library that allows you to manipulate your visual tools with endless freedom. D3 is a great example of such a library. If you're writing a data visualization tutorial in JavaScript that uses anything but D3 you have a burden of proof to demonstrate why. This is not because D3 is the end-all-be-all of charting libraries, but rather because any tutorial based on the limited selection of possibilities afforded by anything else is simply not a data visualization tutorial. It's a charting options tutorial (also valuable, but far narrower in scope).

Yes! When I saw this article was based on flotr2, I immediately checked to see if it was based on D3. It was not (correct me if I'm wrong) and I was a bit disappointed because flotr2 appears to be all about charting, but data visualization is much more than just charting.

I'm looking for a good charting package right now, but my requirements are that it's based on D3 so that I don't have to introduce two different libraries when I need some data visualization that goes beyond mere charting. So NVD3 appears to be my choice at the moment.

sathomasga
Hi couchand,

Thank you very much for sharing your thoughts on the book. It's really gratifying when someone takes the time to seriously consider an author's work and then takes the additional time to compose a thoughtful critique. It seems pretty clear that the book you wanted to read was not the book that I've written, but that's actually a pretty good thing. Your comments will definitely help me clarify the goals and approaches of the book when I flesh out the Introduction. (As most everyone probably knows, the Introduction is the last section that's written. What's there now is mostly just a placeholder; I'll write the real Introduction once the editorial and technical review are complete.) If you're interested, you can check the online version in a month or so to see how effectively the book meets its own goals.

For those folks that do want a book mostly devoted to D3.js, you probably won't be satisfied with my book. (In fact, the publisher and I had quite a bit of discussion about including any material on D3 at all. We finally concluded that any book on JavaScript data visualization couldn't ignore D3, so there is a chapter dedicated to the library.) The good news, though, is that you have lots of other options. Amazon lists at least nine books dedicated to D3. As an author myself, I don't feel comfortable making specific recommendations publicly (as that might imply a negative opinion of books not recommended), but anyone is welcome to contact me privately for my thoughts. (Contact info is in a comment below.)

Stephen

I don't think I saw The Visual Display of Quantitative Information by Edward R. Tufte (available at http://www.amazon.com/The-Visual-Display-Quantitative-Inform... ) on the list. The Boston Globe's review is 100% correct: it's a visual Strunk and White.
No, it's not common to see this kind of visualisation. Bad visualisations? Yeah, they're pretty common. The usual mistake is overuse of Pie Charts.

I'd recommend that the OP (and anyone else who has an interest in communication) read:

The Visual Display of Quantitative Information http://www.amazon.co.uk/The-Visual-Display-Quantitative-Info...

Information Dashboard Design http://www.amazon.co.uk/Information-Dashboard-Design-Effecti...

Now You See http://www.amazon.co.uk/Now-You-See-Stephen-Few/dp/097060198...

tgb
Ironically, the OP recommends that you read The Visual Display of Quantitative Information, as well. I suspect that the mediocre example (I thought it was reasonably readable and somewhat interesting) was there more as a demonstration of the fact that R makes non-scatter plots easily, too.
mrdub
It's from http://www.jasondavies.com/parallel-sets/ and is one of the example plots for http://d3js.org/
weaksauce
I must say that the icicle plot of the same data is way more readable. The crisscrossing lines are all but useless to get a feel of the data.

go to http://www.jasondavies.com/parallel-sets/ and click on icicle plot to see what is going on.

Edward Tufte was already mentioned, and his books contain some really good examples

http://www.amazon.co.uk/Visual-Display-Quantitative-Informat...

I'd say, yes numerical is often just as good, but not always. Also, in my opinion, whether information is displayed numerically or visually, the key is to be as minimalistic as possible.

For all of those folks who want to get better at visualization, the canonical text is http://amzn.com/0961392142 (Tufte)
zeratul
Very useful. Thanks.

There are so many books on data visualization it's hard to judge just from the reviews which are the canonical textbooks. So far I was just recommended this book: http://book.flowingdata.com/

I'd recommend this Edward Tufte book for visual data design, it's very good:

http://www.amazon.com/Visual-Display-Quantitative-Informatio...

Also, FlowingData is a great site for data visualization: http://flowingdata.com/

HN Books is an independent project and is not operated by Y Combinator or Amazon.com.