HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
High Performance Web Sites: Essential Knowledge for Front-End Engineers

Steve Souders · 5 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "High Performance Web Sites: Essential Knowledge for Front-End Engineers" by Steve Souders.
View on Amazon
HN Books may receive an affiliate commission when you make purchases on sites after clicking through links on this page.
Amazon Summary
Want your web site to display more quickly? This book presents 14 specific rules that will cut 25% to 50% off response time when users request a page. Author Steve Souders, in his job as Chief Performance Yahoo!, collected these best practices while optimizing some of the most-visited pages on the Web. Even sites that had already been highly optimized, such as Yahoo! Search and the Yahoo! Front Page, were able to benefit from these surprisingly simple performance guidelines.

The rules in High Performance Web Sites explain how you can optimize the performance of the Ajax, CSS, JavaScript, Flash, and images that you've already built into your site -- adjustments that are critical for any rich web application. Other sources of information pay a lot of attention to tuning web servers, databases, and hardware, but the bulk of display time is taken up on the browser side and by the communication between server and browser. High Performance Web Sites covers every aspect of that process. Each performance rule is supported by specific examples, and code snippets are available on the book's companion web site. The rules include how to:

- Make Fewer HTTP Requests
- Use a Content Delivery Network
- Add an Expires Header
- Gzip Components
- Put Stylesheets at the Top
- Put Scripts at the Bottom
- Avoid CSS Expressions
- Make JavaScript and CSS External
- Reduce DNS Lookups
- Minify JavaScript
- Avoid Redirects
- Remove Duplicate Scripts
- Configure ETags
- Make Ajax Cacheable

If you're building pages for high-traffic destinations and want to optimize the experience of users visiting your site, this book is indispensable.

"If everyone would implement just 20% of Steve's guidelines, the Web would be a dramatically better place. Between this book and Steve's YSlow extension, there's really no excuse for having a sluggish web site anymore." -Joe Hewitt, Developer of Firebug debugger and Mozilla's DOM Inspector

"Steve Souders has done a fantastic job of distilling a massive, semi-arcane art down to a set of concise, actionable, pragmatic engineering steps that will change the world of web performance." -Eric Lawrence, Developer of the Fiddler Web Debugger, Microsoft Corporation
HN Books Rankings

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
If your website is well-coded and administered, does CloudFlare offer any performance benefit? (leaving aside security for now)

If a page is static, then CloudFlare can cache it. But if you set your cache headers appropriately, and use efficient serving code like nginx, I imagine serving static content is pretty darn cheap.
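For illustration, a minimal sketch of that "cheap static serving" idea, with Python's standard-library server standing in for nginx (in production, nginx's expires/add_header directives do the same job); the one-year max-age assumes the assets are versioned:

```python
# A sketch only: stdlib HTTP server that adds a long-lived Cache-Control
# header to everything it serves from the current directory.
from http.server import HTTPServer, SimpleHTTPRequestHandler

class CachingHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        # Buffered before the blank line that ends the header block.
        # One year, never revalidate: only safe for versioned assets.
        self.send_header("Cache-Control", "public, max-age=31536000, immutable")
        super().end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8000), CachingHandler).serve_forever()
```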

If a page is dynamic, then how can CloudFlare really speed it up? You don't want them serving stale pages to users. So it has to hit your server every time, in which case the user might as well hit your server. In that case, I don't really see how CloudFlare improves things.

Am I misunderstanding how CloudFlare works? It seems like if you follow typical performance tips like [1] then most of CloudFlare's benefit is eliminated.

I guess [1] does tell you to use a CDN. You can save end user network latency for cached static pages, since they cache them in multiple geographic locations. But if you have a simple site with 1 .js and 1 .css file per page, and compress and minify everything, I wonder if it's worth it.

[1] http://www.amazon.com/dp/0596529309

adamt
> If a page is static, then CloudFlare can cache it. But if you set your cache headers appropriately, and use efficient serving code like nginx, I imagine serving static content is pretty darn cheap.

With the static content it's not the cost of serving it, it's the fact that Cloudflare is serving it from a large bunch of distributed servers that are likely to offer far lower latency to the end-user than your servers. With modern web pages often containing hundreds of objects, this can make a big difference to page load times.

If all your customers are in one geography this is less of an issue, but if you have a global audience this can make a huge difference.

chubot
OK, but this is just the benefit of any CDN, not just CloudFlare, right?

So I guess the selling point of CloudFlare is that it's like a normal CDN, plus it offers security services like DDoS protection?

With a normal CDN, you don't change your DNS to point at their servers, right? DNS points to your server, but you change your code to have <img src=""> tags and so forth pointing at their servers. To me that just seems a lot less invasive, but admittedly then you can't get the security features.

geofft
I'd say that the historic Akamai model (from the last bubble) counts as "a normal CDN", and in that case, you do change your DNS to point straight at Akamai's DNS servers. For well over a decade now, `www.microsoft.com` has been a CNAME for something run by Akamai.

My rough and uninformed impression is that for small-time users, where "small-time" includes the scale of Reddit, CloudFlare can do what Akamai does at a much more reasonable price.

But yes, that's a different sense of "CDN" from e.g. Google-hosted jQuery.

chubot
Oh really, I didn't know Akamai worked like that.

Do you know of any docs describing the algorithm Akamai or CloudFlare use to decide whether to hit your origin server? I'm a stickler for correctness but it seems to be pretty easy to get into the situation where your users aren't getting what you intended. HTTP cache headers are a mess.

To me it seems safer to code your application so the HTML points to JS/CSS/PNG with content hashes in the URL. Then you don't have any cache expiry issues -- nothing ever expires, but you control the assets exactly through your dynamic HTML.
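A minimal sketch of that content-hash scheme; the helper name and the /static prefix are made up for illustration, not anything a particular CDN prescribes:

```python
# Hypothetical helper: derive an immutable, content-addressed URL for an
# asset, so the dynamic HTML always points at the exact bytes it expects.
import hashlib
from pathlib import Path

def hashed_url(path: str, prefix: str = "/static") -> str:
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()[:12]
    p = Path(path)
    return f"{prefix}/{p.stem}.{digest}{p.suffix}"

# hashed_url("app.css") -> "/static/app.<12-hex-digest>.css"
# A redeploy changes the hash, so users pick up new assets immediately,
# while each old URL can safely be cached forever.
```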

I think it's important to have fast and reliable rollbacks. You can imagine some situation where private content or offensive text is accidentally included with some static asset... I would like some guarantee about when users stop seeing it (preferably as soon as the application is redeployed).

geofft
Akamai has an API that you can hit to request that they purge some content of yours from their cache:

https://api.ccu.akamai.com/ccu/v2/docs/

trashcan
It looks like they determine whether or not the content is static by the extension: https://support.cloudflare.com/hc/en-us/articles/200172516-W... "CloudFlare does not cache by MIME type at this time."

And most likely default to not caching if they are unable to determine.

For dynamic pages you can tell them which ones can be safely cached with PageRules (on paid plans): https://support.cloudflare.com/hc/en-us/articles/200172826-C...
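For illustration, here's the shape of that extension-based heuristic in Python; the extension list is an assumption for the example, not CloudFlare's actual list:

```python
# Decide default cacheability from the URL path's extension alone.
from pathlib import PurePosixPath

STATIC_EXTS = {".css", ".js", ".jpg", ".jpeg", ".png", ".gif", ".ico", ".svg"}

def is_cached_by_default(url_path: str) -> bool:
    return PurePosixPath(url_path).suffix.lower() in STATIC_EXTS

assert is_cached_by_default("/img/logo.png")
assert not is_cached_by_default("/account/settings")  # treated as dynamic
```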

Nyr
> CloudFlare can do what Akamai does at a much more reasonable price.

Not really. Akamai has presence inside many "too big to peer" networks, while CloudFlare doesn't (for now).

You can't just compare CloudFlare to Akamai at the moment, they are following very different strategies.

manigandham
CloudFlare seems to be rapidly expanding both datacenters and peering partnerships. Isn't the effective capability the same between both companies?

Is there something that Akamai can do that CloudFlare can't?

Nyr
There is something which Akamai does and CloudFlare doesn't want to (at the moment): pay big sums to those big ISPs which refuse to peer and want to sell very expensive direct connections.
manigandham
What does this mean for the company/application using CloudFlare or Akamai?

Both services seem to be highly available and very fast from around the world. What does Akamai do here that's better?

Nyr
Akamai has direct connections to those "difficult" ISPs, CloudFlare uses transit to reach them.

To provide an example: transit to DTAG via GTT (CloudFlare) is always saturated at peak hours. So Akamai has a big advantage (direct connection) if you want to reach Deutsche Telekom customers.

pandemicsyn
I assume it's for this reason (and others) that even Amazon.com still uses Akamai. Stuff like product images/banners still gets served from domains like g-ecx.images-amazon.com (which resolve to Akamai).
andreyf
Well, routing everything through them via DNS obviously means less code in your application, so that's nice.

Also, a CDN which routes all of your traffic via DNS can also take advantage of a private network to get the packets to your users faster, i.e. Cloudflare could potentially own a faster link between Virginia and Berlin than the public internet would take, and lower response time that way. I think that's what point two on the benefits list here is about: https://blog.cloudflare.com/cloudflare-is-now-a-google-cloud...

mixonic
A faster link, but also an already-open TCP and SSL connection. By moving the TCP and SSL handshake to an edge server, you get much lower latency when starting a request from the client. The long-haul, high-latency connection between the edge and your server is already open; no extra roundtrips required.
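A back-of-the-envelope sketch of that round-trip math; the RTT figures are assumptions, not measurements, and the handshake count is for TCP plus a full TLS 1.2 negotiation:

```python
# Compare a cold HTTPS request straight to a distant origin against one
# that handshakes with a nearby edge whose origin connection is warm.
RTT_ORIGIN = 0.150      # s, client <-> distant origin (assumed)
RTT_EDGE = 0.005        # s, client <-> nearby edge POP (assumed)
HANDSHAKE_RTTS = 1 + 2  # TCP handshake + TLS 1.2 handshake

direct = (HANDSHAKE_RTTS + 1) * RTT_ORIGIN            # handshakes + request
via_edge = HANDSHAKE_RTTS * RTT_EDGE + 1 * RTT_ORIGIN  # edge handshake + request

print(f"direct:   {direct * 1000:.0f} ms")    # 600 ms
print(f"via edge: {via_edge * 1000:.0f} ms")  # 165 ms
```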
brandon272
I was using CloudFlare on a site that has a wide overseas following, with a lot of visitors from different continents. I was getting a lot of complaints that the site was terribly slow. When I went back to just serving the site directly from its server in Atlanta, the complaints ceased.

Since then I have been hesitant to use it again.

pakled_engineer
You get a lot of "testing your browser" delays if you're from certain regions, and the annoying captcha gateway if you're behind a big mobile NAT that was blacklisted somewhere because of one spammer or malicious user, or if you're using some VPNs/Tor, especially the free-tier VPNs.
supersan
Oh, so that's what it is. I think Namecheap must be using it too, because I see this darn message every time I load their site: "We're testing if you're human or robot". And it wastes like 10 seconds of my time.

People think the "loading gears" animation on the Google blog is annoying; imagine if it showed up for 10 seconds every time on a company's homepage.

It is the only reason I've considered leaving Namecheap so many times (but I haven't, because except for this they are better at everything else).

manigandham
This is part of their security features. All of them can be disabled if you want.
bad_user
> if you set your cache headers appropriately, and use efficient serving code like nginx, I imagine serving static content is pretty darn cheap

If the website is serving content (i.e. articles, images, movies, you know, the normal use-case) then most people visiting a page will be first-time visitors on that page. The cache headers you mention are only good for returning visitors, and even so, the local cache is not reliable on mobile phones, where the cache is purged regularly to make room. Consider that there are mobile web developers who have decided not to use jQuery for this reason, even though jQuery is probably the most cached piece of JS in the world.

Also, serving content from a properly configured Nginx doesn't help with network latency. Say your server is in the US and your visitors are in Japan or China; then the added network latency can be measured in seconds. The problem gets even worse for HTTPS connections because of the handshake. And consider that Google found an extra 0.5 seconds of latency in delivering search results cost them a 20% drop in traffic, or that for Amazon 100ms of added latency costs them 1% in sales.

> If a page is dynamic, then how can CloudFlare really speed it up?

Even if the page contains dynamic content, you always have static content that you want to serve from a CDN.

You also forgot probably the biggest benefit for us: bandwidth ends up being freaking expensive, and if you get a lot of traffic, a CDN can save you a lot of money.

hyperpape
There's a nice Facebook study on how frequently people were not receiving cached images; it was around 44% of all users (25% of all hits). If you figure that Facebook is going to have some of the better numbers for that metric, it supports your conclusion: https://code.facebook.com/posts/964122680272229/web-performa....

There was a similar exercise done with hosted versions of jquery, but I can't remember who did it or find a link, I'm afraid.

LoSboccacc
we use it on our image-heavy startup. reduces the load on the server by a fair margin, and since images are hosted on S3 it also reduces the operating cost quite a bit

additionally it's geolocated, so we get that for free, which is nice.

waffle_ss
How are you using CloudFlare for S3 images? Are you using Flexible SSL?

The problem I ran into was that setting up a CNAME for an S3 bucket requires the bucket name to have periods in it, but https:// access no longer works for buckets with that naming convention[1]. So I ended up having to use CloudFront instead for my images.

[1]: https://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestri...

ehPReth
There's an alternate URL prefix you can use if your bucket name has periods: http://documents.scribd.com.s3.amazonaws.com/docs/91ltsqq5fk... becomes https://s3.amazonaws.com/documents.scribd.com/docs/91ltsqq5f...
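A small sketch of that rewrite; it assumes the us-east-1 s3.amazonaws.com endpoint and Python 3.9+ for removesuffix(), and the example object path is made up:

```python
# Turn a virtual-hosted-style S3 URL (dots in the bucket name break the
# *.s3.amazonaws.com wildcard cert) into the path-style form.
from urllib.parse import urlparse

def to_path_style(url: str) -> str:
    parsed = urlparse(url)
    bucket = parsed.netloc.removesuffix(".s3.amazonaws.com")
    return f"https://s3.amazonaws.com/{bucket}{parsed.path}"

print(to_path_style("http://documents.scribd.com.s3.amazonaws.com/docs/a.pdf"))
# -> https://s3.amazonaws.com/documents.scribd.com/docs/a.pdf
```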
waffle_ss
If you use s3.amazonaws.com then you aren't using a CNAME anymore, meaning you can't use CloudFlare to accelerate the traffic. CloudFlare needs to control the DNS.
LoSboccacc
there is a middleware included :P

we have a document store backed by S3 running on EC2 instances, the instances are behind one of Amazon's load balancers, and that's what the CloudFlare CNAME points at

currently the beta env runs Flexible SSL and production runs with Strict.

hewhowhineth
It kinda depends. You'd have to test it yourself. Here's CloudFlare's home page accessed from Europe a few moments ago, via a 3G dongle.

http://s21.postimg.org/8gn6f7i2f/cloudflare_com.png

Nothing to write home about :)

That being said, I've seen CloudFlare cut DNS lookup times from 800ms to 60ms for a tiny website.

Another thing: it depends whether you're really concerned with visitors far from your server. I had some WordPress websites hosted in LA, and with some really basic optimization, page speed was almost as good as Google's home page.

Don't drink the paint, I guess :) It may not be worth it, it may be great. Test it. Of course, CF has other benefits too, it's not just about the page speed.

Don't get me wrong. I'm not claiming anything here. It's just a quick rant and a screenshot. Don't take it too seriously.

Other than that, it is becoming somewhat concerning just how much traffic goes through CloudFlare. Nothing against you CF guys. Just good ol' paranoia :)

EDIT: For most places CloudFlare does a great, well, amazing job and keeps the page load time down to <1s, often <500ms. But again, it really depends where your visitors are. Check the History tab here: http://tools.pingdom.com/fpt/#!/blmbP5/http://cloudflare.com

reissbaker
The biggest win from CDNs like CloudFlare as compared to just using properly-configured in-house servers is global distribution. Downloading a JS file from a server a mile from your house is a lot faster (often by hundreds of milliseconds) than downloading one from hundreds or thousands of miles away, and CloudFlare probably has a server a mile from your house. And it's prohibitively expensive to build your own global network that compares to CloudFlare's scale unless you're a very large company like Apple.

As for dynamic page caching, CloudFlare offers a service called Railgun that only sends the diffs of a page when it's been changed, rather than the full page, and then re-hydrates it at the edge of their network before handing it off to end-users. Theoretically this would reduce network time by sending less traffic inside the network. I've never personally used it so I can't vouch for it, but it sounds neat.
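This isn't Railgun's actual wire protocol, but the underlying delta idea can be sketched with the standard library's difflib; the page strings here are invented for the example:

```python
# Ship a diff against a cached copy both ends share, then reconstruct
# the fresh page at the edge -- the diff is tiny when pages barely change.
import difflib

cached = ["<html><body>Welcome guest</body></html>"]   # synchronized copy
fresh = ["<html><body>Welcome chubot</body></html>"]   # origin's new page

delta = list(difflib.ndiff(cached, fresh))  # the small diff crosses the wire
rebuilt = list(difflib.restore(delta, 2))   # the edge reconstructs the page
assert rebuilt == fresh
```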

snowwrestler
If you leave aside security, of course Cloudflare is not going to look like a great benefit. Security is what Cloudflare provides as a service. The "CDN" benefits are a nice consequence of the security, but CDNs and static caching can be had for pretty cheap.

The real question is: why would you leave aside security?

beachstartup
you can cache specific URLs with dynamic content on a short TTL and it will make a huge improvement in performance.
orionhenry
1. SSL is terminated near the user - the multiple round trips to do a TLS handshake for a user in Sydney will never leave the city, instead of having to traverse the Pacific.

2. Static content is served locally from their CDN. Same thing: your JPEG served to a guy in Mombasa is coming from a few miles away, not half a world away.

3. If your clients are using old browsers without keepalive, CloudFlare will still keep connections alive from their local endpoint to your servers - making the new-connection cost only occur on the first couple of hops.

4. For dynamic content you can use a special proxy they created which keeps a synchronized cache with the far end so it can ship diffs. If you generate a page that's mostly similar to another page, it can just send "Same as cache object 124567, except line 147 says 'Welcome chubot' instead of 'Welcome orionhenry'." A significant percentage of dynamic responses can traverse the world as a single TCP packet.

5. Their devs are really ruthless about keeping the crypto certs as small as possible, with the goal of all handshakes taking a single packet per step.

How does this (http://37signals.com/svn/posts/3112-how-basecamp-next-got-to...) get to be number one, while this http://www.stevesouders.com/blog/2012/02/10/the-performance-... goes nowhere?

Are we upvoting simply by name recognition?

http://stevesouders.com/about.php

He only wrote the book on high performance web sites...

http://www.amazon.com/High-Performance-Web-Sites-Essential/d...

awj
Souders' post is 5% halfheartedly evangelizing (maybe not-so-)common knowledge and 95% reiterating his point about the split between frontend/backend time in fulfilling a request.

After reading it, I'm pretty sure it could eliminate all but the last graph and be just as valuable at 1/4 the length. That he did a rather comprehensive evaluation is commendable, but I think the post loses track of its main point in favor of sailing the sea of data that was collected.

joshuacc
It's partly name recognition and partly novelty. The 37signals approach is unusual, while most competent front-end developers are already aware of the principles in Souders' post.
The definitive treatment of every item on that list, and more, is Steve Souders' book (http://www.amazon.com/High-Performance-Web-Sites-Essential/d...). Get it immediately and do what he says. In my experience, the differences are very material.

The one thing I would add is: wherever possible, give resources immutable URLs (that is, when a resource changes, change its URL) and tell the browser to cache them not for an hour but forever. This saves endless wailing and gnashing of teeth on both sides of the browser-cache abyss (i.e. things not being cached when you want them to and -- much worse -- things being cached when you don't want them to). Seriously, this rule changes misery to joy.

p.s. While copying the above link I noticed that Souders published a sequel (Even Faster Web Sites). Who here has read it? Can you report how good it is?

dunkjmcd
Even Faster is mostly JavaScript and image optimisations. Not quite the holy grail of the first book, but a good one to have in your collection nonetheless.
Get the two Souders (creator of YSlow, now works on web performance at Google) books, which are amazing. I'm not affiliated with Amazon.com, O'Reilly or Yahoo!; just like Patrick says, these guys know their shit:

High Performance Web Sites (O'Reilly, Sept. 2007): http://www.amazon.com/High-Performance-Web-Sites-Essential/d...

Even Faster Web Sites (O'Reilly, June 2009): http://www.amazon.com/Even-Faster-Web-Sites-Performance/dp/0...

Half of King's 'Website Optimization: Speed, Search Engine & Conversion Rate Secrets' (O'Reilly, July 2008) is devoted to performance, so it's also worth a look: http://www.amazon.com/Website-Optimization-Search-Conversion...

The book (O'Reilly's High Performance Web Sites: Essential Knowledge for Front-End Engineers): http://www.amazon.com/High-Performance-Web-Sites-Essential/d...

And his upcoming book (O'Reilly's Even Faster Web Sites: Performance Best Practices for Web Developers): http://www.amazon.com/Even-Faster-Web-Sites-Performance/dp/0...

HN Books is an independent project and is not operated by Y Combinator or Amazon.com.