Hacker News Comments on
Building Software Systems At Google and Lessons Learned
Stanford · Youtube · 3 HN points · 6 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.
Probably this one: https://www.youtube.com/watch?v=1-3Ahy7Fxsc
Or this one: https://www.youtube.com/watch?v=modXC5IWTJI (terrible audio)
Both are very old, but have aged extremely well.
IMO, the secret behind Google Search is not the smartness of the algorithms, but how much of it is baked into the O(100ms) it takes Google machines to answer your query. That's why the links above are the true reason Google Search performs well.
The state of the art in Natural Language Processing and Information Retrieval is far beyond what Google/Bing/Yahoo/Yandex/Baidu employ. But it's far too expensive to serve at the QPS and latency required for a decent UX.
⬐ supernova87a
Hi, would you know if there are any good explanations of whether Google search (or the search strategy in general), say, creates a general search result for any given query, which is then tweaked with customizations specific to individuals / locations / languages? I.e. they've "saved" the basic search output in advance so that the core doesn't have to be run each time, and only adjust around the edges specific to a user?
Or is that not how it's done, and each search, for a given person, follows the same process?
I've always been curious about that.
⬐ dannysullivan ⬐ 12907835202
There's a lot of information on how Google Search Works on our site by that name here: https://www.google.com/search/howsearchworks/
While there's some very short caching of results, to my understanding, there's generally still going to be a lot of hitting the index because there's just a lot of new information coming in all the time. We can't somehow store a set of results for say "cars" and figure it's going to be the same info from one minute to the next.
And results don't really have a lot of personalization for individuals. When you see differences, it's usually due to language and location.
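To make the "very short caching of results" idea concrete, here is a minimal sketch of a cache with a short time-to-live, keyed on query plus language and location. Everything in it is an assumption for illustration (the class name `ShortTTLCache`, the key shape, the TTL value); it says nothing about how Google actually implements caching.

```python
import time


class ShortTTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        """Return a fresh cached value, or call compute(key) and cache the result."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive lookup
        value = compute(key)  # stale or missing: go back to the index
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

The key would be something like `("cars", "en", "US")`, so differences by language and location get separate entries, and a short TTL means a popular query still goes back to the index often enough to pick up new information.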
I'd be super interested in a higher quality delayed search. Just each day you could go and look at your search queries from the day before and it would list any better results it found with more time.
Then again maybe I'm underestimating how often I don't know exactly what I'm looking for and just grab the first result.
There is a lecture from Jeff Dean about it: https://www.youtube.com/watch?v=modXC5IWTJI
It's from 2010, but the fundamentals probably haven't changed.
The Jeff Dean version is public, but does not include cost in dollars
Google is famous for publishing a lot of material every year.
A couple of related publications you might be interested in:
1. The Datacenter as a computer: https://ai.google/research/pubs/pub41606
2. The SRE book: https://landing.google.com/sre/sre-book/toc/
A couple of dated presentations:
1. Jeff Dean on building software systems at Google (2010) at Stanford: https://www.youtube.com/watch?v=modXC5IWTJI
2. Notes by James Hamilton on a talk by Jeff Dean on Google Datacenters in 2008: https://perspectives.mvdirona.com/2008/06/jeff-dean-on-googl...
--
Microsoft Research also works on bleeding-edge tech and publishes its research pretty regularly. Here's a starting point to help anyone interested find relevant publications: https://www.microsoft.com/en-us/research/research-area/syste...
In addition to Linus's git talk, I really enjoyed Jeff Dean's EE380 retrospective on Building Systems at Google (http://m.youtube.com/watch?v=modXC5IWTJI). Many people have mentioned Jeff's basic premise elsewhere ("Design a system for 10x your current need, but not 100x, rewrite it before then") but this talk gave several useful examples where tipping points occurred (at least with Search).
Also recommend Jeff Dean's tech talk Building Software Systems At Google and Lessons Learned which references those latency numbers.
Slides: http://research.google.com/people/jeff/Stanford-DL-Nov-2010....
Video: http://www.youtube.com/watch?v=modXC5IWTJI
Also, a previous thread on latency numbers:
⬐ colin_scott
Yuppers, idea taken directly from Jeff's talk: http://colin-scott.github.com/blog/2012/12/24/latency-trends...
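For anyone wondering what those latency numbers are good for, here is a rough back-of-the-envelope sketch in the spirit of the talk. The constants are the commonly quoted approximations (about 10 ms per disk seek, about 30 ms to read 1 MB sequentially from disk) and should be treated as assumptions, as should the hypothetical workload of 30 thumbnails at 256 KB each.

```python
# Approximate figures, assumed for illustration; real hardware varies widely.
DISK_SEEK_S = 10e-3           # ~10 ms per disk seek
DISK_READ_BYTES_PER_S = 30e6  # ~30 MB/s sequential read (1 MB in ~30 ms)


def one_read_time(item_bytes):
    """Time for one seek plus one sequential read of item_bytes."""
    return DISK_SEEK_S + item_bytes / DISK_READ_BYTES_PER_S


def serial_read_time(n_items, item_bytes):
    """All items read one after another from a single disk."""
    return n_items * one_read_time(item_bytes)


def parallel_read_time(item_bytes):
    """Each item read from a different disk at the same time (ignores skew)."""
    return one_read_time(item_bytes)


if __name__ == "__main__":
    thumbnail_bytes = 256 * 1024  # hypothetical thumbnail size
    print(f"serial:   {serial_read_time(30, thumbnail_bytes) * 1000:.0f} ms")
    print(f"parallel: {parallel_read_time(thumbnail_bytes) * 1000:.0f} ms")
```

Under these assumptions the serial design costs roughly half a second and the parallel one a couple of tens of milliseconds, which is the kind of gap that decides a design before any code gets written.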