HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Jim Keller: Moore's Law, Microprocessors, and First Principles | Lex Fridman Podcast #70

Lex Fridman · Youtube · 80 HN points · 53 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Lex Fridman's video "Jim Keller: Moore's Law, Microprocessors, and First Principles | Lex Fridman Podcast #70".
Youtube Summary
Jim Keller is a legendary microprocessor engineer, having worked at AMD, Apple, Tesla, and now Intel. He's known for his work on the AMD K7, K8, K12 and Zen microarchitectures and the Apple A4 and A5 processors, and as co-author of the specifications for the x86-64 instruction set and HyperTransport interconnect.

This episode is presented by Cash App. Download it & use code "LexPodcast":
Cash App (App Store): https://apple.co/2sPrUHe
Cash App (Google Play): https://bit.ly/2MlvP5w

PODCAST INFO:
Podcast website:
https://lexfridman.com/podcast
Apple Podcasts:
https://apple.co/2lwqZIr
Spotify:
https://spoti.fi/2nEwCF8
RSS:
https://lexfridman.com/feed/podcast/
Full episodes playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41

OUTLINE:
0:00 - Introduction
2:12 - Difference between a computer and a human brain
3:43 - Computer abstraction layers and parallelism
17:53 - If you run a program multiple times, do you always get the same answer?
20:43 - Building computers and teams of people
22:41 - Start from scratch every 5 years
30:05 - Moore's law is not dead
55:47 - Is superintelligence the next layer of abstraction?
1:00:02 - Is the universe a computer?
1:03:00 - Ray Kurzweil and exponential improvement in technology
1:04:33 - Elon Musk and Tesla Autopilot
1:20:51 - Lessons from working with Elon Musk
1:28:33 - Existential threats from AI
1:32:38 - Happiness and the meaning of life

CONNECT:
- Subscribe to this YouTube channel
- Twitter: https://twitter.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/LexFridmanPage
- Instagram: https://www.instagram.com/lexfridman
- Medium: https://medium.com/@lexfridman
- Support on Patreon: https://www.patreon.com/lexfridman

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Jim Keller talks about his impression of Elon after working with him - skip to 1:20:50 and watch the next 4 minutes of this interview: https://m.youtube.com/watch?v=Nb2tebYAaOA (the whole interview is full of effing amazing insight, IMHO).

Jim also talks about Elon’s engineering skills at the 50 minute mark for about 6 minutes in this interview: https://m.youtube.com/watch?v=1TmuJSbms9c

Elon seems like a complete tool to me. But here is Jim, a proven smart engineer-type, giving a strongly positive opinion on Elon’s engineering ability. Many effective people have personality traits I abhor.

Unless you have worked with Elon yourself, and you are a good engineer yourself, you are just making shit up. I think it is normal to hate, but at least try to be honest.

ModernMech
I don't know, it's hard to trust anyone who has worked for a narcissist like Musk to tell the truth about said narcissist in public media. Powerful narcissists love to surround themselves with people who are willing to use their credentials to launder the reputation of the narcissist. My favorite example comes from Dr. Deborah Birx regarding fmr. President Trump during the pandemic:

  "He's been so attentive to the scientific literature and the details and the data. I think his ability to analyze and integrate data that comes out of his long history in business has really been a real benefit during these discussions about medical issues.”
This is pretty much what Jim Keller says about Musk in the first video. You'll notice that when he talks about what he learned from Musk he's very vague, and then he pivots directly to a story about how his friend invented a more efficient electric motor that was better than anything they had. That's a story about engineering prowess, but that story is pointedly not about Musk. Why does he need to pivot to another engineer when talking about engineering prowess? Because he doesn't have anything to say about Musk's engineering prowess, and can only speak to his technical philosophies. Of course those may be sound, and perhaps they help the engineering team, but having ideas about local maxima doesn't make one an engineer.

The second link isn't any better. Jim says Musk is the "real deal" and a great engineer, but instead of giving examples of that, he just talks about how Musk prefers to receive presentations with solutions first, and got upset when they weren't frontloaded. Which is a fine point of view I guess, and maybe is related to managing an engineering process, but all the technical work was done by others.

And finally, as a robotics engineer myself, I have a very hard time listening to anyone from the Tesla autonomous group unless they want to explain why they are testing Tesla's beta quality AP (that is in fact lethal and has caused deaths) on the general public without their awareness or consent. Or explain why after the first decapitation caused by Musk's insistence on not using LIDAR in their cars, this group failed to step up and course correct. Instead they doubled down, and their decisions resulted predictably in nothing getting fixed, leading to a second decapitation. Who exactly made the call on those decisions? That's really all I need to hear from them.

https://twitter.com/ID_AA_Carmack/status/1038832124747571200...

https://twitter.com/lrocket/status/1099411086711746560?s=20

https://www.quora.com/Elon-Musk/How-did-Elon-Musk-learn-enou...

Jim Keller: https://www.youtube.com/watch?v=Nb2tebYAaOA&t=4851s

https://www.youtube.com/watch?v=Nb2tebYAaOA&t=3854s

xedeon
There are many other examples that can be found by just doing a quick search.

But it seems that some individuals even refuse to do so. They would rather stick to their strong (sometimes objectively flawed) opinions. Which is totally fine; everyone is entitled to theirs.

It's just embarrassingly bad when they try to regurgitate it to others and continue to double down. Even when presented with irrefutable data or new information.

bmitc
That’s fine that you think that, but do you have anything that isn’t the same third-party anecdotes that get repeated ad nauseam? At least one of those is by an extremely unreliable narrator and Elon sycophant, and one of the others is by someone who thinks he knows everything as well.
xedeon
> but do you have anything that isn’t the same third-party anecdotes that get repeated ad nauseam?

First, can you give a "concrete example" on what you consider as a "concrete example"?

Because so far, I've only seen name calling and flippant remarks from you. I'm honestly trying to understand where you're coming from.

That way, we can start from a place of understanding, instead of going in circles and you moving the goalposts.

> I mostly post about free markets, and you had a general complaint about me, which is what I responded to.

As I said, your comment did not come across as a response.

I certainly haven’t noticed that you mostly post about free markets. A quick sample of your last 20 comments shows 1 low-value comment about the free market.

~Half of your last ten comments are one-liners. Also plenty of opinion, some of which is stated as “facts” by you and could really do with some reference to supporting information. Your last ten comments are: 2x one-liner “facts” on inflation, a one-liner joke, 2 opinions on James Webb, a one-liner on ebooks, a two-liner against magazine apps, a comment against government subsidies (with low-value political overtones IMHO), an opinion on Rocky & Android TV, the one-liner “The telegraph network was the true origin of the internet”, and the one-liner “I'm sad that free markets are viewed as an ‘extreme’ position” (you derailed the article topic here).

> I'm curious why you believe free markets are absolute drivel.

rjbwork did NOT say that. You said that. I think that captures a perfect example of a one-liner comment of negative value to the HN community.

“absolute drivel” is inflammatory, but it is only their opinion on the quality of your comments, and I suspect the intention wasn’t to be a personal attack.

> I don't berate others, call them names, call them stupid.

Why introduce that? An implication that we do? I don’t think either of us are suggesting you do those things. I don’t think either of us have done that. We could both choose to be politer, but the risk is a tone of passive-aggressive condescension. Personally I think we are positively engaging with you because we have enough respect for you to do so. My time, your time, and the time of others is extremely valuable (and difficult to own).

> I also like the sport of debate

Let’s imagine there are two forms of debate:

1) the political/lawyer form where the game is to win, any tactic that works is valid, facts are often irrelevant, and competitive behaviours are everything.

2) the scientific/engineering form where the game is a search for answers, discovering one is wrong is fantastic (learning), and cooperation is critical.

I think you say you do (2). However you come across to me as doing (1). I have given you plenty of reasons in this thread backing up why your comments come across as disingenuous.

Sorry that this is a meta-discussion. I really do want to encourage conversation on HN to be curious and positive, and not snarky. I am not a mod (ugggh). I sincerely try to write high quality comments and improve my commenting, not that I am necessarily succeeding ;p. My original comment got 10 upvotes (in a slow thread), so I am not alone with my opinion about your commenting style.

Your comment “I'm completely consistent in what I see as true” is a possible signal that you are dogmatically sticking to your beliefs, and not allowing your beliefs to be changed by learning from others. To quote Jim Keller talking about himself: "Imagine 99% of your thought process is protecting your self-conception, and 98% of that is wrong.” — context @1:23:00 of https://www.youtube.com/watch?v=Nb2tebYAaOA

That is the end of this thread for me. I hope you have gained something from our comments.

Jim Keller talks about his impression of working with Elon - skip to 1:20:50 of this (effing amazing) interview: https://m.youtube.com/watch?v=Nb2tebYAaOA

Edit: Also Jim talks about Elon’s engineering skills at 50 minutes into this interview: https://m.youtube.com/watch?v=1TmuJSbms9c

Disclaimer: I’m not an Elon fan at all.

RaiausderDose
thx, this interview seems great. Jim is a cool dude.
If you are bored, then it is you that is not stimulating yourself, not your job.

Any job can be fun and stimulating: ask anyone intelligent working in a monotonous job, e.g. talk to smart minimum-wage friends. The exception would be if your workmates are hell; that is not fun.

Make up some human challenges for yourself, while perhaps avoiding purely technical challenges (my assumption is that you are an engineer type). Try to make colleague X laugh every day, draw a cartoon for colleague Y, whatever, et cetera.

Richard Feynman played the bongos and picked locks.

Jim Keller is off-the-charts smart, yet he talks about his joy in digging ditches, in an interview of Jim by Lex Fridman: jump to 1h16m40s in https://www.youtube.com/watch?v=Nb2tebYAaOA

Joel Spolsky (old skool!) writes about "My first real job was in a big, industrial bakery" - about 5 paragraphs from top in https://www.joelonsoftware.com/2000/04/10/controlling-your-e...

If you are thinking you want to do a startup, the best place to find co-founder(s) is in a large company. Search for colleagues you like working with and that have integrity, and especially pay attention to those with skills you are weak at (e.g. marketing if you are a tech guy). Eventually an opportunity will present itself with a colleague.

Ask others you trust what your weaknesses are, and challenge yourself to improve on those areas.

Finally, be careful of the siren call of money. You don't want to burn through your savings (resetting to zero at 30 was rather unpleasant for me). But also don't waste your precious time only on chasing money (I have also tried that, and while it has given me a certain amount of financial freedom, the journey was mostly unsatisfying).

Disclaimer: I am middle aged, and I myself have done some of the above, and I regret not doing other of the above!

If it was a Podcast [1] then it was Jim Keller. And if it was a couple of years ago, then the CEO of Qualcomm would have been Steve Mollenkopf, not Cristiano Amon. And Steve Mollenkopf isn't an engineer, so he is highly unlikely to ever say anything like that.

[1] https://www.youtube.com/watch?v=Nb2tebYAaOA

The article is dense, needlessly convoluted, and impenetrable (it requires specific arts to read it), which makes it useless.

This whole thread is brilliant - I really admire the one or two people that tried to summarise/translate sections. That said, I find something weird about people that write about imposter syndrome.

Something similar to imposter syndrome is normal for anyone who is highly skilled:

* To be highly skilled you are continuously improving your skills, by fixing the flaws in your work and fixing your own flaws, from the large to the small.

* To successfully fix flaws, you need to be able to recognise the flaws, in finer and finer detail, as you become more and more skilled. The irony is that while external parties can genuinely admire your work, you yourself can only see more and more that is wrong with your work.

* If you can’t see flaws, you don’t fix the flaws, and you don’t progress.

There is a gorgeous section of one interview with Jim Keller (a legendary/genius chip designer) where he talks about how he knew how deeply flawed his work was. He is really inspiring because he is so insightful about his thought processes, and there is no entangled bullshit like the article we are commenting on. I also love this quote of his: "Imagine 99% of your thought process is protecting your self-conception, and 98% of that is wrong." The quote is at @1:23:00 of https://www.youtube.com/watch?v=Nb2tebYAaOA

> By 2025, either these models will have hit a wall on diminishing returns and it will take a complete pivot to some other approach to continue to see notable gains, or the products will continue to have improved at compounding rates and access will have become business critical in any industry with a modicum of competition.

Is there a single example in AI, or even technology as a whole, where simply continuing to apply one technique has led to compounding growth? Even Moore's Law, according to none other than Jim Keller[1], is more a consequence of thousands of individual innovations that are each quite distinct from others but that build on each other to create this compounding growth we see. There is no similar curve for AI.

In this case, GPT-3 (released in 2020) uses the same architecture as GPT-2 (released in 2019), expanded to have ~100x more parameters. It's not hard to see that compounding this process will rapidly hit diminishing returns in terms of time, power consumption, cost of hardware, etc. Honestly, if Google, Amazon and Microsoft didn't see increased computational cost as a financial benefit for their cloud services, people might be willing to admit that GPT-3 is a diminishing return itself: for 100x parameters, is GPT-3 over 100x better than GPT-2?

It seems that the big quantum leaps in AI come from new architectures applied in just the right way. CNNs and now transformers (based on multi-head attention) are the ways we've found to scale this thing, but those seem to come around every 25 years or so. Even the classes of problems they solve seem to change discretely and then approach some asymptote.

Copilot will probably improve, but I doubt we will see much compounding. My best guess is that Copilot will steadily improve "arithmetically" as users react to its suggestions, or even that it will just change sporadically in response to this.

[1] https://youtu.be/Nb2tebYAaOA?t=1975

kromem
My whole point is that the way in which AI progress can intersect itself can lead to compounding effects in ways technology has not.

One way to improve upon GPT-3 is almost certainly going to be adding discriminators into the mix:

"GeDi: A Powerful New Method for Controlling Language Models"

https://blog.salesforceairesearch.com/gedi/

And it may be that rather than it being a CNN discriminator such as in that case, it will be a second transformer:

"[2102.07074] TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up"

https://arxiv.org/abs/2102.07074

My point is we may not need to reinvent some new model separate from transformers/GANs and the work so far, as long as novel ways of combining trained models produce results that can in turn feed into other models.

NVIDIA in particular has had some interesting focus on using AI to generate training data for AI.

"NVIDIA Omniverse Replicator Generates Synthetic Training Data for Robots | NVIDIA Technical Blog"

https://developer.nvidia.com/blog/generating-synthetic-datas...

I think it's very unlikely we need to wait 25 years for a new paradigm.

In fact, I'm willing to wager a bet that whatever new paradigm will arrive will itself be the output of AI simulating different models.

kmod
There has been some effort to quantify "AI scaling laws": i.e. how much performance increases as we scale up the resources involved. See section 1.2 of https://arxiv.org/pdf/2001.08361.pdf

My main takeaway from that paper is that a 2x increase in training cost improves performance by 5% (100x by 42%). I only skimmed the paper though.
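A rough back-of-the-envelope sketch of what that compounding implies (this simply compounds the ~5%-per-doubling takeaway above; it is not the paper's exact power-law fit):

  import math

  # Assumption: performance improves ~5% for every 2x increase in training compute
  # (the rough takeaway quoted above, not the paper's fitted exponent).
  GAIN_PER_DOUBLING = 1.05

  def compounded_gain(compute_multiplier: float) -> float:
      """Compounded performance gain for a given scale-up in training compute."""
      doublings = math.log2(compute_multiplier)
      return GAIN_PER_DOUBLING ** doublings

  print(f"100x compute:    ~{(compounded_gain(100) - 1) * 100:.0f}% gain")     # ~38%, same ballpark as the figure above
  print(f"10,000x compute: ~{(compounded_gain(10_000) - 1) * 100:.0f}% gain")  # ~91%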

To me this says that model scaling will not get us very much farther: we can probably do one more 100x but not two.

I talked to someone working on model scaling and they see the same numbers and draw a very different conclusion: my interpretation of their argument is that they view scaling money as easy versus finding new fundamental technical advances.

baryphonic
I hadn't seen that paper before, but it is excellent. Thank you for sharing!

> To me this says that model scaling will not get us very much farther: we can probably do one more 100x but not two.

I totally agree with this assessment, and note the absence of a GPT-4 release.

ShamelessC
> my interpretation of their argument is that they view scaling money as easy versus finding new fundamental technical advances.

This paper only discussed transformers. Given the current pace of research, it's not a given that transformers won't be replaced by something else that has better scaling laws.

And indeed, this paper tells you the amount of compute and data you need for a transformer. But Moore's law isn't doing great lately. For the purposes of research, you may be able to train trillion-parameter models - but you will likely not run such a model on your phone.

In order to justify the large cost of both training and running predictions (where the larger models don't even fit in a single GPU), models will need to become more parameter-efficient than the vanilla transformer. Otherwise it's just too expensive.

I'm generally against rewrites as they're often shortcuts for new developers that instantly reject a code base as 'too complicated' without understanding it.

However, I have had success in rewriting my own code base a number of times. This is a unique situation of a one-man-band that understands the business case, handles support, and every bit that went into every decision of the code base.

With this kind of clarity, when you've spent years with the problem domain, you can rewrite a project to get at the real business case: with years of experience, you now see what your customers actually wanted you to solve in the first place, or maybe what they evolved to want in the end.

Jim Keller advocates rewrites in a similar way as the only way to progress in chip design.

I think we could rewrite very basic things in exciting ways with this sort of attitude. For instance, we know a lot more about what we need in a desktop OS, an email client, a search engine, etc. Basic things that have gotten a lot of cruft over the years as they evolved. Taking a fresh look could be rewarding.

I guess one could reframe this as 'first principles thinking', but with the caveat that the problem needs to be truly understood.

Here's Jim Keller talking about it: https://www.youtube.com/watch?v=Nb2tebYAaOA&t=1361s

I love George, but he's a bit of a reactionary to a fault and this anecdote is a perfect example. A person with deep knowledge and thoughtfulness will make almost the exact same point with much more nuance, aka Jim Keller: https://www.youtube.com/watch?v=Nb2tebYAaOA&t=1363s

The choice quote in contrast to Hotz is "executing recipes is unbelievably efficient -- if it's what you want to do"

hnarn
There's definitely an assumption by Hotz that programming and solving "real" problems is what everyone should aspire to, and that anything else is just meaningless. Like anything in life, what's meaningful is of course completely subjective, all the way from some people actually finding it fulfilling to others just not being interested in putting that much effort into their career and preferring to do other things with their time.
prescriptivist
The irony of this point is if you ever watch a livestream from Hotz, he is, at an amazing level, literally taping together frameworks and systems to graduate to a point where he can begin to express solutions to a real problem. It's one of his great strengths -- nothing cannot be accomplished through hours dedicated to a problem with the extant tools we have. If he wants to impugn the flawed systems we have at our disposal it's just because nothing exists that is in congruity with what is happening in his head.

He is, however he may dislike it, good at taping together frameworks and stands as a success case for the systems he might look down on.

Sep 25, 2021 · silisili on AMD’s Lisa Su
Wow, thanks for this. I'm a huge Keller fan and had never seen this interview.

The guy is so..normal/humble. Looks like someone who you'd get drunk with at some barbecue. Then he talks and just blows you away with how much he knows.

If you haven't already, highly suggest watching his Lex Fridman interview. I learned a lot from it, both about chips and Keller.

https://youtu.be/Nb2tebYAaOA

tester756
this one is cool too https://www.youtube.com/watch?v=oIG9ztQw2Gc
Jim Keller (famous cpu designer; Lex Fridman interview)[1]: "Really? To get out of all your assumptions, you think that's not going to be unbelievably painful?" "Imagine 99% of your thought process is protecting your self conception, and 98% of that's wrong". "For a long time I've suspected you could get better [...] think more clearly, take things apart [...] there are lots of examples of that, people who do that". "I would say my brain has this idea that you can question first [sic] assumptions, and but I can go days at a time and forget that, and you have to kind of like circle back to that observation [...] it's hard to keep it front and center [...]".

[1] https://www.youtube.com/watch?v=Nb2tebYAaOA&t=4962s

A lot of software development is just "executing recipes" and there's nothing wrong with that: https://youtu.be/Nb2tebYAaOA?t=1363
Jul 30, 2021 · anonymouse008 on B-tree Path Hints
I'm not qualified to truly understand this, but when Lex first interviewed Jim Keller, Jim basically said regarding processor design -- 'yeah, if you just guess the same result as last time, you get a 50% speed increase.'

First Interview (where that poorly paraphrased quote resides): https://www.youtube.com/watch?v=Nb2tebYAaOA&t=13s

Second Interview: https://www.youtube.com/watch?v=G4hL5Om4IJ4

sitkack
The Weather Prediction Algorithm: the weather will be the same as yesterday. It's only wrong on transitions, so it's very useful when you have runs of the same state. After that you implement a Markov chain [1]

https://en.wikipedia.org/wiki/Markov_chain
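A tiny illustrative sketch of both ideas (the "guess the same as last time" predictor above and the Markov-chain upgrade; the names and toy data here are made up for illustration):

  import random
  from collections import defaultdict

  def persistence_predict(last_state):
      """'Weather prediction': guess that the next state equals the last one."""
      return last_state

  class MarkovPredictor:
      """First-order Markov chain: predict the most frequent successor seen so far."""
      def __init__(self):
          self.counts = defaultdict(lambda: defaultdict(int))

      def observe(self, prev_state, next_state):
          self.counts[prev_state][next_state] += 1

      def predict(self, state):
          successors = self.counts[state]
          return max(successors, key=successors.get) if successors else state

  # Toy sequence with long runs of the same state (like branch outcomes or weather).
  random.seed(0)
  states, s = [], "sunny"
  for _ in range(1000):
      if random.random() > 0.9:  # switch state ~10% of the time
          s = "rainy" if s == "sunny" else "sunny"
      states.append(s)

  markov = MarkovPredictor()
  persist_hits = markov_hits = 0
  for prev, nxt in zip(states, states[1:]):
      persist_hits += (persistence_predict(prev) == nxt)
      markov_hits += (markov.predict(prev) == nxt)
      markov.observe(prev, nxt)

  print(f"persistence accuracy: {persist_hits / (len(states) - 1):.2f}")  # ~0.90
  print(f"markov accuracy:      {markov_hits / (len(states) - 1):.2f}")   # ~0.90 here; it pays off when a state's most likely successor isn't itself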

Jim Keller believes that at least 10-20 years of shrinking is possible [1].

[1] https://www.youtube.com/watch?v=Nb2tebYAaOA&t=1800

mirker
On the same podcast you can find David Patterson (known for writing some widely used computer architecture books), who disputes this claim.

https://www.youtube.com/watch?v=naed4C4hfAg

At 1:20:00

someperson
David Patterson is not disputing that there's decades left of transistor shrinking, he's just saying that the statement of "transistor count doubling every 2 years" doesn't hold up empirically.

David Patterson is saying he considers Moore's Law dead because the current state of, say, "transistor count doubling every three years" doesn't match Moore's Law's exact statement.

In other words, he is simply being very pedantic about his definition. I can see where he's coming from with that argument.

mirker
I don’t think it’s irrelevant to look at the changing timescale. If the law has broken down to 3 years, there isn’t any reason it won’t be 4, 5, or some other N years in the future.
vlovich123
Every 2 years
Robotbeat
Right. But it is no longer 2 years so it’s not Moore’s Law any more.
zsmi
It's more than that, though; it's important to remember why Moore made his law in the first place.

The rough organizational structure of a VLSI team that makes CPUs is the following pipeline:

architecture team -> team that designs the circuits which implement the architecture -> team that manufactures the circuits

The law was a message to the architecture team that by the time your architecture gets to manufacture, you should expect there to be ~2x the number of transistors you have today available, and that should influence your decisions when making trade-offs.

And that held for a long time. But, if you're in a CPU architecture team today, and you operate that way, you will likely be disappointed when it comes to manufacture. Therefore one should consider Moore's law dead when architecting CPUs.
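As a toy illustration of that planning rule (all numbers hypothetical): the transistor budget an architect should plan for is today's count scaled by the doubling cadence they believe in.

  def projected_budget(todays_transistors: float,
                       years_to_manufacture: float,
                       doubling_period_years: float) -> float:
      """Transistor budget to plan for, given a believed doubling cadence."""
      return todays_transistors * 2 ** (years_to_manufacture / doubling_period_years)

  # Hypothetical 10-billion-transistor design, 4 years from architecture to manufacture:
  print(projected_budget(10e9, 4, 2))  # ~40e9 if you believe in a 2-year cadence
  print(projected_budget(10e9, 4, 4))  # ~20e9 if the cadence has slipped to 4 years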

stingraycharles
And for those who don’t know, Jim Keller is a legend.

https://en.m.wikipedia.org/wiki/Jim_Keller_(engineer)

prox
That was a nice watch! Thanks!
agumonkey
For the last 20 years people had serious doubts about breaching 7nm (whatever the figure means today), but even if Keller is a semigod (pun half intended)... I'm starting to be seriously dubious about 20 years of continued progress, unless he means a slow descent to 1-2nm, or he's thinking sub-atomic electronics / neutronics / spintronics (in which case good on him).
dkersten
Since the "nm" numbers are just marketing anyway, I think they don't mean much in regards to how small we can go. We can go small until the actual smallest feature size hits physical limitations, which is so decoupled from the nm number that we can't possibly tell how close "7nm" is (well, I mean, we can, there's a cool youtube video showing the transistors and measuring feature size with a scanning electron microscope, but I mean we can't tell just from the naming/marketing).
rorykoehler
Check out the lex fridman Jim Keller podcast on YouTube
Nokinside
Jim Keller is a legend in microarchitecture design, not in process technology. All his arguments seem to be just extrapolating from the past.

Process engineers & materials scientists seem more cautious. I'm sure shrinking goes on, but gains are smaller from each generation.

TSMC's 3nm process is something like 250 MTr/mm², with a single-digit performance increase and a 15-30% power efficiency increase compared to the older process.

prox
The video that was posted goes into that (30min mark) and seems to reflect what you are saying.
agumonkey
he might know something about the materials science behind things, but that said I'd like to hear from actual semiconductor/physics researchers on the matter
barbacoa
If we ever figure out a way to make carbon nanotube transistors in volume, expect another 50 years of Moore's law.
tyingq
It does, though, reduce heat, right? Which ultimately means more cores per socket. Which hits the thing that actually matters... price/performance.
Nokinside
Yes. But that's a huge decline compared to even recent past.

Performance increases from generation to generation used to be much faster. TSMC's N16 to N7 was still doubling or almost doubling performance and price/performance over the long term. N5 to N3 is just barely single digits.

Every fab generation is more expensive than in the past. Soon every GIGAFAB will cost $30 billion, while technology risk increases.

Robotbeat
That’s true, but because Moore’s Law has slowed, you’ll be able to amortize that $30 billion over a longer time.
analognoise
Yeah and after you have a working $30B fab, how many people are going to follow you to build one?

The first one built will get cheaper to run every year - it will pay for itself by the time a second company even tries to compete. The first person to the "final" node will have a natural, insurmountable monopoly.

You could extract rent basically forever after that point.

fshbbdssbbgdd
I thought the drivers of cost are lots of design work, patents, trade secrets etc. involved with each process. If there’s a “final” node, those costs should decrease over time and eventually become more of a commodity.
labawi
That's only true if the supply satisfies demand.
atq2119
I don't think we'll see a final node in our lifetimes. Improvements are slowing down and will become a trickle, but that doesn't mean research stops entirely.

Consider other mature technology, like the internal combustion engine. ICEs have been improved continuously, though the changes have become marginal as the technology matured. However, if research and improvements on ICEs end entirely, it's not because the technology has been fully explored but because they're obsoleted by electric cars.

ac29
> because Moore’s Law has slowed

Not sure that is really true based on the data. Remember, Moore's law says the number of transistors in an IC doubles every two years, which doesn't necessarily mean a doubling of performance. For a while in the 90's, performance was also doubling every two years, but that was largely due to frequency scaling.

https://upload.wikimedia.org/wikipedia/commons/0/00/Moore%27...

Robotbeat
To be precise, Moore’s Law says the number of transistors per unit cost doubles (every two years). https://newsroom.intel.com/wp-content/uploads/sites/11/2018/...

A lot of the new processes have not had the same cost reductions. Also, some increase in transistor count is due to physically larger chips. Also, you have “Epyc Rome” on that graph, which actually isn’t a single chip but uses chiplets.

May 27, 2021 · mncharity on Eric Carle has died
Jim Keller (cpu designer): "I joke, like, I read books. And people think, 'Oh, you read books'. Well, no, I've read a couple of books a week, for [50] years."[1] Thought you might like.

[1] https://www.youtube.com/watch?v=Nb2tebYAaOA&t=5044s

Jim Keller talking about reading two books a week for 50 years

https://youtu.be/Nb2tebYAaOA?t=5048

> I expect within the next 6-15 years that we'll hit this limit

You are talking about fundamental physical limits? Um... No, [1]

"At about 26-28.5 minutes of the Jim Keller:

Jim notes that transistors could reach 10 atoms by 10 atoms by 10 atoms while avoiding quantum effects. People are also working on harnessing quantum effects."

We are so far away from fundamental limits that even if we assume we could somehow double transistor density every two years, which we don't anymore, there are still at least another 15-20 years before we are close to that limit. And that is not accounting for any 3D transistors.

Realistically we already have a roadmap up to 2030 from TSMC. The only limit we will hit is that the node is too expensive and the market can no longer afford the premium, which you could expect to happen within the next 10-15 years.

[1] https://youtu.be/Nb2tebYAaOA?t=1677

heimatau
> Jim Keller

I watched a bunch of Jim Keller's interviews and I just fundamentally disagree with ~'we'll innovate our way out of it'.

ksec
TSMC has a roadmap for next 10 years. Which is still far from reaching any technical roadblock.
Tesla is 1-2 years off afaict.

Also an interesting perspective from Jim Keller, that autonomy is easier than you might believe:

https://youtu.be/Nb2tebYAaOA

I think it was in this link but not 100% sure... I watched a bunch of his interviews.

Generally speaking, I think GPT-3 shows the power of raw compute in ML: https://www.gwern.net/newsletter/2020/05

Jan 19, 2021 · kasperni on Intel Problems
Jim Keller believes that at least 10-20 years of shrinking is possible [1].

[1] https://www.youtube.com/watch?v=Nb2tebYAaOA&t=1800

You can get a sense of how it would be to work with him from this interview with Lex Fridman[1]. I'm also curious how it would be to work with him since it sounds like he is full of himself[3]. But I believe he has the right to be, given all that he has achieved[3]. The guy has also read a couple of books a week[4] for the last few decades.

[1] https://www.youtube.com/watch?v=Nb2tebYAaOA

[2] https://www.youtube.com/watch?v=Nb2tebYAaOA&t=3973s

[3] https://en.wikipedia.org/wiki/Jim_Keller_(engineer)

[4] https://www.youtube.com/watch?v=Nb2tebYAaOA&t=5040s

olau
I watched that interview half a year ago, and came away with the impression of a curious humble guy who was interested in digging deep below the surface and working on big advances.

I think you meant to refer to [2], and that part of the interview is Lex Fridman interrupting him when he was trying to make a deep point about how to think about things.

Jim is a tour de force: https://www.youtube.com/watch?v=Nb2tebYAaOA
mhh__
I wish the interviewer was more technical. He doesn't do interviews very often.

Lex has a PhD in machine learning but doesn't seem to be familiar with branch prediction, apparently.

wenc
I don’t see a problem with that. Lex’s podcast is aimed at a general technical audience of different backgrounds, so he often asks questions on behalf of listeners who are technical but might not be experts in the field. To be fair, machine learning and CPU design are vastly different fields with little overlap.

I mean, I have a PhD and had no idea what branch prediction was until I listened to that podcast.

specialist
If I was a curious noob, how would I tackle this?

My provisional answer is that I'd host a conversation between experts.

For example, I'm now very curious about Apple Silicon M1 with respect to Java's Memory Model and Project Loom (structured concurrency). But I don't know nearly enough to even ask smart questions, much less understand the answers.

So my dream future perfect interview would have Ron Pressler, Doug Lea, and one or two people really smart about M1 (the only name I know is Dan Luu) sit around and chat it up.

I'd ask them open ended questions, like "What's new and different?" "What happens next?" "What are you excited about?"

The conversations would likely happen over multiple sessions and different mediums. Because the experts would share and ask each other stuff which would prompt followups.

As podcast host, I'd try to be a catalyst and try to remove myself from the convo as much as possible. I can't think of any examples or role models. While I'm a huge fan of Ezra Klein and Adam Gordon Bell (Corecursive), I'm not confident I could lean in like they do.

One tactic both Lex and AGB do really well is prompt their guests to explicitly define jargon. I suspect that some of the perceptions of Lex's ignorance are him trying to make topics more accessible. E.g. working close to the metal with AI, I'm quite confident Lex knows about branch prediction.

agbell
I think you are right about Lex and also thanks for the compliment!

Even if you know about branch prediction, asking the guest to explain it, maybe even pretending not to know about it, is a great way to have concepts introduced and make things more approachable.

Lex wouldn't be as popular as he is if he didn't have a good sense for the level of knowledge his ideal listener has about the subject.

newbie578
The part at 01:24:00 shocked me... A couple of books per week for 50 years... Damn man, that is a whole other scale.

I also loved how he explained why books are good: someone takes 20 years of their experience and writes it down in 200 pages...

texasbigdata
This was one of the most impressive things I've ever seen. The sheer mind puzzle exploration of "well..if we had a CPU the size of the sun, here's why it still wouldn't work". Guy seems unbelievably brilliant.
Merman_Mike
His comment about technology being a long, unbroken chain of abstraction layers changed the way I look at a lot of things in life.

Absolutely fascinating interview.

redisman
He also had the best answer to Lex's meaning of life question (last few minutes of the interview). Really made me stop and re-listen. It's very rare for someone on a podcast to think about every word they say.
aero-glide2
I'm not really a fan of Lex Fridman's style but this was a great watch
PartiallyTyped
Thank you. This is definitely one of the best interviews I have read. The precision of the statements and the usage of analogies to explain the topics is astonishing.
umvi
Is there a transcript of it?
PartiallyTyped
I very much doubt it. You could always get the podcast and listen to it while you commute.
PartiallyTyped
I meant listened to. Sorry.
ZephyrBlu
Had no idea this man was so prolific, and Lex had interviewed him. I'm sure it'll be a very interesting interview.

Thanks for the link!

TheAlchemist
Very very impressive interview - thanks !
agumonkey
and his career a history book
arkj
This requires a post of its own. This is a must-watch for anyone interested in CPU architecture. The clarity with which he talks about some of the complex problems in CPU design is brilliant. The interviewer does a decent job of making it more palatable for a larger audience.
Loved the episodes with Jim Keller [0] and Brian Kernighan [1] as well. Lex really does a great show and has amazing guests.

[0]: https://www.youtube.com/watch?v=Nb2tebYAaOA

[1]: https://www.youtube.com/watch?v=O9upVbGSBFo

Jim Keller: Moore's Law, Microprocessors, and First Principles - Lex Fridman

https://youtu.be/Nb2tebYAaOA

st1x7
Lex's podcast needs a special cut which includes only the guest's part of each conversation.
I agree, their problem wasn't lack of resources, it was that they stopped believing in Moore's law. They started to believe that the 5 and 3nm advancements were not coming, and so didn't make the hundreds/thousands of new inventions needed to get them there. At least that's what I get reading between the lines here... Jim Keller talks about this in the Lex Fridman podcast: https://youtu.be/Nb2tebYAaOA?t=1892
Yeah that's how RISC vs CISC is taught in class, I've heard that same thing. I think it's an outdated paradigm though, if it's not been wrong all along. A CISC chip can still officially support instructions, but deprecate them and implement them very slowly in microcode, as is done with some x86 instructions. And a RISC chip manufacturer might just have used this phrase as a marketing line because designing RISC chips is easier than starting with a monster that has tons of instructions, and they needed some catchy phrase to market their alternative. They then get into the gradual process of making their customers happier one by one by adding special, mostly cold, silicon targeted for special workflows, like in this instance.

Ultimately, the instruction set isn't that relevant anyway; what's more relevant is how the microarchitecture handles speculative execution for common workflows. There's a great podcast/interview with one of the x86-64 specification authors: https://www.youtube.com/watch?v=Nb2tebYAaOA

Carmack was actually mentioning he asked Jim Keller about Moore's Law. For more details about Jim's thoughts I recommend this interview https://www.youtube.com/watch?v=Nb2tebYAaOA
"Who invented <X>?" is an almost meaningless question. Edison and Swan bear the primary responsibility for producing practical and economical lightbulbs and the electricity distribution systems required for them. But both men were undeniably standing on the shoulders of giants.

The same is true for almost any invention. Most successful inventions aren't created by geniuses out of whole cloth. Rather, inventions are often re-combinations of a variety of pre-existing innovations in a novel way that are practical and efficient. We should give praise to the people who made the litany of small contributions and improvements before the invention, but also recognize that the credited inventor(s) usually made a contribution that was more than the sum of the components.

To use a different example less culturally connected to our time, the ideas of differentiation and integration (they weren't necessarily called this) were fairly well-known before Newton and Leibniz. Those two independently recombined the existing knowledge into a simpler formulation (the fundamental theorem of calculus) and thus get the credit. They made these two concepts highly useful. Yet neither of them could have done his work without thousands of mathematicians who made small mathematical discoveries or inventions in the millennia before them.

Or in another example, Jim Keller recently said that "Moore's Law" is not a result of one set of principles applied repeatedly over the decades, but a result of probably hundreds of individual innovations.[1]

I personally think we should credit teams of people, but perhaps list the inventor who was most successful at producing the technology economically at scale with primary credit.

[1] https://www.youtube.com/watch?v=Nb2tebYAaOA&t=1805s

Since you brought up Keller, if anyone is interested here is a great podcast/conversation/interview with Lex Fridman:

https://youtu.be/Nb2tebYAaOA

jiggawatts
One of the best bits I took away from that interview is Jim's comment that a lot of things need to be "rearchitected every decade, or even faster".

It's such a throwaway comment, but I think there's deep insight behind it from someone with extensive experience in the matter.

I also feel it applies to many other things, not just ASIC design.

For example, every customer I go to manages their VMware farm just like they managed their bare metal boxes in the past. Just as they're oh-so-slowly wrapping their heads around the design concepts that are a better fit for virtualisation, they're moving to the cloud. So of course, they deploy everything in the cloud just like they're used to in their on-premises VMware environment.

It's shocking how much low-hanging fruit is out there, just waiting to be rearchitected by someone with understanding and a clear vision.

As a random example: I just saw a bunch of Azure servers with empty 1TB "Application" D:\ drives. The team that built it was used to how VMware thin-provisioned drives only allocate 2MB blocks at a time, so the 1TB max capacity is just a high water mark that has no real cost. Meanwhile, Azure bills based on capacity, not usage, so that 1TB is burning money.

It seems like a small thing, but if you only use 1% of a 1TB volume in the cloud, you're not paying $123/TB/month for the data you store, you're effectively paying $12,288 per terabyte of data stored per month!
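A minimal sketch of that arithmetic (the prices are illustrative, not current Azure list prices):

  def effective_cost_per_tb(provisioned_tb: float, used_tb: float,
                            price_per_provisioned_tb: float) -> float:
      """Monthly cost per TB actually stored, when billing is by provisioned capacity."""
      return (provisioned_tb * price_per_provisioned_tb) / used_tb

  # Illustrative: a 1 TB volume at ~$123/month with only 1% of it actually used.
  print(f"${effective_cost_per_tb(1.0, 0.01, 123):,.0f}/TB stored/month")  # ~$12,300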

If you look around, it turns out that everywhere you look, people are doing "what they've always done", and it leads to crazy inefficiencies.

devonkim
The reality is that most places keep running software architected for an era over two decades ago, and it ossifies harder if it's still carrying enough business inertia. Those workflows of bootstrapping VMs and EC2 instances like it's 1996 are not going away, because to do anything cloud native in your architecture you need cloud native software; and usually, if you can get a container you can get an RPM or Deb and play package jockey, rejecting the new technologies literally meant to do half the work for you.

In most of the cases where places just dump money, it's usually a question of labor cost spent to optimize vs. the gains, and unless your business is built around scaling a lot of small customers like the usual SaaS unicorns, customer acquisition is super long and painful, and technical inefficiency is the default for enterprise as a rule. It's worth paying $200 for a $1 part because the overhead and risk of renegotiating anything is not worth it. When an hour-long meeting essentially costs a company a minimum of $1000, it'd better be worth it.

When it comes to ASIC designs and VLSI, the technical debt is pretty different because each generation of hardware has past benchmarks primarily to drive it forward. Oftentimes in software people tend to want to keep things the same, which discourages innovation or even touching the code.

I found this to be an excellent explanation for what is going on in these x86 CPUs that makes them so powerful:

https://youtu.be/Nb2tebYAaOA (Starting from ~6:30)

baybal2
The interviewer guy is weird
imtringued
Add timestamps to your link instead of messing around with parentheses. https://youtu.be/Nb2tebYAaOA?t=390

Right click -> "Copy video URL at current time"

Jul 09, 2020 · jmole on The Bitter Lesson (2019)
Yup - and this year's top CPUs have almost 10x the performance per watt of CPUs from even 2-3 years ago [0]

Raw computation is only half the story. The other half is: what the hell do we do with all these extra transistors? [1]

0 - https://www.cpubenchmark.net/power_performance.html

1 - https://youtu.be/Nb2tebYAaOA?t=2167

fxtentacle
Any day now people will start compiling old programs to WebAssembly so that you can wrap them with Electron, instead of compiling them to machine code. Once that happens, we'll have generated another 3 years of demand for Moore's law X_X
Jul 09, 2020 · abetusk on The Bitter Lesson (2019)
Moore's law might be dead but the deeper law is still alive.

Moore's law is technically "the number of transistors per unit area doubles every 24 months" [1]. The more important law is that the cost of transistors halves every 18-24 months.

That is, Moore's law talks about how many transistors we can pack into a unit area. The deeper issue is how much it costs. If we can only pack a certain number of transistors per area but the cost drops exponentially, we still see massive gains.

There's also Wright's law that comes into play [3], which talks about exponentially dropping costs just from institutional knowledge (2x in cumulative production leads to (.75-.9)x in cost).
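For concreteness, here is a minimal sketch of both cost curves as usually stated (the halving period and learning rate below are assumptions drawn from the ranges mentioned above):

  import math

  def moore_cost(cost0: float, years: float, halving_period_years: float = 2.0) -> float:
      """Cost per transistor after `years`, halving every `halving_period_years`."""
      return cost0 * 0.5 ** (years / halving_period_years)

  def wright_cost(cost0: float, cumulative_units: float, initial_units: float,
                  learning_rate: float = 0.85) -> float:
      """Wright's law: each doubling of cumulative production multiplies cost by `learning_rate`."""
      doublings = math.log2(cumulative_units / initial_units)
      return cost0 * learning_rate ** doublings

  print(moore_cost(1.0, 10))         # ~0.03: relative cost per transistor after 10 years
  print(wright_cost(1.0, 1e6, 1e3))  # ~0.20: relative cost after 1000x cumulative production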

[1] https://en.wikipedia.org/wiki/Moore%27s_law

[2] https://www.youtube.com/watch?v=Nb2tebYAaOA

[3] https://en.wikipedia.org/wiki/Experience_curve_effects

edjrage
It really irks me that these things are called "laws". A law is something we expect to hold true forever, by means of the hypothetico-deductive scientific method.

They're phenomena. They're patterns we observe, and that's it. The pattern may change anytime, and that's something that should be expected. The causes may be known or unknown, but to call it a law may even make it hold true for longer, for "psychological" reasons. The law of gravity isn't influenced by what SpaceX investors think about it.

aszen
Agreed, the cost aspect of Moore's law may continue to remain true, especially with chiplets with varying fab nodes and 3D architectures. Wright's law will also bring down costs as lower-nm nodes mature.

But as mentioned in the comments below, AI model training is increasing exponentially (compute required to train models has been doubling every 3.6 months), so it still far outstrips the cost savings.

My understanding is that architectural improvements (i.e. new approaches to detect more parts in code that can be evaluated at the same time, and then do so) need more transistors, ergo a smaller process.

(Jim Keller explains in this interview how CPU designers are making use of the transistor budget: https://youtu.be/Nb2tebYAaOA)

Jim Keller interview in same podcast is great too. https://www.youtube.com/watch?v=Nb2tebYAaOA
>Moores law is ending for everyone, not just Intel.

I am by no means "in the know" on chip design, and this whole bit is probably a fair bit of speculation, but I remember Jim Keller talking about the ending of Moore's law on a podcast in February[1]. If I remember correctly, his argument boiled down to the theory that Moore's law is in some sense a self-fulfilling prophecy. You need to have every part of your company believing in it, or else the parts stop meshing into one another well. I.e. if a team doesn't believe that they will reach a density/size improvement that would allow them to use more transistors in their design, they will need to cut down and adjust their plans to that new reality. If this distrust in improvement spreads inside a company, it would in turn lead to a steeper slowdown in overall improvement.

And while there may be an industry-wide slowdown at the current point in time, perhaps this dynamic is exacerbated at Intel, causing them to lose their competitive edge over the past years.

[1]https://youtu.be/Nb2tebYAaOA?t=1805 (Timestamped to the beginning of Topic of Moores law slowing down)

Klinky
~ignore bad info~
wmf
Intel 10 nm does not use EUV.
Klinky
Thanks for the correction.
wtallis
Intel's 10nm strategy was basically to do everything they could to advance their fabrication process without having to use EUV. Some of those changes turned out to be bigger risks than EUV. TSMC was a bit less aggressive with their last non-EUV nodes, but it actually worked and now they have EUV in mass production (though few if any end-user products have switched over to the EUV process at this point).
Jun 19, 2020 · robocat on Researchers and Founders
I love this quote from Jim Keller:

"I imagine 99% of your thought process is protecting your self-conception, and 98% of that is wrong."

Quote is at @1:23 (during the last half hour where the interview is mostly philosophical) of https://www.youtube.com/watch?v=Nb2tebYAaOA

Nice post. The contrast with many others, reminded me of a recently-posted Jim Keller interview:

> His [Elon Musk's] great insight is how constrained people are. I have this thing, I know how it works, and then little tweaks to that will generate something. As opposed to: what do I actually want, and then figure out how to build it. It's a very different mindset, and almost nobody has it, obviously.

[1] Short section starting at https://www.youtube.com/watch?v=Nb2tebYAaOA&t=3854 . Also liked the longer https://www.youtube.com/watch?v=Nb2tebYAaOA&t=4851 , emphasizing self deception and reading.

What makes this more bizarre than Keller’s typical short stints at previous companies is that he has done a ton of media in the last year. He’s probably given more time to journalists/interviewers in 2020 than in the previous 3 decades of his career combined.

A Fortune piece from May of this year gave some insights into his plans (as well as provided a nice overview of his career) (https://fortune.com/longform/microchip-designer-jim-keller-i...).

> Keller won’t talk much about the massive chip redesign he’s overseeing—chip designers seldom do—and Intel’s new chip probably won’t be ready for another year or two. Still, both Intel and Keller have scattered some clues about how the chips might work. The new chips will cleanly separate major functions, to make it easier for the company to improve one section at a time—an approach that evokes the chiplet model Keller used at AMD. Keller also hints that Intel’s low-power Atom line of chips may figure more prominently in his future designs for PCs and servers.

It doesn’t sound like he was planning to leave, at least as of press time.

Keller also did a great interview on Lex Fridman’s podcast, which was released in February of this year (https://youtu.be/Nb2tebYAaOA).

Keller then did a presentation at the Matroid conference (held at the end of February) (https://youtu.be/8eT1jaHmlx8).

I hope he’s ok, since the Intel statement specifically mentioned “personal reasons”.

jstimpfle
> Keller also did a great interview on Lex Fridman’s podcast, which was released in February of this year (https://youtu.be/Nb2tebYAaOA).

This is great. I'm 12 minutes in and I'm getting a kick out of the way this guy describes concepts (that I already know, more or less) in a simple way.

lozaning
The whole video is wonderful, really shows you how smart of a person Jim is, and as always Lex is a great interviewer.
stupidcar
Maybe the reason he was doing so much media was because he was encountering resistance to his ideas internally at Intel. Talking publicly about them and becoming the public face of Intel's CPU design program might have been his way to get more intellectual leverage. If it was, I guess it didn't work.
cepth
I can’t speak specifically to Intel, but at most large private companies when you’re an employee you give up the rights to talk to media/press about your work without employer approval.

For the Fortune and Lex Fridman interviews, Keller would likely have had to get Intel PR’s approval to participate. In Lex’s interview, you can see him wearing an Intel guest badge, so I assume the interview took place on-site.

UncleOxidant
Just listening to that Lex Fridman podcast. Keller is a very good explainer. However, I'm on the part where he says that you need to throw everything out every 5 years or so and rewrite, and it makes me think that would be an area where he would've gotten a lot of push-back at Intel.
baybal2
I heard that he had a speech impediment before. Never felt that he did when I first saw him speaking at Hot Chips.
SuoDuanDao
Hm, I remember a highly requested Joe Rogan guest talking about how he got over a speech impediment. Wonder whether there's a connection.
wand3r
Stamets?
SuoDuanDao
That's the one.

Making my predictions for 2020, I thought that co-operation with fungal systems would slowly ramp up this year. Probably just that I'm looking for evidence I was right, but it's a fun connection to draw about someone who seems in some way at least adjacent to that scene.

bArray
I've known quite a few friends and some family with speech impediments (maybe there's something in the water), but they all seem to do much better when they are in their wheelhouse. I guess that public speaking on topics you are comfortable with could be quite positive. One of my family members was able to remove it entirely; he became a chef and was required to shout orders a lot.
Pietertje
Just wow, listening to the podcast interview is super interesting. The abstraction and knowledge on so many different levels and topics is thrilling.

Any other tips for interviews with similar architects?

cepth
There have been a couple others that come to mind.

With the CEO of Cerebras (a buzzy and well-funded chip startup) on the ARK Invest podcast: https://ark-invest.com/podcast/cerebras-wafer-scale-engine-a.... I will say that the interviewee here was a little bit more coy, and the podcast generally is geared more towards a business audience, though they have had very top tier technical talent on.

Also from ARK Invest was this discussion of the just launched Nvidia A100 GPUs: https://ark-invest.com/podcast/fyi-ep67-nvidia-gpu/.

From the Matroid conference, the chief architect at Groq (another buzzy chip startup): https://youtu.be/q-lBj49iF9w

Traster
His personal reasons may well include that he "personally hates working at Intel"
chooseaname
Could be that they brought him in for his expertise and then didn't let him use that expertise. Bad management has a way of doing that.
graycat
Being around big tech companies for years, for an explanation of what's going on it's easy to guess that maybe at Intel there is a status quo in practice solidly in charge that, in particular, believes that Moore's law is sick if not yet actually dead and, thus, wants to stretch out how much longer Intel can get good revenue with a sick or dead Moore's law.

Then for Keller and Intel, here is a possibility: Have the status quo at Intel bring in Keller and then ignore him so that (i) being at Intel he won't be doing things at a competitor that hurts Intel and its status quo, (ii) being ignored he won't be able to change the status quo at Intel, and (iii) being at Intel and ignored his career will stagnate so that in the future he will be no more threat to the Intel status quo.

That is, maybe the Intel status quo competes with people down the hall -- not the first case of that -- by ignoring them and for a chip architect competes with him by bringing him into Intel so that they can ignore him.

Broadly, a new direction for him might be to quit being an employee (and quit fighting the politics of the status quo) and start being an employer, as CEO of his own startup.

Quite broadly in the US, one of the keys to progress is to have lots of startups to get around whatever organizational dysfunction exists in older companies.

One of the reasons for this situation is the propensity of BoD members to be conservative, that is, risk averse: they pay attention to the definite, well-known bird in the hand, even if it is getting sick, and ignore the poorly known and risky birds in the bush. In particular, such a BoD wants a CEO who just does a good job managing the existing business. Typically a BoD won't fire a CEO for failing to get new products into new markets, but might fire a CEO who spends $100 million pursuing something new that fails. So, lots of CEOs just stay with the bird in the hand.

ncmncm
Indeed, suppressing upsets to the status quo was always a major purpose of corporate research labs like IBM's Watson Labs; the same goes for Xerox, Bell, and Kodak.

Corporate would spend any amount patenting things, but shelve every single thing. Innovation upsets the gravy train. So as long as you are on top of the market, change is inherently bad. Even a whole new, unrelated product line competes with existing products if a customer for both might be the same company.

A co-worker once sat listening to execs from EMC chatting about competitive threats: solely from other divisions of EMC. With 80% margins on existing product, nothing new looks attractive, nothing outside the company is competition, and it doesn't even matter what outsiders hear.

Such a company also benefits from buying up apparently competing companies and disbanding them or jacking up their prices to match. In the '80s, Mentor Graphics's business model depended on buying and shuttering Cadence-like companies. Cadence was the first one too big to do that to, and MG finally had to figure out another business when the gravy train dried up. They collected lots of rent until then, and the managers left flush and found other crooked opportunities.

It's easy to spot other companies in similar positions, past and present.

graycat
Yup, from my experience at one of the organizations you mentioned, I have to "agree with you more than you agree with yourself". Well put.
phlakaton
I find it interesting that (as of this point) comments that speculate on health are downvoted, but comments that speculate on his relationship with management are not...
UncleOxidant
Why not both? Bad management can lead to bad health. But I don't think it was only management - it was the rank and file that didn't want to change as well. Trying to fight an entrenched system could definitely lead to illness.
monadic2
Why is that interesting? Seems like normal, on topic behavior.
vulcan01
People tend to empathise with that which is more relevant to their own lives; more people will have a bad relationship with management than bad health.
phlakaton
Spoken like a young person who will no doubt live forever. ;-)
UncleOxidant
He was brought in not only for his expertise but in hopes that he could reform a sclerotic design process that's stuck in the 90s. AMD, Apple, and Tesla (places Keller was before Intel) use a lot more design automation and are thus able to get more done per engineer. Intel has a patchwork of tools that are generally internally developed (so no one outside Intel has expertise with those tools, and expertise with them isn't transferable outside Intel either), and different design groups have their preferred tools. From what I gather, Keller was trying to bring in more industry EDA tools and standardize the design process between groups as well. From what I hear from people who were there, he ran into all sorts of pushback. So I gather he's decided to cut his losses and leave Intel to languish.
UncleOxidant
He was brought in to try to modernize Intel's processor design process. To bring it into the 21st century. AMD and Apple (places where he'd been before) use a lot more design automation and thus are able to produce much more per engineer than Intel. Intel tends to throw more bodies at the problem which doesn't scale well. From what I've heard from people there he ran into too much institutional inertia and outright pushback. If Keller couldn't do it then Intel is just going to remain stuck.
rathel
Might be true; recently there have been much-touted improvements in the RTL-to-GDSII pipeline, using AI, as one does. Both from Cadence and Synopsys.
bredren
Big company, but having worked at and for Intel I would not be surprised. I don’t have nice things to say about Intel.
UncleOxidant
Same here. It's a terrible place if you want to try to get things done. Meetings about meetings and nobody ever wants to make a decision.
libertine
When you have FU money and the expertise to be desired by anyone, doesn't seem like a bad reason.
hitpointdrew
> and the expertise to be desired by anyone

Seriously... Jim Keller doesn't apply to workplaces... places seek him out. The interview process is Jim interviewing the company, not the other way around.

mabbo
There's something I can really respect about that, if that is the case. Imagine you hate working for your company, but you're still going out, giving positive interviews, and doing your job well.

I know a lot of folks (almost certainly myself included) who struggle to not make it obvious when they loathe coming to work in the morning.

toast0
One of my jobs, I was interviewing candidates on my last week. It was all fine, until the candidate asked "Why do you like to work on this team?" ... luckily we were at the end of time, so I could duck the question.
pkulak
That's insane that they had you doing interviews when they knew you were leaving.
ashtonkem
I got put on call during my last week once. My comment was “are you sure?”
toast0
Smallish team, I was good at interviews, and not disgruntled, sort of made sense? We had a quick feedback loop, so we would make our decision day of or next day.

This candidate was great, and from what I heard, she got an offer from my team and another team at the same company, but chose the other one; I probably didn't sell the position well enough?

Anyway, not the most insane thing Yahoo was doing at the time, lol.

rathel
Oh yeah. I have friends who work or have worked at Intel.

The management running this company are straight lunatics. The sheer amount of shelved projects, pointless reorgs, and layoffs-for-show is staggering.

I do work at a corporation, but thank God our leadership is much more down-to-earth. Even if I earn less money, I am able to retain relative sanity (due to my allergy to bullshit).

Jim’s talk with Lex Fridman was quite amazing. Highly recommend listening:

https://youtu.be/Nb2tebYAaOA

nickysielicki
I wish that the interviewer had more background in computer architecture. A lot of these questions wouldn't be asked if he had taken an architecture class in school.

Keller says a lot of interesting things in this interview that aren't followed up on. He calls for more substantial architectural changes; I wonder what he thinks of spatial architectures.

InTheArena
Or at least listened a bit more - rather than "agree to disagree"...

That said, I haven't listened to him much, and he just grated on me here.

londons_explore
I think the interviewer 'acts' dumb, because then the resulting video requires less pre-knowledge to understand.

The video goes from something that 100,000 people with a computer architecture background might watch to something that 10,000,000 people with a tech background might watch.

specialist
Yup.

He asked plasma physicist Alexander Fridman (his father) to clarify what plasma is.

I love it.

Hearing Knuth's, Penrose's, and Fridman's own definitions of concepts I thought I knew is illuminating. Like allowing us norms to get a peek into their genius.

nickysielicki
That's fair, I think you're right.

Selfishly, though, as someone who doesn't work at Intel/nvidia/AMD but is interested in architecture and digital design, it's frustrating how hard it is to find candid opinions from industry experts in comparison to software. Computer architecture just isn't the same in terms of attitude.

One of my favorite things about the computer architecture courses I took was to sit after class and ask my professors about something I'd read about, and more often than not they'd tell me that what I read was bullshit and nobody seriously considered it in the industry, or they'd tell me about something they were excited about that I hadn't heard of. It's hard to get a handle on where the industry is heading and what important people see as the next steps to iterate on.

A general conversation is probably more useful to most people, but there are a lot of times in this conversation where you get the feeling that Keller is right on the precipice of giving an insider prediction of the future, and the topic shifts instead. It's frustrating because those types of conversations are just straight-up difficult to find elsewhere.

polytely
Have you listened to https://oxide.computer/podcast/ ?

It has more depth than Fridman's podcast. I listen to it mainly for the computing history, because they interview people about their careers. I hope they do another season soon.

kasabali
This podcast has a similar problem too. Right when you think the guest is onto something, Cantrill rants about a tangential thing and the topic changes afterwards.

I normally enjoy Cantrill's rants; I watch his talks on YouTube solely to listen to him rant. But it gets annoying when he's interviewing a guest.

bcantrill
Sorry about that -- still learning how to effectively interview! (If there are specific examples you can cite, that would actually be helpful; I am trying to get better at this.)
kasabali
Hi, thanks for hearing me out!

Unfortunately I don't have specific examples in mind, since it's been a few months since I listened to the podcast, but I remember the episode with Jonathan Blow being a prominent one, because I was new to him, his rants, and game development in general. It felt like every time he was onto something interesting you were ranting about a related issue, and after that he would proceed to his next point without concluding the previous one. At least it felt that way to me.

I hope that didn't sound like I'm complaining. I'm a big fan of you guys and it's a great podcast. I could listen to 100 episodes :)

bcantrill
Yeah, the Jonathan Blow episode seems to be the one where that criticism comes up. We generally haven't edited them at all (one take, no cuts), and that one was just too long to realistically edit -- but we probably should have edited some of me out. The perils of a three hour conversation!
gyre007
I love all of Lex's podcasts. They've been truly mind-expanding for me personally.
data_spy
Lex is a bit quirky, but I like that.
castratikron
Amazing how he didn't know how to read until age 8, and then went on to read two books a week up to now (about 50 years)
person_of_color
This baffles me.

How do you trade off learning and “doing”, i.e. writing code or putting what you have learned into practice?

I can’t finish a technical book without getting the urge to implement

ciarannolan
I just watched this entire thing. I'd never heard of Lex before.

It's pretty funny for the interviewer to say "agree to disagree" or "well, no" when he's clearly not the expert of the two.

gigatexal
Amazing podcast episode for sure; I wonder where Keller will go next. It'd be insane if he went to AMD, or to Apple in their new ARM push.
imjasonmiller
> If you constantly unpacked everything for deeper understanding, you're never going to get anything done. If you don't unpack understanding when you need to, you'll do the wrong thing.

I really liked that quote. It's a great talk indeed.

op03
"Explore Vs Exploit Dilemma"
tumultco
It echoes a Confucius quote I like: “Learning without thinking is useless. Thinking without learning is dangerous.” (Analects 2:15)
You can't say that the "hodge-podge nature is most likely a historical artifact of the Unix evolution" and then say that it is "quality". You can't have it both ways. Random, unpredictable, historical quirks do not add up to quality. They add up to a mess.

It's like pointing to a shanty town and saying it is "quality housing" because it has a high population density.

I'm reminded of a recent interview with Jim Keller, who's famous for being one of the architects of the x86-64 instruction set, Apple's ARM CPUs, AMD's Zen architecture, and Tesla's AI chip: https://www.youtube.com/watch?v=Nb2tebYAaOA

An interesting throwaway quote, but actually a deep insight, is that he thinks a lot of things should be "redesigned every 5 years or so", but unfortunately are redesigned only every decade or longer.

I totally agree. PowerShell was a clean-slate design, and was called "Monad Shell" originally. It's elegant, and is based on a cohesive concept. It's now 10 years old and starting to accumulate... inconsistencies. Warts. It's in need of a clean-slate design again.

Bash and the GNU tools are the most random mess imaginable that can still function. Its components can trace their roots back to the 1960s. Ancient code that should have been rewritten from scratch a dozen times over. Like you said, it's "evolution", but evolution gave us the trigeminal nerve and the inverted retina. I want a designed system that doesn't take the scenic route and have things back to front because it was "always done that way and is too hard to change now".

If you got sat down and told to come up with a set of composable command-line tools for a shell, the end result would have nothing at all in common with the bash/GNU tools. There is nothing there worth repeating.

abetusk
I've seen that interview before and that quote in particular resonates with me but at the same time there are basic economics to consider.

I want to be clear: what I meant by 'quality' above is functioning systems. If there's a hodge-podge solution that gives you a 10x improvement at the cost of 1/2x the productivity, vs. an "elegant" framework that gives you a 2x improvement at 1/10x the productivity because it requires re-inventing all the tools, the 'hodge-podge' solution wins.
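
To make that arithmetic concrete, here's a back-of-the-envelope sketch using the illustrative 10x / 2x / productivity figures above (assumed numbers for the argument, not measurements):

    # Rough value model: the improvement a tool gives you, scaled by how much
    # using it slows you down. Figures are the illustrative ones from above.
    improvement  = {"hodge-podge": 10.0, "elegant": 2.0}
    productivity = {"hodge-podge": 0.5,  "elegant": 0.1}   # relative to normal pace

    for name in improvement:
        print(name, improvement[name] * productivity[name])
    # hodge-podge -> 5.0, elegant -> 0.2: the messy-but-working option comes out ~25x ahead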

To me, this is the same basic argument of Dvorak vs qwerty keyboards. Dvorak might be a 1-2x improvement over qwerty but qwerty wins because of cultural momentum. The improvement isn't worth uprooting the infrastructure already in place.

I think programming languages suffer the same fate, where the size of code base, libraries, etc. secures it as the lingua franca even if there are other languages that are marginally better.

As a rule of thumb, I think a replacement needs to be at 10x the improvement before it has a chance of uprooting entrenched systems.

The Unix system might be chaotic but it's had a lot of time to adapt to different needs. There's also a culture of freedom, sharing and experimentation that isn't present in the Windows world. These all have real world implications on what tools are available for experimentation, use and improvement.

> If you got sat down and told to come up with a set of composable command-line tools for a shell, the end result would have nothing at all in common with the bash/GNU tools. There is nothing there worth repeating.

Sorry, no. This sounds like an emotional plea rather than anything rooted in critical thinking. Sorting text files isn't worth repeating? Searching text files isn't worth repeating? Piping text from one process to another isn't worth repeating? I've tried to be even keeled in my responses but this is ridiculous.

I just wanted to say this about Intel vs AMD. Back when Intel was the "underdog" and couldn't compete with AMD's Athlon/Opteron CPUs using its Pentium 4 line, fanboys from both sides continually escalated the armchair warfare on Slashdot and AnandTech forums, until Intel finally came out with the Conroe architecture that blew the competition away. I've seen this pendulum swing from one side to the other and back. Putting Intel's marketing (awful), AMD's marketing (awful again...), and their fan bases (toxic) aside, can we acknowledge the fact that working on a computer architecture is an extraordinarily complex task that requires brilliant people all working together? I'll defer this fascinating topic, which most software engineers and tech enthusiasts are completely and utterly unaware of, to the legendary chip designer (x86-64 spec co-author, A4/A5/Zen/Tesla AI ASIC architect) Jim Keller and his interview with Lex Fridman: https://www.youtube.com/watch?v=Nb2tebYAaOA

Designing & manufacturing computer chips is hard af; I've worked in semiconductor manufacturing for over 12 years (primarily on the back-end side of the fab).

Edit: Redacting some specifics

ksec
I wouldn't use "underdog" to describe Intel; they were never one in their x86 history. Always the top dog.

Intel in the P4 era was a lot different from now. The P4 was an engineering mistake. And it wasn't Conroe that saved them, it was Pat Gelsinger's Pentium M, or Banias / Dothan.

Intel's current (or past) issue isn't engineering; they have amazing engineers. It is lying. Blatant lies that were fed from Sales and Marketing all the way up to the C-level. That is what happens when a successful company is driven by sales and marketing people.

acqq
> P4 was an engineering mistake.

The story I've heard was that the P4 was a management mistake. Some engineers claimed they understood perfectly well that the limits existed, but management believed that the only thing that sells is a bigger GHz number. So the engineers were given the goal of making a processor architecture for 10 GHz, and apparently the P4 was built for that (long pipelines), even though it was known that at lower clock speeds the architecture performed worse. But 10 GHz was unachievable because physics can't be cheated (the chips would simply melt without special cooling, which also wasn't sellable), so the P4 architecture was never driven at the clock speeds for which it was designed.

bdefore
Really enjoyed that video, all hour and a half of it. Thank you!
Causality1
I think people also need to accept the fact we're never going to engineer a solution to human stupidity. If our society doesn't raise people to be good users, no amount of hardening will stop systems from being compromised.
gear54rus
Agreed. Wasting countless hours on the engineering end doesn't seem wise when those same hours could be spent doing something meaningful. This requires education on the user end. Education that will help them understand the world around them better so everyone wins in the end.
pas
Those are wasted hours. We have to make stuff maximally safe AND educate users, because even that maximal safety/ergonomics/etc. is too little compared to how complex our world is.
XMPPwocky
Like what?

I've written in-depth about why "don't click untrusted links" is unhelpful: see https://xmppwocky.net/blog.py?page=22

Nullabillity
That domain isn't resolving.
ZiiS
It is also an untrusted link, so I feel it could only be preaching to the choir.
Nullabillity
Right. Heh.
t-writescode
Oh yes, totally. I have complete respect for Intel’s engineers. It’s their business department I’ve long had trouble with, but those engineers do truly astonishing, near zero bug work.
spectramax
I think the culture is improving, despite the industry cashing it out in the '90s and 2000s during the Ballmer / Jack Welch era. The semiconductor industry hasn't done a good job of cultivating a good, positive company culture and making it less regimented. We need to make it cool to work in a semiconductor company. I can assure you that there are so many incredible challenges - with good definition/funding to tackle them - from materials to marketing (which direly needs to be revamped). Imagine running a portion of the world's most advanced manufacturing processes. Ultra high volume (thousands of wafer starts per week), mind you.

Perhaps videos such as these are shining a positive light on the entire semiconductor goodwill: https://www.youtube.com/watch?v=f0gMdGrVteI

minipci1321
> We need to make it cool to work in a semiconductor company.

I am not optimistic. "Semiconductor company" ==> high capitalization (expensive processes) ==> fertile ground for lawsuits and attacks.

As long as a slightest mistake has a potential to turn into multi-million-loss lawsuits, it will never be cool. Is it easy to be cool walking a tightrope between two skyscrapers? Not for many I'd guess.

BTW it is the same for non-semi companies (like OEMs), except that volumes of a given single product are generally lower, so mistakes are more bounded.

aidos
Wow. Those machines are something else altogether. I watched a chip making clip a while back and I recall that they talked about having to make strange patterns in the stencils (not sure what they’re called) to work around the way the light twists when it actually hits the die. I’ve probably got most of that wrong but is that still a thing with the addition of the water layer?
saganus
Sounds like this video: Indistinguishable from magic: Manufacturing modern computer chips

https://youtu.be/NGFhc8R_uO4

Just amazing. I wish a newer video explaining more recent technologies was available.

spectramax
Yeah, they calculate the transfer function and then pattern the mask so that the end result printed on the silicon substrate matches the intended shape (a rough sketch of the idea is below). EUV lithography substantially reduced this pattern complexity.

Jim talks about it at this mark in this presentation: https://youtu.be/Qnl7--MvNAM?t=784
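
Here's a toy sketch of that pre-correction idea, purely illustrative: a Gaussian blur stands in for the real optical transfer function, and a crude fixed-point loop stands in for real OPC software.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def printed(mask, blur=2.0):
        # Toy "optics": the projection acts like a low-pass filter on the mask.
        return gaussian_filter(mask, sigma=blur)

    def precorrect(target, steps=50, lr=0.5):
        # Nudge the mask until what "prints" through the blurry optics
        # approaches the pattern we actually want on the wafer.
        mask = target.copy()
        for _ in range(steps):
            mask += lr * (target - printed(mask))
        return mask

    target = np.zeros((64, 64))
    target[20:44, 30:34] = 1.0        # a narrow line we want on the wafer
    mask = precorrect(target)         # ends up with odd-looking over/undershoots
    result = printed(mask)            # closer to the target than printed(target)

The odd shapes that show up in the corrected mask are the toy analogue of the "strange patterns in the stencils" mentioned above.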

voxadam
> Perhaps videos such as these are shining a positive light on the entire semiconductor goodwill: https://www.youtube.com/watch?v=f0gMdGrVteI

That's a brilliantly fascinating video. Thanks for the link.

oarsinsync
> We need to make it cool to work in a semiconductor company.

I'm not sure I agree. If by 'cool' you mean along the lines of receiving respect and admiration from others because, while they don't really understand the actual specifics of what you do, they have a general idea and it's celebrated widely, then I don't think we want that happening in semi space.

It looks to me that it's happened already in software development, and I don't think the end result of that is net-positive.

I think that when people are passionate about what they do, and do it because they're passionate about it, the end results tend to be better than if they're doing it because other people will give them recognition.

I'm not saying this is something that happens all the time, and you can argue that I'm being elitist, snobby, or annoyed that the 'cool kids' have come into my playground and are playing with my toys, all of which would be valid positions to take. I do think that all communities are very different when they have smaller appeal compared to when they have wider appeal.

Niche communities are not without their problems (and the toxicity that can happen in them shouldn't be understated), but I personally prefer them, and believe the output is generally of higher quality than in communities with wider appeal.

spectramax
I know what you mean - I think some of the toxicity and heavy handed management style can certainly improve.
Do you have links to those talks?

Here's what I found: https://www.youtube.com/watch?v=Nb2tebYAaOA&t=1805s (94min, jump to 30m for Mr. Keller discussing Moore's law and his interpretation of it.)

Jim Keller: Moore's Law, Microprocessors, Abstractions, and First Principles | AI Podcast Lex Fridman, host.

ncray
Here's another one: https://eecs.berkeley.edu/research/colloquium/190918

Moore’s Law is Not Dead Wednesday, September 18, 2019

ksec
I think that is the one people should watch. The 1000x1000x1000 atoms analogy. But then again, he hasn't touched on the economic issues of Moore's Law.
Watch the interview, you won't regret it: https://www.youtube.com/watch?v=Nb2tebYAaOA
petermcneeley
I found this interview to be very entry level.
The amazing thing about Code is how it traces the connection of formal logic (in the Aristotelian sense) to the, as you say, pre-computer code of braille and even flag signals to form the foundations of modern computing.

I am a self-taught developer and probably had 10 years' experience in web development when I first read Code. I would have these little moments of revelation where my mind would get ahead of the narrative of the text, because I was working backwards from my higher-level understanding to Petzold's lower-level descriptions. I think of this book fairly often when reading technical documentation or articles.

I recently listened to Jim Keller relate engineering and design to following recipes in cooking [1]. Most people just execute stacks of recipes in their day-to-day life; they can be very good at that, and the results of what they make can be very good. But to be an expert at cooking you need to achieve a deeper understanding of what food is and how it works (say, on a physics or thermodynamic level). I am very much a programming recipe-executor, but reading Code I got to touch some elements of expertise, which was rewarding.

https://youtu.be/Nb2tebYAaOA?t=1351

In this thread: when your whole life is based on assumptions and 98% of them are wrong, you're going to defend yourself tooth and nail. It is emotionally challenging and damaging to the ego.

Watch Jim Keller (designer of Apple's A4/A5 chips and AMD's Ryzen, x86-64 spec co-author, and a legendary chip designer) make this point better than I can [video link cued to the 1:22:34 mark]: https://youtu.be/Nb2tebYAaOA?t=4954

People would say that the author of this post is arrogant and wants to reject the status quo. But I would say the opposite: people who have a vested interest in the status quo, because their reputation and salary depend on it, are the arrogant ones, because they reject reality in favor of their own good.

I've been watching Lex Fridman's youtube podcast and there is a recent interview with Jim Keller [1]. Keller is a chip designer famous for his involvement in multiple chips at Intel, AMD, Apple and he was co-author of the x86-64 instruction set. He also worked for Tesla.

There is a point in the conversation where Lex and Jim clearly disagree about how "easy" self-driving AI should be. Lex is clearly pessimistic and Jim is clearly optimistic. I have to admit I was more swayed by Lex's points than by Jim's, but it is hard to discount someone so clearly (extraordinarily) expert and working directly in the field.

1. https://www.youtube.com/watch?v=Nb2tebYAaOA

theresistor
Jim Keller went back to Intel in 2018.
reggieband
My mistake - I should have checked his bio rather than assume based on the content of the discussion. I've updated my comment to change his association with Tesla to the past tense. Thank you.
Feb 14, 2020 · 1 points, 0 comments · submitted by usrjph
I highly suggest viewing Jim Keller, one of the Alpha's original designers, in his recent chat with MIT Prof Lex Fridman: https://youtu.be/Nb2tebYAaOA I was stunned by Jim's total command of computer architecture and his deep insights about the subject. I will never again utter the phrase 'modern computer' without requisite awe.
Tsiklon
Jim Keller and the teams he leads/enables have their fingerprints all over modern CPU architecture - from AMD's K8 (Athlon 64) and Zen (their modern competitive product) to Apple's A4 and A5.
Feb 08, 2020 · 70 points, 6 comments · submitted by JabavuAdams
code_biologist
I found this interview very frustrating to listen to. Lex keeps asking questions about magic, consciousness, AI, and all that stuff... and Jim's a chip designer. Beyond that, Jim doesn't seem very forthcoming about his thoughts on novel architectural improvements other than that Moore's law will hold.

For those interested, a low-level technology interview that scratched the itch I hoped the Jim Keller one would: https://oxide.computer/blog/on-the-metal-1-jeff-rothschild/

On this episode of On the Metal, we interview Jeff Rothschild. Jeff has had a fascinating journey solving all sorts of fun problems at various levels of the stack. He is most widely known as being a co-founder of Veritas Software and the first VP of Engineering at Facebook, but his story does not start there. Join us as we hear Jeff’s stories from his impressive technical endeavors including disassembling MS-DOS, editing machine code in an octal editor, trolling coworkers in error messages, the origin story of ftruncate, and more.

libertine
Maybe you're just not the right audience for that interview, and that's fine.

If you listened to the interview, Jim isn't just a chip designer; he is an extremely well-educated and well-informed person who has learned a lot over the years, on his own and with great minds.

JabavuAdams
Jim doesn't emote much, but I found he kept dropping gold nuggets of experience. Not necessarily about chip design -- but that's not what I was looking for.
lidHanteyk
I agree: I'm finding Keller to be very closed-minded in his worldview. He has "cooked-rock" thinking; his idea of computation is completely oriented around the physical design of silicon circuitry, and he's completely forgotten how surprising it even is that computation can be physically embedded. I'm going to listen to the entire interview, but this is definitely a disappointing listen.

Edit: Feeling a little ill. Asked whether he feels responsibility for the social impact of smartphones, he says that there's literally millions of folks like him working on technology like his, and that if he didn't do it, somebody else would.

matt-snider
I personally like that Lex asks things like that. He has a pretty unique style which sometimes gets you some philosophical questions, but he also doesn't take things too seriously, so I think even a humorous response would do the trick.

I found Jim's responses to be at times rude or dismissive and thought that was a bit disrespectful.

nujabe
I agree, Lex is overall a very competent interviewer who clearly spends a lot of time prepping before each interview (for example his interview with Stephen Kotkin) but he should really reconsider the value of those deadend philosophical questions he asks every guest (what is the meaning of life? Do you think we will have AGI? Will AI eventually end human civilisation?). I have yet to see an interesting conversation follow from those questions, and quite frankly I don't find them interesting at all, certainly not 10-15 minutes of each interview.
Feb 06, 2020 · 2 points, 0 comments · submitted by tomiplaz
Feb 05, 2020 · 5 points, 0 comments · submitted by localhost
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.