HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Why Tesla removed Radar and Ultrasonic sensors? | Andrej Karpathy and Lex Fridman

Lex Clips · Youtube · 241 HN points · 2 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Lex Clips's video "Why Tesla removed Radar and Ultrasonic sensors? | Andrej Karpathy and Lex Fridman".
Youtube Summary
Lex Fridman Podcast full episode: https://www.youtube.com/watch?v=cdiD-9MMpb0
Please support this podcast by checking out our sponsors:
- Eight Sleep: https://www.eightsleep.com/lex to get special savings
- BetterHelp: https://betterhelp.com/lex to get 10% off
- Fundrise: https://fundrise.com/lex
- Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil

GUEST BIO:
Andrej Karpathy is a legendary AI researcher, engineer, and educator. He's the former director of AI at Tesla, a founding member of OpenAI, and an educator at Stanford.

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
Full episodes playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41

SOCIAL:
- Twitter: https://twitter.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- Medium: https://medium.com/@lexfridman
- Reddit: https://reddit.com/r/lexfridman
- Support on Patreon: https://www.patreon.com/lexfridman

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.

  Especially compared to whoever at Tesla authorized removing radar and parking sensors.
https://www.youtube.com/watch?v=_W1JBAfV4Io
nikau
Their crappy auto wipers and auto highbeams haven't dissuaded customers, so why not crappy parking sensors?
Oct 30, 2022 · 241 points, 471 comments · submitted by shekhar101
petilon
I didn't find his answers particularly convincing. His answer focused mainly on costs, and on how "the best part is no part". We have already seen multiple accidents caused by cameras' limitations [1] that would not have happened if Tesla had used lidar.

Cameras have poor dynamic range and can be easily blinded by bright surfaces. While it is true that humans do fine with only eyes, our eyes are significantly better than cameras.

More importantly, expectations are higher when an automated system is driving the car. It is not sufficient if, in aggregate, self-driving cars have fewer accidents. If you lose a loved one in an accident where the accident could have been easily avoided if a human was driving, then you're not going to be mollified to hear that in aggregate, fewer people are being killed by self-driving cars! You'd be outraged to hear such a justification! The expectation therefore is that in each individual injury accident a human clearly could not have handled the situation any better. Self-driving cars have to be significantly better than humans to be accepted by society, and that means it has to have better-than-human levels of vision (which lidars provide).

[1] https://www.youtube.com/watch?v=X3hrKnv0dPQ

lostsock
That video is from 2020, but Tesla didn't remove radar until 2021. Meaning that the crash occurred with radar still active, which I feel just backs up what Karpathy was saying.
petilon
Well, the car may have had radar hardware but there are questions as to whether the software was using it:

https://www.nytimes.com/2021/07/05/business/tesla-autopilot-...

Excerpt:

Mr. Rajkumar of Carnegie Mellon, who reviewed the video and data at the request of The Times, said Autopilot might have failed to brake for the Explorer because the Tesla’s cameras were facing the sun or were confused by the truck ahead of the Explorer. The Tesla was also equipped with a radar sensor, but it appears not to have helped.

“A radar would have detected the pickup truck, and it would have prevented the collision,” Mr. Rajkumar said in an email. “So the radar outputs were likely not being used.”

https://www.nytimes.com/2021/12/06/technology/tesla-autopilo...

Excerpt:

Tesla later said that during the crash, Autopilot’s camera could not distinguish between the white truck and the bright sky. Tesla has never publicly explained why the radar did not prevent the accident.

dmix
> Autopilot’s camera could not distinguish between the white truck and the bright sky.

Is this a hard limitation of vision systems? Or are they saying that in that particular early version it couldn't?

(And yes, I understand LIDAR wouldn't be limited by the color and would see the objects)

lostsock
Because radar is not good at picking up stationary objects and/or has to filter them out:

https://www.wired.com/story/tesla-autopilot-why-crash-radar/

mensetmanusman
I have some experience with LiDAR; these sensors fail easily if a water droplet is on the cover or if signs are too bright. It’s a whole different technology challenge.
imglorp
Why is LiDAR still expensive enough for the cost to be a problem for anyone? Why have large numbers not driven these down to commodity devices at this point? Or maybe similar tech at another frequency like spread spectrum microwave with phased array semiconductor antennae?
petilon
It is now cheap enough for GM and Volkswagen:

https://www.bloomberg.com/news/articles/2022-10-25/lidar-sen...

ec109685
The numbers aren’t that large. Nowhere near the volume of optical components sold, which benefit from the continual optimization that the billions of consumer electronics devices sold each year undergo.
hwillis
> Why is LiDAR still expensive enough for the cost to be a problem for anyone?

Because economy of scale is not the only reason things are expensive. Lidar with 2cm distance resolution has to detect light and measure time to less than 60 picoseconds. It requires sophisticated, expensive sensors like avalanche diodes and complex and extremely fast circuitry.

Consider what a single pixel of a lidar device is doing. It's operating 8 orders of magnitude faster than a normal camera pixel. It's detecting an incredibly small amount of light: the spot of a weak laser on a questionable surface from tens of meters away. It's doing that in the presence of ambient light that is tens or hundreds of thousands of times more powerful than the reflection it's actually looking for. Much of that ambient light is even at the same wavelength as the laser!

Even the mechanics of the device are critical. Pixels are gathered sequentially, unlike in a camera, so any vibration makes the data uselessly fuzzy. Vibration in a camera is correlated extremely tightly across millions of pixels. In lidar it's correlated across only dozens to hundreds. Any high-frequency vibration has a large negative impact on the data.

> Or maybe similar tech at another frequency like spread spectrum microwave with phased array semiconductor antennae?

...that's just radar. Literally. Microwaves are 300 MHz to 300 GHz. Automotive radar is around 80 GHz, +/-5 GHz (mostly). It uses phased arrays. It uses highly integrated devices. They don't use semiconductor antennas, because the area on a silicon die is far, far too valuable to use for an antenna, but the antennas are incredibly cheap. Radar and lidar are just pretty different.
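A back-of-envelope check of the timing requirement mentioned in the comment above. This is only the textbook round-trip relation t = 2d/c; the specific resolutions plugged in are illustrative, not lidar specs:

```python
# Timing precision a time-of-flight lidar needs for a given range resolution.
# A pulse travels to the target and back, so t = 2*d/c and a range resolution
# of delta_d requires resolving time differences of delta_t = 2*delta_d/c.

C = 299_792_458.0  # speed of light, m/s

def timing_resolution(delta_d_m: float) -> float:
    """Timing resolution (seconds) needed to resolve delta_d_m in range."""
    return 2.0 * delta_d_m / C

for delta_d in (0.01, 0.02, 0.05):
    print(f"{delta_d*100:.0f} cm range resolution -> {timing_resolution(delta_d)*1e12:.0f} ps")
# ~67 ps for 1 cm, ~133 ps for 2 cm, ~334 ps for 5 cm: tens to low hundreds
# of picoseconds, the same order of magnitude as the figure cited above.
```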

amelius
The problem with self-driving is that it is based on data, but the environment may change. See e.g. the case where Tesla thinks that firetrucks are roads.

So if fashion changes, pedestrians may suddenly look like road too, as just an example.

Another problem is that state-of-the-art classification networks have accuracy in the 90% range. Given that a car has to make hundreds of decisions in a single ride, even if the accuracy were 99%, the compounded error rate would still be too high.
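A minimal sketch of that compounding argument. The accuracy values and decision counts are illustrative, and it assumes independent, equally weighted decisions, which real driving decisions are not:

```python
# Even high per-decision accuracy erodes quickly over many independent decisions.
# Numbers are illustrative only.

def p_trip_without_error(per_decision_accuracy: float, n_decisions: int) -> float:
    """Probability that every one of n independent decisions is correct."""
    return per_decision_accuracy ** n_decisions

for acc in (0.90, 0.99, 0.9999):
    for n in (100, 500):
        print(f"accuracy={acc}, decisions={n}: "
              f"P(no error) = {p_trip_without_error(acc, n):.4f}")
# e.g. 0.99 ** 100 ~ 0.37, i.e. roughly a 63% chance of at least one error
# per 100 decisions under these simplifying assumptions.
```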

martindbp
> state-of-the-art classification networks have an accuracy in the 90% range.

If you're referring to ImageNet SOTA, it has 20000 different classes, including 120 different dog breeds [1]. This is a vastly different task from reliably detecting pedestrians, where Tesla can actively curate a dataset of hard examples (from their fleet), whereas ImageNet is fixed, sometimes with low-quality labels and as few as a couple of hundred examples per class. Tesla can also pick a point on the ROC curve to give higher recall at the cost of more false positives (which is important for VRUs specifically). Another big factor is that Tesla is using video, not still images, which makes predictions even more robust.

And that's just for pedestrians, Tesla are also using a general ViDAR (visual LiDAR) which is trained to detect obstacles that do not have a specific class. The ViDAR again operates on image sequences, not a single image, and can thus pick out structure from motion.

[1] https://en.wikipedia.org/wiki/ImageNet
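A small sketch of the "pick a point on the ROC curve" idea from the comment above: choose a detector score threshold that meets a recall target, accepting a higher false positive rate in exchange. The scores, labels, and the `threshold_for_recall` helper are synthetic and hypothetical, not anything from Tesla's stack:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic detector scores: positives (e.g. pedestrians) score higher on average.
pos_scores = rng.normal(2.0, 1.0, 1000)
neg_scores = rng.normal(0.0, 1.0, 10000)
scores = np.concatenate([pos_scores, neg_scores])
labels = np.concatenate([np.ones(1000, bool), np.zeros(10000, bool)])

def threshold_for_recall(scores, labels, target_recall):
    """Highest threshold that still keeps at least target_recall of positives."""
    pos = np.sort(scores[labels])
    k = int(np.floor((1 - target_recall) * len(pos)))
    return pos[k]

for target in (0.90, 0.99):
    thr = threshold_for_recall(scores, labels, target)
    recall = (scores[labels] >= thr).mean()
    fpr = (scores[~labels] >= thr).mean()
    print(f"target recall {target}: threshold={thr:.2f}, recall={recall:.3f}, FPR={fpr:.3f}")
# Pushing recall from 0.90 toward 0.99 sharply increases the false positive rate:
# that is the trade-off being described.
```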

akira2501
> While it is true that humans do fine with only eyes, our eyes are significantly better than cameras.

They also have better failure modes and a really sophisticated error management system. They are susceptible to optical illusions, though.

> It is not sufficient if, in aggregate, self-driving cars have fewer accidents.

This is the incorrect analysis anyways. This was always going to be true because a large portion of accidents are single vehicle accidents where the driver was at fault for the crash. Usually due to speeding, alcohol, youth, or a combination of them.

If they didn't have fewer accidents then something is very very wrong with the entire idea. Which may very well be the outcome here. Looking at multi-vehicle accidents where there was no fault of the driver who died, it's not clear that an automated system driving the car would have saved them.

Roads are built right next to cliffs and bodies of water. Semi trucks can completely destroy your vehicle in an instant. Large accidents on snowy or foggy highways happen. Drunk drivers exist and sometimes literally do come out of nowhere; a pickup truck moving at 60mph has enough energy to knock a firetruck onto its side if you hit it side-on; and freeway ramps dump out right onto residential streets. Parts fail, floormats get stuck, people don't wear their seatbelts, and you can get a license to ride a motorcycle if you want.

It's a guess based on the research I've done, but my expectation is around 20% of fatal accidents can in some way be prevented by automation. You'd honestly prevent more fatalities by putting an ignition interlock on everyone's vehicle or building real barriers between traffic and pedestrians.

oldgradstudent
> but my expectation is around 20% of fatal accidents can in some way be prevented by automation.

Assuming, of course, that automation does not introduce its own failure modes.

That's a strong assumption.

BurningFrog
> a large portion of accidents are single vehicle accidents where the driver was at fault for the crash. Usually due to speeding, alcohol, youth, or a combination of them.

Also plenty of suicides in that group, which confuses the stats.

We really need SDCs to have fewer accidents than human drivers, excluding the suicides.

bergenty
Not to mention Waymo works well with LiDAR, cameras, and radar. If your argument is that it's too hard to deal with that much data, that's definitely the wrong answer.
jandrese
> While it is true that humans do fine with only eyes

We do not. Humans are terrible at driving. Traffic accidents are one of the leading causes of death in the developed world. Billions of dollars of property damage occur every year because humans are not up to the task. A self driving system that is as safe as an average human driver would be an absolute failure.

croes
>We do not.

But not because of a lack of visual information.

Most of the time it's a lack of concentration or an overestimation of one's own driving abilities.

freejazz
Here is some shocking news for you. It's not your eyes driving the car... it's your brain.
croes
That's exactly my point.

Unless Teslas have something similar to the human brain, cameras are not enough.

ec109685
Humans are amazing at driving. We typically go millions of miles in a lifetime without causing any fatal vehicle crashes and can generally handle unknown situations just fine.

AI is nowhere near that.

P_I_Staker
We are excellent at driving. It's shocking that there aren't more accidents.
rootusrootus
> Traffic accidents are one of the leading causes of death in the developed world.

Not even close, really. A bit under 1%. You are more likely to die from an overdose, or suicide. And much, much, much more likely to die from cancer or heart disease.

And that is without getting into the trade-offs. Cars at least have a significant utility value, which is not true of suicide, opiate addiction, cancer, or heart disease. We should try to reduce traffic deaths, but we should not lose perspective.

jandrese
I was not precise enough in the wording. Leading cause of accidental death. Obviously it is not beating out old age or heart disease. Doesn't change the fact that a self driving system with the record of a typical human would be considered unacceptable.
rootusrootus
I agree on that point. For self driving to be accepted, it has to be at least as good as a good driver. Drunks, oldsters, youngsters, and the like all push the average down. Many normal people are actually pretty good drivers statistically, and those are (perhaps not coincidentally) the ones most likely to be in a position to afford a fancy new self-driving car.
paulcole
> Traffic accidents are one of the leading causes of death in the developed world.

Perhaps leading cause of premature death or leading cause of accidental death or leading cause in demographics who are otherwise unlikely to die, but they are nowhere close to the top of the overall list.

Nomentatus
You're right to note the advantages of lidar and the (narrow) contrast-range problem for cameras (they aren't eyes). This is why the Uber human driver, in particular, shouldn't have trusted the machine at night.

But you still have to address his system argument, which was that adding geegaws that added little would actually increase overall risks along the supply chain (plus maintenance) while distracting the team and adding more risk that way, for very little apparent (but only apparent) gain. The team does believe that they'll get to better than human driving, and do that without lidar.

Marazan
They cannot actually believe that.
Nomentatus
There's a new article showing up just today on HN detailing more issues securing supply lines (for libraries and much more) in software; it's a deep problem, not a non-existent one. But it doesn't have to be malicious actors at work. SpaceX lost a rocket because a supplier's product wasn't built to spec. SpaceX makes that part themselves now, and highly favors vertical integration, which just means making your own sh*t. Just maintaining documentation is a bear. There's another article on HN today detailing an Air Astana near-crash caused by "improvement" maintenance changes (with poor documentation) that went very badly wrong.
mensetmanusman
LiDAR has its own false positives that haven’t been solved, so it’s possible they believe that.

I certainly was skeptical until alphago and alphafold happened.

freejazz
That's especially amusing considering how if you view Go as a visual problem, it's beyond trivial compared to what is required to safely operate a motor vehicle (even just visually).
Nomentatus
Yikes. I'd be a lot better Go player if it was just a matter of seeing the board.
freejazz
It is yikes - what's shocking is that you seem to be advocating for that same exact perspective for... driving a car???
Nomentatus
Nope. There's definitely mapping, and then, much, much more. You're stuffing words in my mouth.
freejazz
Not sure if I'm stuffing so much as removing the bluster
Nomentatus
Ad hominem, but no content.
freejazz
"But you still have to address his system argument"

Do you? It's his argument that he needs to substantiate... it's not my burden to confirm his conjecture. And even looking at it on its face... it's clearly self-serving bs. It doesn't seem to be a problem for any other car company, so I'm a bit confused as to why it's such a problem for Tesla. Of course, the obvious answer is that Tesla is cheap and doesn't want to pay to have a team that would have sufficient bandwidth to do what every other car company and self-driving system is doing...

Nomentatus
You still have to address the argument he makes that the net effect on risk of keeping the lidar was in fact negative. You're just gainsaying it. That's not a contribution to any discussion.
freejazz
I have no doubt that might be the case; as I already explained, it's obviously due to the fact that they are cheap.
Nomentatus
So you're again gainsaying his explanation, but gainsaying says nothing that helps anyone else make a decision. How are the rest of us who aren't psychic to know he's lying?
freejazz
I'm not gainsaying his explanation. You don't seem to understand that a function of his explanation is what Tesla is willing to invest and their ability to recoup in a sale of the car.
Nomentatus
Oy.
freejazz
Pot, kettle, black.
Nomentatus
Circling, circling, circling - but no specific analysis, citations, or criticisms of Tesla's stated logic. Gainsaying isn't a positive contribution to a conversation.
whiddershins
He's not talking about costs - money. He's talking about costs - engineering.

The point is that more information is not always better. It can instead muddy the waters. It can create confusion.

watwut
> It is not sufficient if, in aggregate, self-driving cars have fewer accidents

It would be sufficient if it would be the case. With actual proof.

The reality is that in limited, abstract situations, self-driving cars may have some advantages. But that is all we can claim. And when self-driving fails, somehow the human is always said to be the cause.

bumby
I disagree, but mainly because of the way humans perceive risk.

From a public standpoint I don’t think it’s sufficient, because there’s an inherent lack of trust in an automated system. With ape-driven systems we have a certain amount of trust because we can more accurately intuit what the other ape is reasonably thinking. This is not the case with autonomous driving, which leads to a wider amount of uncertainty. Not unlike how we are intuitively less trusting of someone who is legitimately “crazy” even if statistically we can’t show they are more dangerous.

paulryanrogers
AI cars can also get a software update at any moment. Human drivers won't change behavior en masse overnight.
bumby
You’re right, but that still misses the point. More frequently updated software doesn’t make people intuitively trust it more. For some, it can do the opposite by making them question why it needs to be updated so much to begin with.

Humans don’t generally measure risks statistically. They do so emotionally and with lots of cognitive biases. You don’t alleviate that with more and more facts, unfortunately.

paulryanrogers
Sorry if I was unclear. Software is more unpredictable in my view, both because of fleet updates and it being so fundamentally different than humans. (On the whole, I realize AI and ML could work similar to some human processes.)
bumby
My fault, your “also” should have tipped me off to what you meant.
masswerk
Having seen their AI Day, I supposed this was all about a unified, pseudo-visual voxel representation – and especially about generating scenarios. Apparently these have become a crucial part of the system and generating a broader variety of sensor data would be a considerable liability.
aeternum
The dynamic range is the reason Tesla now counts photons rather than using traditional camera processing. They basically remove the concept of exposure entirely and simply pass the sensor photon counts to the neural net.

This approach is not only simpler, as it removes photo processing/encoding, but the result is that the NN can operate with a very high dynamic range similar to the human eye, and in many cases can be sensitive at the single-photon level.

lambdasquirrel
Counting photons won't keep a camera from being "jammed." Unless you are using a physically perfect polarizing filter, such that each pixel on the sensor only receives photons from the exact angular window, traced back through the lenses, you have a camera that can ultimately be "jammed."

The human eye isn't so great on those terms. But humans can raise their hand to block the sun if it's straight at our eyes.

petilon
But it doesn't appear to be helping. Here's an example accident where depth data from Lidar would have helped:

"Tesla later said that during the crash, Autopilot’s camera could not distinguish between the white truck and the bright sky."

https://www.nytimes.com/2021/12/06/technology/tesla-autopilo...

mensetmanusman
Humanity doesn’t know how to solve this yet, so it’s hard to say whether it is helping or not.
Gigachad
We already have the solution. LiDAR.
ALittleLight
If you could use LiDAR well enough then it would solve the problem. Of course, if you could use vision well enough it would solve the problem too.
touch_abs
The big limit of LiDAR is cost, more than anything. There have been dozens of public driving trials where, at the functionality level, the answer has been positive (apart from traffic lights, the bastards), but nobody wants to buy a solution with a six-figure BOM before integration.
pokerhobo
Lidar also has problems in rain, fog, snow, etc… FLIR would actually be better
mensetmanusman
If LiDAR was a solution, we would have driverless LiDAR based vehicles. No one has solved driverless though.
aeternum
The crash you referenced occurred in 2016 when they were using radar on the cars and I don't believe they were yet using raw photon counts nor did the NN have any voxel-based memory as it does now.
moffkalast
> any voxel-based memory

Haha, any WHAT?

Seriously though do you have any more info on that, it sounds intriguing. Where and how do voxels come into play in a 2D NN?

aeternum
It is pretty cool: https://youtu.be/ODSJsviD_SU?t=4355

They transitioned from 2D to 3D a couple years ago, major transition but it does seem like a critical step. We live in a 3d rather than 2d world.

whiddershins
The replies to your comment don't seem to understand you at all. in the video link here

https://youtu.be/ODSJsviD_SU?t=4424

he clearly states 16x dynamic range as a result of direct photon processing.

emkoemko
how do you count photons continuously? what... this makes no sense, if you pass "the photon count" you just did an exposure... also how does a photodiode count photons?
moralestapia
Does it have electrolytes as well?

Nice tech and single photons and whatnot but it still runs into things that a radar with some really simple code wouldn't. ¯\_(ツ)_/¯

davidgay
> They basically remove the concept of exposure entirely and simply pass the sensor photon counts to the neural net.

That sentence does not make sense. There's no such thing as a count without a corresponding interval that count occurred over. That interval is the exposure.

You can of course do lots of (very) short exposures to avoid sensor saturation. That's "just" a movie at a very high frame rate. And then you can post-process this in lots of exciting ways, align the frames, average them, etc, etc.

aeternum
Yeah that's fair. A CCD sensor basically converts individual photons to electrical charges. What Tesla has said they've done is thrown away all the traditional image signal processing & post-processing which often includes a lot of exposure-related averaging.

You're right though that we don't typically use real-time neural networks that operate based upon spike rate, so an interval needs to be chosen for photon counting which could be considered a kind of exposure and it is critical that the interval be short enough to avoid saturation.

Maxion
Lol, this doesn't make any sense. The dynamic range of a fully sunlit California highway at noon in the summer (i.e. the brightness of the darkest vs the brightest spot) is wayyyy higher than any existing sensor can capture. You cannot ignore exposure; you have to choose which part of the scene you want within the brightness range that your camera sensor can capture. You will have areas of the scene that clip, in other words areas of the scene that are pure black or white with no data.

You can do bracketed exposures, but that's literally the opposite of ignoring exposure.

aeternum
Just keep the duration low so that you never saturate the sensor even in bright sunlight and let the NN do the summations.

At a fundamental level it is somewhat akin to bracketing except all that HDR processing/frame matching is performed within the NN rather than a traditional image processing stack.

The NN is better at this anyway since it must already be performing camera/pose motion tracking to correlate what it's seeing from frame to frame.
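A toy numerical illustration of the short-exposure idea discussed in this sub-thread: many short exposures summed downstream avoid the clipping a single long exposure would suffer. This is not Tesla's actual pipeline; the sensor values are made up, a plain sum stands in for whatever a network would do, and per-exposure read noise is ignored:

```python
import numpy as np

FULL_WELL = 4000  # sensor saturates at this many electrons per exposure
# Photon arrival rates (per unit time) for three scene regions: shadow, road, sun glare.
scene = np.array([50.0, 2_000.0, 200_000.0])

def capture(rate, exposure):
    """Expected electrons collected in one exposure, clipped at saturation."""
    return np.minimum(rate * exposure, FULL_WELL)

long_exposure = capture(scene, 1.0)                         # one exposure of duration 1.0
short_sum = sum(capture(scene, 0.01) for _ in range(100))   # 100 exposures of duration 0.01

print("single long exposure:", long_exposure)   # glare region clips at 4000
print("sum of 100 short ones:", short_sum)      # relative brightness structure is retained
```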

TheLoafOfBread
This whole question about vision boils down to "humans don't need it, so cars shouldn't need it either." The problem with this statement is that humans don't have wheels to move around; they have legs. But wheels are ridiculously simple compared to four legs tapping along a highway at 160 km/h. Same for birds: they don't need jet engines to fly, but imagine an Airbus A380 flapping its wings, and what kind of complexity you would need to flap through the air at 800 km/h.
AtlasBarfed
soooo... you're agreeing that non-vision isn't necessary since the control domain is so much simpler?

I personally think they should use as much data inputs as possible: radar, IR, LIDAR, mesh networks, fixed route information.

Where tesla went particularly wrong IMO is ignoring some sort of route-based chunk information which is how humans navigate. IIRC Elon said something to the effect of just having an algorithm to work everywhere.

Humans use the basic algorithm "stay in lane, drive forward" and then decorate with signs, knowledge of curves, locations of potholes, dangerous low-viz corners, likelihood of surprise stopped traffic, obscured driveways, general character of neighborhoods, road purpose. Weather. Windy sections, icy sections, light availability anomalies. What type of vehicle. Repair state of vehicle.

A general AI algorithm will never be able to properly account for flavors/tags/chunk info on routes. Especially since cloud precomputation is so available these days.

Anyway, while recognizing that Tesla's "Fully Self Driving" is not as advertised, and we are a ways from self driving for any statistical measure of superiority to a healthy aware adult, it is still damn impressive what FSD vids show.

Do AI driving systems try to make "subsystems" of AI networks to reduce inputs to various higher-level inputs, or do other just throw a ton of inputs at a big ass network and just let the entire system rise from the soup of information?

latchkey
> Humans use the basic algorithm "stay in lane, drive forward"

If you've ever driven in Vietnam, that is so not true.

pclmulqdq
Hell, even in the northeast US (particularly the cities) this isn't true. Self-driving cars today seem to have a dogmatic focus on California-style driving.
113245
The Tesla AI day videos [1] go into some detail about this. They use multiple networks that are dedicated to specific tasks.

[1]https://www.youtube.com/watch?v=j0z4FweCy4M (2021), https://www.youtube.com/watch?v=ODSJsviD_SU (2022)

ajross
But the question at hand is system control, not locomotion! You're not asking the automation to walk (well, I mean someday we will, but Teslas have wheels), nor the aircraft to flap. We want the automation to do what a human pilot would do. And that works with eyes.

No, I think this argument is largely correct. And frankly settled: anyone who's driven recent FSD beta versions knows very well that the cars "see just fine". They don't hit anything, they see and avoid obstacles. Frankly they're much more observant than humans are, my car will twitch when pedestrians turn as if they're going to enter the road (where human drivers mostly don't notice, and if they do they ignore it). What problems still exist are in planning: things like sign reading, lane selection, etc... still need some work. But collision avoidance just isn't an issue. It isn't. The LIDAR folks were wrong, basically.

(I will admit though that I'm a little sad about the removal of the ultrasound sensors though. It's true the autonomy probably doesn't need them, but I really like having the chimes to guide parking and garage maneuvering.)

threeseed
> The LIDAR folks were wrong, basically

I think your mistake is thinking LiDAR exists to solve the happy day scenario. It doesn't.

Vision is sufficient for the majority of use cases. Where LiDAR comes into its own is in the edge cases because it almost guarantees accurate bounding box detection. Which is where vision is at its weakest.

So I want to know what does FSD do when it sees a billboard of a person or when it is seeing a new object for the first time.

mgoetzke
As long as the billboard is not moving into the street or standing in the middle of it, what would you expect it to do?
elteto
> The LIDAR folks were wrong, basically.

This is far, far from settled at this point.

ajross
No, it's over. Look, the LIDAR value proposition was necessarily "Yes, we're outrageously expensive and involve major tradeoffs in physical design of the vehicle, but vision can simply never do what we do". And... vision does. It does, every day. On hundreds of thousands of cars.

In point of fact FSD beta vehicles are out there every day in environments where LIDAR has never been deployed, nor likely ever tested. And we're not seeing clear "it can't do this" failures. Anywhere.

The closest you're going to get to evidence against vision are things like that "It Hit A Kid!" stunt from a few months back that turned out to be basically faked[1].

[1] At least the perps went silent and no one was ever able to reproduce. I mean the whole idea was ridiculous: my car twitches at pedestrians, including kids, including my kids, every day. It literally draws them on the screen.

mynameisvlad
Do you actually have FSD Beta? I do. It's not anywhere near the level of polish that you seem to imply it is. It gets things wrong all the time. Turns are downright dangerous.
archagon
If it was over, Tesla would not be the only company relying solely on cameras.
threeseed
> And we're not seeing clear "it can't do this" failures. Anywhere

Do you have evidence of the number of accidents, disengagements etc by region ?

Because you're making awfully definitive statements about FSD safety.

ra7
Tesla doesn’t release any data. But there are community trackers [1] that put the disengagement rate at single-digit miles per disengagement. In comparison, Waymo and Cruise had a rate of roughly 30,000 miles per disengagement according to CA DMV data. That’s how much worse Tesla FSD is.

[1] https://www.teslafsdtracker.com/

fzeroracer
> The LIDAR folks were wrong, basically.

According to who? Tesla? Because Tesla has a vested interest in trying to prove that they're right even if they're obviously wrong. That's why they constantly try to downplay failures, software issues, device issues etc.

I'm very confused by the attempts to discredit the usefulness of LIDAR. It's another tool you can use to improve the accuracy of your model. Sure, you can use a screwdriver, flip it around and use it as a hammer. But if you need to deal with nails, it's better to grab a hammer instead.

ajross
The causality goes the other way. The LIDAR claim was that Tesla's vision approach couldn't work. It did.
fzeroracer
Who made this claim? You keep making these grand statements but without linking to people or providing any proof.

Like there are going to be some environments where Tesla's vision is going to struggle, that's just a fact because you're relying on a more linear set of data. That's why you incorporate as many data points as possible for reasons other commentators have brought up. And I'm confused by how you qualify it as 'working', given we've seen multiple issues which are directly related to their vision approach.

fooblaster
Tesla has not solved self driving, and by all accounts never will with their existing compute and sensor stack.
martindbp
Let's do a thought experiment: if Waymo could have seen 10 years ago how well FSD perception works today, would they have invested so heavily into LiDAR? Maybe the answer is still yes, because with the low volumes of vehicles they have they can afford to put it in, but it's not clear cut. If you could show FSD to ML/CV engineers 10 years ago their minds would have been completely blown. My mind is still blown by how well it actually works.
ra7
Tesla’s vision approach doesn’t work. There is a reason it requires a driver to actively prevent crashes and has a disengagement rate in the single digit miles [1].

All over this thread you keep making grand statements that “it works”, which is just completely false. It’s simple — if it worked, there would be no driver.

[1] https://www.teslafsdtracker.com/

Slartie
> my car will twitch when pedestrians turn as if they're going to enter the road (where human drivers mostly don't notice, and if they do they ignore it)

As long as those pedestrians DO NOT actually enter the road after those turns, any "twitching" of your car in response is an ADDITIONAL SAFETY PROBLEM, because other drivers might notice the erratic movements of your car and do erratic things as well, which in the end might result in accidents that wouldn't have happened had your car not "twitched".

Especially "twitchy" AIs like that of your car might very well "re-twitch" on noticing your car doing small, but erratic and rapid changes in behavior, thereby initiating a "twitch escalation spiral".

freejazz
Disregarding everything else about your post, which was better addressed by others, I'm amused that you think the FSD being twitchy reflects safety.
Gordonjcp
The thing is, they don't see as well as humans. They don't respond to changes in the environment until a car is actually in the middle of changing lanes.

It's like being driven around by a drunk person - the reaction happens loooooong after the action that causes it has started.

jeromenerf
> We want the automation to do what a human pilot would do. And that works with eyes.

Humans can’t really turn senses off, so they have coffee when driving. Touch and hearing are quite important to “read the road”. Equilibrium too.

johnwalkr
Humans work with much more than just eyes. We subconsciously move our heads in case of uncertainty in the stereo vision algorithm and have pretty good IMUs. And yet, everyone has the experience of wrongly focusing on a repeating vertical pattern (vertical blinds or coiled cord for example) and getting disoriented. And every experienced driver has experienced at least some of the following: a moment of glare from a wet road, driving into a sunset, snowy road with curves in flat lighting conditions, dirty windshield/headlights/backup camera.

All of those are challenging for humans and probably even more challenging for computer vision with cameras only. But except for the last point, all are obviously improved by lidar.

wnkrshm
A good human driver also gauges the limits of their own experience and 'phase transitions' into a more cautious mode of driving.

Is that something the algos can do? Infer the familiarity of the situation?

mgoetzke
The planning absolutely should take that into account and it does somewhat already. When it is really tight for example, it can slow down to a crawl.

It should do that in many cases: wet roads, buses with open doors, buses in general maybe, blind corners (it does that already to some degree), many people nearby, etc.

clouddrover
> You're not asking the automation to walk

Tesla should aim for parking first. Teslas do poorly at self parking:

https://www.youtube.com/watch?v=nsb2XBAIWyA

enragedcacti
> No, I think this argument is largely correct. And frankly settled: anyone who's driven recent FSD beta versions knows very well that the cars "see just fine". They don't hit anything, they see and avoid obstacles.

Only if you ignore times where intervention stopped it from hitting something, times where it did actually hit something, massive amounts of jitter and popping in the visual output, phantom braking, etc.

Unless of course "recent" means n+1 where n is the version that crashed into something.

Collision with bollard in Feb 2022: https://www.youtube.com/watch?v=sbSDsbDQjSU

attempts to plow through cyclist Feb 2022: https://www.youtube.com/watch?v=a5wkENwrp_k

almost crashes into tram (can't gauge speed or direction?) Jun 2022: https://www.youtube.com/watch?v=yxX4tDkSc_g

Crashes into curb Aug 2022: https://youtube.com/shorts/8Mh1GjejdsI

Phantom brake Sep 2022: https://www.youtube.com/shorts/5v6j_oL7S-g

Almost colliding with bridge pillar 2 weeks ago: https://www.youtube.com/watch?v=5CMYkDWaqn0

Crashes into various objects in testing 2 weeks ago: https://www.youtube.com/watch?v=yyDxqEzV5Zc

codeflo
Exactly. The way biology solved something may not always be the best way to do it with technology, because the constraints are so different. And to be more blunt, I think none of the problems where technology surpassed human performance were solved by doing it the exact same way. From locomotion (legs vs. wheels) to playing chess (strategic intuition vs. billions of calculations).
m463
> "humans don't need it so cars should not need it too"

I think of parking and I'm reminded of "the camry dent"

https://duckduckgo.com/?q=the+camry+dent&iax=images&ia=image...

yarg
Human binocular vision is what has been used to drive cars up until now, so it can be done (with a few thousand million years of iteration).

Ideally cars will be self-driving using only passive sensors - but I do think that Musk/Tesla completely missed the value of active sensors in training.

gibolt
Pretty sure humans haven't been striving for drivers licenses for millions of years...

Tesla does use Lidar on a small number of test vehicles for assessing ground truth. However, they have built enough of a data pipeline and fleet data acquisition to use repeat clips to determine ground truth better than human labelers.

tibbydudeza
But the "system" is so adaptable from bipedal locomotion, spotting predators or prey and identifying unfit food, figure out social hierarchy, human facial expressions that driving a car is easy.
Nomentatus
No, he precisely said that the difference Lidar made was tested, and the delta (difference made) was quite small; not enough to outweigh the downsides. Elon has noted that humans do well, and that's relevant, but that observation was also tested, re lidar.
NBJack
Basically everything he said as a justification (sourcing, firmware, etc.) applies to every sufficiently advanced part of the vehicle. By that logic, they should not be using touchscreens on the center console, etc.
Nomentatus
If the safety delta was low after removing them, as 'tis with Lidar, absolutely.
paulryanrogers
It's also possible their LIDAR implementation was poor. Waymo uses LIDAR and has fewer incidents per mile.
Nomentatus
Good point. I'm pretty sure it was poorer, Waymo really goes all-in on Lidar, they still top-mount it don't they? Waymo is also solving a much more constrained problem, with far fewer vehicles, so fewer accidents doesn't surprise me.
freejazz
They are doing exactly what Tesla isn't willing to do and are being responsible in exactly the way Tesla isn't.
Nomentatus
But it's an expensive dead-end for most purposes, esp re the insane mapping. Ok for city robo-taxis. Doesn't solve the real problem to be solved.
freejazz
Cars drive in cities, last I checked.
Nomentatus
I guess you're saying that "cities are mapped, dude." But not mapped in anything like the way Waymo maps for this purpose: if I heard right, yesterday, mapped to the centimeter and very frequently updated to deal with any changes of any kind!
freejazz
Driving in cities is a real problem that will really need to be solved.
gibolt
Waymo has HD maps that require regular 'trawling', just like Google Street view. They are also very conservative in turns, generally avoiding unprotected lefts. They also are much less human-like.
croes
>and the delta (difference made) was quite small

But why? Because LIDAR doesn't help much in general, or because the Tesla engineers aren't good at using the sensor data?

Same with the manufacturing.

Sounds to me like Tesla can't handle complexity. And if they can't handle the complexity of manufacturing, they surely can't handle the complexity of full autonomous driving.

Nomentatus
I don't think interpreting the lidar data and integrating it is a super-tough problem, it kinda comes in 3D, unlike stereoscopic vision. So I take it this means that the Lidar data rarely differed and it rarely mattered when it did.

Elon's companies have a long history of handling complexity very well (even, you know, actual rocket science) precisely because they relentlessly simplify everything they can. Raptor one is more complex than Raptor two, but I'll take the latter any day. Nobody else has a full-flow rocket engine. Many previous attempts were swallowed by the complexity of the task. Even Raptor one looks like a rat's nest - but unlike other attempts, it worked.

Tesla's manufacturing margins are far out in front of any other car company (see David Lee On Investing podcast.) Having simpler, larger parts made by much larger pressing machinery is a big part of why. Looks like they are (now) handling the complexity of manufacturing very well.

dmix
> This whole question about the vision boils down to

Is that really what the problem boils down to? Or how it was decided? Or are you just questioning a common meme that comes up in internet debates about car AI?

elteto
More importantly, we have a tremendous data "engine" processing input from our senses. So assuming for a second that cameras match what our eyes can do, you still do not have a processing engine on the level of our brain to make sense of those inputs.
woeirua
Andrej's argument about more sensors adding entropy strikes me as disingenuous considering that in the next question he then says that Tesla's biggest advantage over everyone else is "the fleet", which clearly introduces orders of magnitude more entropy into the system than anything else. Can you imagine the infrastructure required to gather video from "the fleet" anytime a car sees something unexpected? How about diagnosing what went wrong in that specific instance? How many thousands of these cases do they see everyday?

Given the progress of the FSD "beta" to date, and the fact that Andrej _left_ Tesla, I'd wager that he knows that this approach is a dead end, but he won't say that because he'd get himself in hot water with Elon.

ralfd
Assuming Karpathy is lying is quite a hot take to disregard his opinion.

No. He makes it clear that he is very convinced about it. There is no relativism, no weasel words or couching in maybes. He could be wrong, of course, but he believes in what he is saying.

woeirua
Here's the thing though, if you're Karpathy and you are 100% confident that Tesla's approach is on the cusp of delivering full L5 autonomous driving, then why leave? Surely, Tesla would become the most valuable company overnight if they could actually do it. He would be showered with accolades when they finally finish it. To leave, _before_ any of that happens, says it all.
nicbou
> no weasel words

The video starts with him reframing the question instead of answering it

epgui
I regularly reframe questions when I think they're not the most interesting. It's a very common thing to do, especially in academia, that doesn't particularly indicate deception.
pyinstallwoes
Your comparison is a little short-sighted because your example would require fleet * additional-sensor entropy rather than the fleet on its own. And if the fleet on its own is adequate, then anything extra is simply inefficient at best.
woeirua
No, my point was that the fleet itself adds entropy to the system because it creates a lot of noise for engineers to track down edge cases and other weird things that happen when you're collecting data from a lot of "sensors", i.e. cars, at once. The exact same argument Karpathy made about why they dumped ultrasonics and radar.

The problem is that so far, Tesla has yet to demonstrate that the fleet _is_ sufficient. IMO, if the fleet was enough to get to L5 autonomous driving, then they would already be there.

Nimitz14
You don't understand the term entropy.
dmix
One is infrastructure entropy and the other is software engineering entropy (we're still talking about data from one type of sensor, just at larger scale).

Most tech startups have 10x+ more problems with the engineering part than the infrastructure/ops part.

Also this is one person's perspective from a large team. His answer might be biased because he's an engineer and I doubt his was the only voice in the debate.

dreamcompiler
It's obviously a stupid decision to remove a direct source of range data (radar and ultrasound) in favor of an indirect one (vision).

But on second thought this doesn't bother me that much because Tesla FSD is absolute garbage even with radar (and I don't think Tesla will get away with selling the FSD snake oil for much longer), so if vision-only is good enough for the base-level lane-keeping autopilot functionality and it makes the cars cheaper, maybe that's a good thing.

epgui
Even if you were right, there's nothing "obvious" about that.
dmix
Even though the risks are high the outcomes will always keep them honest. Whether they like it or not.

This isn't like Facebook continually releasing a product that sucks but people will use anyway.

Tesla is constantly working against the clock and everything they do has real world consequences. There are multiple gov agencies watching over it at all times. Of course there's lots of people with far higher risk tolerance than is being exhibited but if it does turn out badly IRL this will get shut down pretty quickly.

The good news is Tesla has the ability to cripple this feature remotely without a costly/lengthy recall if that does happen.

AlotOfReading
Outcomes have to be truly and utterly fucked for "new" types of problems to be noticed in the automotive industry. Take the Toyota Unintended Acceleration case, where completely negligent software quality took over a decade and at least 89 lives to be noticed and (partially) rectified.

Regulators have keen noses for very particular types of issues and rely heavily on manufacturer judgements on a lot of the rest. Issues that aren't in any of those fairly narrow categories need to be extremely public or extremely egregious to attract their notice.

dmix
I’ve read into the Toyota case in the past and it doesn’t seem like it was that obvious of a regulatory failure, where you could come to such a conclusion.

Did I miss some good article covering it? Because everything I’ve read with the benefit of retrospect has been pretty critical of a lot the alleged ‘cases’ reported in that 89 number.

Are you sure that’s the example you want to use for pre-emptive regulatory failure? And do you think Tesla could demonstrate failures at a similar rate (assuming they were legit) and get away with it? Because I don’t.

9935c101ab17a66
Somewhat related: I read into the case about a year ago and was shocked at how little coverage there is. There isn’t even a dedicated Wikipedia article — there’s only a one paragraph summary on the general article “Sudden Unintended acceleration” that covers all causes and prominent cases. 89 people died, and Toyota knew about the problem for years! Not only did they not do anything, they hid it. I just don’t get why more people didn’t talk about it. It’s not like it all happened in the distant past — Toyota was fined by Justice dept in 2014!
P_I_Staker
While that might be the case, Toyota has been found very negligent in how they develop software. It's an epidemic in the industry.

I think OP is just marveling at how so much can be so obviously wrong, and yet not garner attention and criticism. They should be exposed when they audit the software process, and really taken to task.

I think OP might be acting a little unrealistic, in that it seems like the world just works this way. However it's quite shocking to see the level of carelessness that can go into critical software.

AlotOfReading
I think it's a good example. Regulators had failed to note the issue despite years of reports (even considering the typical noise). NHTSA closed multiple investigations on the subject and even downplayed their findings when the investigative reports were released.

I used it as an example mainly because it's so public and the underlying causes were so egregious, not to make a specific comparison with Tesla's behavior.

I've actually filed whistleblower reports in the past (not at companies I've admitted a public relationship with on this account, if you want to check) that didn't lead to anything as far as I'm aware. The bar for investigation is apparently higher than my personal limits.

nelox
“The world is designed for human visual consumption” and “[vision] has all the information you need for driving”. While vision may be sufficient, I would say that other senses, such as hearing, touch and smell, augment driving very well, especially with regard to situational awareness. For example, the sirens of emergency vehicles are typically the first indication of their presence, which often can be felt as well. The wail of a tornado siren, similarly. Loud throbbing motorcycles do much to improve rider safety simply by broadcasting their presence. At railway level crossings, drivers should slow down, look and listen for oncoming trains. The smell of wildfire or bushfire smoke provides enhanced warning of nearby danger. So to say vision is sufficient does not fully take into account the driver experience, especially where safety and situational awareness are concerned.
djleni
THANK YOU. Reading this thread was getting to me because so many comments say humans drive eyes only.

I use far more than just vision driving:

- sound, for emergency vehicles, detecting vehicles outside of my field of view if my windows are down or the vehicle is loud, tire sound (especially in snow and rain), engine sound (more feedback in snow or ice about what my tires are doing)

- touch (steering feedback, gives information about grip in some circumstances)

- acceleration (can feel if the rear tires break loose in a turn on snow or ice, or if I’m sliding while braking)

And probably many more

mola
We even use Doppler! Our hearing is capable of sensing movement (speed and acceleration) using the Doppler effect. Our hearing also has a remarkable ability to locate the direction a sound is coming from.
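A quick back-of-envelope for the Doppler point above, using the textbook moving-source formula; the 440 Hz horn and 30 m/s speed are illustrative values, not measurements:

```python
# Apparent pitch of a 440 Hz horn on a car approaching vs receding at ~30 m/s
# (~108 km/h), heard by a stationary listener.
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 C

def observed_frequency(f_source, v_source, approaching):
    """Doppler-shifted frequency for a moving source and stationary observer."""
    v = -v_source if approaching else v_source
    return f_source * SPEED_OF_SOUND / (SPEED_OF_SOUND + v)

f = 440.0
print("approaching:", round(observed_frequency(f, 30.0, True), 1), "Hz")   # ~482 Hz
print("receding:   ", round(observed_frequency(f, 30.0, False), 1), "Hz")  # ~405 Hz
# A shift of roughly 18%, well over a whole tone, which is easily audible.
```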
AlotOfReading
It's worth noting that most autonomous vehicle solutions have dedicated microphones for emergency vehicles and sensors that can detect slip. I had a little dashboard measuring wheel slip at <company>. It mainly ended up mapping train tracks and potholes.
leoh
where company := gm ?
AlotOfReading
I've worked for multiple companies in the industry.
DebtDeflation
Do they detect the sound of a motorcycle coming up on them? The sound of a car near you with a flat tire, which simultaneously tells you that they will likely make sudden movements to pull over to the shoulder and also to be aware of road hazards that can flatten your tires as well? The sound of brakes locking up tires several cars ahead, which tells you to slow down even before the brake lights on the car in front of you illuminate? The sound of detritus hitting the underside of your car, which tells you there's likely a larger object ahead that you can't see? There are all sorts of sounds that you're constantly hearing and subconsciously reacting to.
stonogo
One thing that Tesla engineers seem to keep forgetting is that human eyes are not fixed into a steel block. We can tilt our heads, crane our necks, hold hands up to block glare, and so much more. Human vision is responsive, adaptable, and not comparable to some cameras bolted to a car.

And even with all these advantages, tens of thousands of people are killed in car crashes every year. Some people make a compelling argument that this is evidence that human vision doesn't have all the information you need for driving. While I don't go that far, I do think autonomous driving has a long way to go.

hbarka
Intuition + other examples tell you that radar and ultrasonic sensors work. Why do we twist ourselves to believe otherwise?

Elon removed the radar and ultrasonics for the simple fact that their supply-chain logjam was screwing up the manufacturing schedule. They also realized that the profit margin can be sustained in an inflationary environment by simply removing these parts [1]. “Oh, we were going to remove them anyway, because humans can see fine with just eyes and no radar, so why can’t cars?” Tesla then turned up the AI/vision hype lever once more to toss out another shiny tech object and get buyers to ignore the fact that there is a regression of features in the newer cars going forward.

[1] https://youtu.be/LS3Vk0NPFDE

mgoetzke
So all those engineers are lying when they talk about this topic?
9935c101ab17a66
Uh, which engineers are you referring to?
oxplot
Here's the summary (mixed with observations from Munro and past Tesla presentations):

- Costs money: the physical sensors (a dozen of them), wiring them up, assembling them, maintaining inventory, writing code for them, etc.

- Time spent maintaining and improving the software stack for the non-vision sensors, as well as the effort needed to fuse their data with vision, takes away from focusing on vision alone. It also holds back vision in relevant areas.

- Existing non-vision sensors used by Tesla are orders of magnitude lower fidelity than vision. Historically (as was the case with radar), this led to vision essentially having to override radar because vision just performed much better (see AI Day 2021).

My take:

As with any new tech, it likely sucks at the start (think HDDs and SSDs, and how a mechanical thing with lots of moving parts was way more reliable than SSDs at the start). However, by moving past the local maximum, you get to innovate better and faster in the future.

In the case of ultrasonic sensors, they are for low-speed cases anyway, and most people are fine without them. The majority of fatalities and injuries happen at higher speeds.

rootusrootus
That's great for them. But when I'm shopping for a car, I get to choose between a manufacturer that installs the extra sensors and seems to be able to get them to work, and Tesla.

Used to be that Tesla was blazing a trail and if you wanted a good EV, that was what you got. Now, if you want the best EV, it's usually not going to be a Tesla. And I don't see that they're making any decisions that will regain them that title. The incumbent manufacturers are quickly proceeding to eat their lunch, just like many of us predicted would happen. Turns out the hard part of making a successful car isn't the drivetrain.

oxplot
> Now, if you want the best EV, it's usually not going to be a Tesla.

Would love to hear what you consider "good" and what specific EV ticks the most good features that a Tesla Model 3 doesn't.

rootusrootus
Actual interior controls, to start. Tesla needed to save money on the interior and the giant consumer grade LCD was a really great way to do that. But now that the incumbents have entered the game, the standards are going up. The Ioniq 5 has two screens, one in front of the driver, and actual climate controls. The Mach E has a screen in front of the driver, and actual tactile controls as well. This is basically true of all non-Tesla EVs now.

CarPlay? Almost all cars, expensive or cheap, have CarPlay. Except Tesla. And you can throw Android Auto in there too.

Rain sensing wipers. That work, I mean. A firmly established technology at this point that Tesla can't make work. Then they insult you by including substandard manual controls so working around it is a chore.

360° surround camera. You can say that Tesla might have enough cameras on board to do it if they want. But they don't seem to want. So that's a feature that does not exist on a Tesla that I can get on a number of other EVs.

Parking sensors. Blindspot warnings (actual working sensors like most regular cars have now, not the sometimes-it-notices camera-based warnings my Model 3 would occasionally give me). Radar cruise control.

The best hands-free driving is SuperCruise, and I can get that on a Chevy Bolt EUV, which costs something like 20 grand less than a Model 3.

I can get OTA updates from other manufacturers too. Notably, without the 'screw you' attitude that Tesla has, where they happily turn off customer features they don't feel the customer should have. Everything from taking away battery range without permission to turning off radar hardware that is already installed. Tesla is a consumer hostile company.

You have a point with the charging network, though that is quickly becoming moot since the non-Tesla networks are collectively growing faster.

Range is good on paper for Teslas, but my real-world experience is that my Model 3 never got anywhere near its EPA rating, while my wife's Bolt usually exceeds it. As a practical matter, my P3D had less actual range than the Bolt even though the latter was only rated at 259. At least the Supercharger was faster, which meant I spent 10 fewer minutes charging on our regular 300-mile round trip to Grandma's house.

I'm in the market again for another EV, and I can't find a single compelling reason to go with Tesla. The Model 3 is arguably worse now than the one I bought a few years ago, whereas the competition has been adding new features every year.

lnsru
360 degree camera on BMW i4. Overall build quality too.
oxplot
I don't know if you're addressing what makes a good "EV" or just listing some features you personally care about. If the former, then a 360 degree camera and build quality have little to do with an EV. Tesla is a young company and build quality has dramatically improved over the years (see Munro) and will continue to do so. Tesla has close to 360 camera coverage; it just hasn't added it in software. All it takes is a software update on existing hardware. If they haven't done it yet, it probably means they have other priorities at the moment.

Some EV specific advantages of Tesla over competition:

- Range: Teslas continue to lead in range, as a result of the best aerodynamics, best drivetrain efficiency and a vertically integrated design (see Munro - highly integrated components).

- Charging infrastructure: largest available and most reliable (see MKBHD, e.g.).

- Regular software updates improving everything from range, to charge planning (recently added much better estimate based on wind direction, inclination, tyre pressure, weather, etc) to safety, etc.

- and most important of all: Availability and production volume. Best EV in the world isn't worth anything if you can't buy it.

If you look forward and see what Tesla is doing with its manufacturing innovations, it should become clear that things like in-house battery production and single front and rear castings, among numerous other things, have already secured their top position for some time to come.

lnsru
Maybe in the US. Tesla has nothing to throw against some serious cars like the Mercedes EQS or EQS SUV. Hyundai is also catching up. Charging infrastructure isn’t an issue in Europe anymore and is getting better every year. Free Tesla Supercharging at the beginning was a big deal. That’s why I am looking for a used Model X from 2017 right now. 5000€ in free electricity a year is a nice advantage. I also have an open order for a Model Y Performance from the Berlin factory. Sorry, the lead-time shitshow is similar to all the others. Every month my order gets delayed by another 2 months, and has been for months. I ordered only one additional option - black color. Last time I checked, the delivery time was updated to “TBD”. Lol.

Edit: this year's generous funding for new electric vehicles in Germany is about to end, which is why the last months of 2022 are so chaotic regarding lead times.

oxplot
> Tesla has nothing to throw against some serious cars like Mercedes EQS or EQS SUV.

Besides sensationalist comments, do you actually have a list of areas where the Mercedes excels and Tesla doesn't?

I can't take "serious car" seriously - like what, Tesla is kidding?

lnsru
Let’s talk about my future Model X 2017 with free Supercharging. The chassis is a disaster from beginning to end. The car has air suspension, which is nice. But! If you go with the low setting, tires don’t last 10k miles. If you go with the high setting, axles suddenly break. After 60,000 miles all the parts under the car must be replaced. Not funny and not cheap either. The last car I heard of with such nonsense was the 25-year-old Renault Laguna 1. A Mercedes or BMW does not need a chassis rebuild every 100k km.

Btw EQS has greater range.

oxplot
> But! You go with low setting, tires don’t last 10k miles.

It's well known that EVs in general, due to their weight, and especially those with high torque, wear out tyres much faster than their ICE counterparts. 10K miles sounds too low. Would love to know why the low suspension setting on a Model X causes much faster wear, and whether for the same efficiency/range on a different vehicle, the tyres last significantly longer.

> You go with high settings, axles are suddenly broken.

We know media is trigger happy when it comes to the most minor Tesla related incidents and yet, I've never, in the 10 years of following Tesla, heard about this one. Can you link to perhaps a bunch of these?

> After 60000 miles all parts under the car must be replaced.

Again, never heard of these issues and I watch daily Tesla news from multiple sources.

> Let’s talk about my future Model X 2017

I'm puzzled. Despite all the issues you listed, you're still going with a Model X just because it has free Supercharging? You do realize that broken axles can kill you, right?

lnsru
This is a typical thread talking about tyre wear: https://teslamotorsclub.com/tmc/threads/caution-model-x-hidd...

Nice tire on the outside, completely gone on the inside. There are now enough aftermarket parts and repair shops that know what parts to replace and how to align the wheels so that, in the low suspension setting, the car is usable for a long time. Sadly the air suspension's full potential can't be used, but I can live with that. The internet is full of aftermarket parts claiming to solve these problems for all Tesla models. Finding a proper repair shop is hard, however. Attention to detail is not a popular trait among underpaid mechanics.

I can tell you one thing: the Model X has no competition. The Mercedes EQV comes closest, but it's bigger and the range is very poor. The VW Buzz has no 7-seat option and its price is insane. I also don't like companies gassing apes: https://amp.theguardian.com/business/2018/jan/29/vw-condemne...

a-dub
i'm not sure if i buy his argument that the "delta is not big enough." i have some experience with realtime ai systems and i've noticed something interesting about them.

they have a non-smooth capability curve: they can demonstrate proficiency in activities that, in regular computer programs or people, would imply a complete and continuous path of capability mastered to achieve the demonstration. but ai systems are weird in that they can do amazing things, yet have loads of little holes and failure modes along the way.

for example: gpt-3 can write you a shell script that will emit a c program that prints a poem about people you know, but will fail at very basic logic, sometimes.

in light of that, having additional support data like radar or lidar seems like the right move for plugging all those little holes in capability that turn up in real ai systems.

because at the end of the day, when you're driving a car in the real world and lives are at stake, simply interpolating or averaging over uncertainty seems awfully deadly, and the only way to ameliorate that uncertainty seems to be to have multiple redundant sensory systems that can stand in for each other as conditions change. just like us!

Nomentatus
They do "fall off the edge of the world" a lot; but so do human neural networks; I've seen a bad crash as a result of a human simply pulling out of a driveway right in front of a motorcycle, 'cause they're rare. She had tagged the motorcycle as a bike while it was farther away, then boom. Her interpolation (while checking the other side) didn't work, and her averaging over uncertainty didn't work either because motorcycles are rather rare up north, they aren't the average vehicle. I've made a similar error re a kid on a wall (he suddenly jumped directly into the bikepath) but managed to avoid him (my bike zoomed to his left and I tumbled past his right. He wasn't hurt, although I got a severe wrist sprain from throwing the bike to the left.) As a driver I behave very differently around kids on walls, now. It was just an edge case I'd never encountered, and I didn't have enough data to calculate under uncertainty.
api
I see surprisingly little discussion of overall statistics on the safety of self-driving vs humans, and what I do see is often self-reported by companies or by equally potentially biased sources in the media. I’ve searched many times and a straightforward stat seems hard to find.
hwillis
It's certainly not straightforward. You know how 90% of crashes occur within whatever many miles of home? People don't use autopilot while they're cruising around their suburbs or cities.

Companies would have to disclose not just their accidents, but also how much driving is done in different places. Then all those places would have to be compared or at least categorized. It's complex enough that there would definitely not be a straightforward, obviously-correct way to interpret it all.

Since companies do not disclose where self-driving occurs, the whole thing is a complete nonstarter. It's whatever they choose to disclose.

Nomentatus
This does spark a thought in me - at some point Tesla could create a proper test, with a portion of their drivers being asked not to use self-driving for a month, say. Or a free trial for a month that only half of Tesla users not already subscribed (randomly chosen) are eligible for.
touch_abs
There isn't really a way to gather the data free from a manufacturer. Every current self-drive mode has limited operating scenarios; if you want to record an incident you first need the manufacturer to confirm that it was operating, and that it was one of the intended use cases.

Companies are less interested in safety vs a human to begin with; they have an emphasis on not causing any accident they are culpable for.

dreamcompiler
There's still too much apples and oranges for direct comparison. Tesla FSD doesn't make some of the mistakes human drivers do (like dozing off or being drunk) but it introduces new mistakes humans rarely make (like driving under semi trailers crossing the road).
01100011
I still suspect it's because they need to preserve compute resources for vision processing. Sensor fusion is likely eating up too much of their current HW and limiting their progress in other areas. I suspect Tesla will have to admit they need to upgrade the current HW before they ever 'solve' FSD.
vhold
One camera produces millions of bytes of data every single frame; an ultrasonic sensor is useful while producing just 1 or 2 bytes of data in the same time span (the distance to something within the sensor's cone).

So it seems like a totally ridiculous argument that ultrasonic sensors create some kind of data processing overload.

An ultrasonic sensor makes it possible to implement incredibly simple and reliable safety features with well known performance characteristics. Processing an image with ML to produce the same effect has tons of edge cases where it might not work, and nobody knows when it won't work, and every update to the system could introduce regressions.
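
To give a sense of how simple such a feature can be, here is a minimal sketch in Python of a threshold-based proximity alert. The sensor helper is hypothetical and stubbed, and the thresholds are made-up numbers for illustration only:

    # Minimal threshold-based parking alert (a sketch, not any vendor's code).
    def read_distance_cm():
        """Stand-in for one ultrasonic sensor reading, in centimetres."""
        return 85.0  # stubbed value; a real system would poll the sensor

    WARN_CM, STOP_CM = 100, 30  # made-up warning/stop thresholds

    def parking_alert(distance_cm):
        if distance_cm is None:
            return "no echo"
        if distance_cm < STOP_CM:
            return "STOP"
        if distance_cm < WARN_CM:
            return "beep"
        return "clear"

    print(parking_alert(read_distance_cm()))  # -> "beep"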

It's why they had to disable certain features when they got rid of the ultrasonic sensors. Those features may come back some day, but I bet they'll never be as reliable, and certainly won't be as predictable.

https://www.pcmag.com/news/tesla-removes-ultrasonic-sensors-...

01100011
Not saying they create a data processing overload. I'm saying they're fed to a deep learning architecture that must then try to fuse disparate sets of data into a coherent picture. The neural network becomes simpler when you remove that function and just focus on visual processing.
another_devy
I think not using LiDAR would be a good bet. LiDAR, in a nutshell, allows you to give an object seen in vision a position in 3D space and a relative speed, which two human eyes can do very fast. The problem with solving this from vision-based input is in the dataset and its interpretation. Computer vision and AI can’t effectively apply a human driver's judgment even with better cameras and processing power, at least not yet.
snovv_crash
The amount of compute that sensor fusion uses is minuscule compared to running a NN or computing stereo depth maps. Sensor fusion runs in the background of your phone the whole time to power things like [0], for example.

0. https://sensor-js.xyz/demo.html

Nomentatus
Well, yes and no. Integrating the data and adjudicating conflicts between sensors is a real task, too. Also having just two opinions doesn't necessarily help if they conflict, and the lidar is the thinnest source of data. How do you coin flip that? You likely end up just discarding the Lidar's conflicting opinion.
snovv_crash
LIDAR data might be more sparse but it is also more reliable; being an active sensor, it isn't affected by night.

The real way is to use the LIDAR to add to the depth probability distribution in the stereo depth estimation. This way you aren't throwing any data away. LIDAR often gives probabilities as well, for example, and this can be used to eliminate reflections.
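
A toy illustration of what folding a lidar return into a stereo depth distribution could look like, assuming a simple Gaussian model for both sensors (the numbers are made up, not any shipping stack):

    import numpy as np

    # Candidate depths for one pixel and the (unnormalized) likelihood that
    # stereo matching assigns to each -- toy values only.
    depths = np.linspace(1.0, 50.0, 200)                        # metres
    stereo_like = np.exp(-0.5 * ((depths - 12.0) / 4.0) ** 2)   # broad stereo peak

    # One lidar return at 10.5 m, modelled as a narrow Gaussian.
    lidar_like = np.exp(-0.5 * ((depths - 10.5) / 0.3) ** 2)

    # Fuse by multiplying likelihoods (treating the sensors as independent),
    # rather than throwing either measurement away.
    fused = stereo_like * lidar_like
    fused /= fused.sum()

    print(depths[fused.argmax()])  # fused estimate lands near the lidar return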

01100011
You are confused my friend.

Reading sensor data is not the same as feeding that data to a neural network and asking it to form a worldview composed of possibly conflicting sensor data streams (e.g. lidar vs vision vs ultrasonic).

You are somewhat correct that it is quite trivial to read sensor data. For many sensors, there is some work that needs to be done to denoise or clean up the input data. That's not where the story ends, however.

snovv_crash
In order to display the gravity-aligned acceleration, sensor fusion between the gyro and accelerometer has to occur. This is typically done with a Kalman filter, and runs on 1960s levels of hardware. If you look at something like a drone autopilot, e.g. Ardupilot, the sensor fusion is so cheap that they even extended the Kalman filter to also estimate things like sensor bias offsets and earth's magnetic field vectors.

Sensor fusion is computationally cheap. It's just a lot of R&D to do it in a way that leads to a net gain in precision and robustness.
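
To make the "cheap" point concrete, here is a toy one-state Kalman filter fusing a gyro rate with an accelerometer tilt angle -- a handful of floating-point operations per update. All constants are illustrative, not taken from any real autopilot:

    import math

    def accel_pitch(ax, az):
        # Pitch angle implied by the gravity vector seen by the accelerometer.
        return math.atan2(ax, az)

    class PitchKalman:
        def __init__(self, q=1e-4, r=1e-2):
            self.angle = 0.0   # estimated pitch (rad)
            self.p = 1.0       # estimate variance
            self.q = q         # process noise (gyro drift), made-up value
            self.r = r         # measurement noise (accelerometer), made-up value

        def update(self, gyro_rate, ax, az, dt):
            # Predict: integrate the gyro.
            self.angle += gyro_rate * dt
            self.p += self.q
            # Correct: blend in the accelerometer's tilt estimate.
            k = self.p / (self.p + self.r)  # Kalman gain
            self.angle += k * (accel_pitch(ax, az) - self.angle)
            self.p *= (1.0 - k)
            return self.angle

    kf = PitchKalman()
    print(kf.update(gyro_rate=0.01, ax=0.17, az=9.8, dt=0.01))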

01100011
We're talking about radar and ultrasonic sensors here, not accelerometers. We're also talking about feeding them to a deep neural network. Not the same thing. Sensor fusion is not being done with a Kalman filter in this case.
snovv_crash
Radar and ultrasound both give drastically less data than a simple 720p webcam. After postprocessing their output bandwidth is more similar to a 9-axis IMU than a camera.
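
Rough back-of-the-envelope numbers, with entirely made-up sensor parameters, just to show the orders of magnitude involved:

    # Raw data-rate comparison (all figures are rough assumptions).
    camera_bps     = 1280 * 720 * 3 * 30   # 720p RGB at 30 fps        -> ~83 MB/s
    radar_bps      = 200 * 4 * 4 * 20      # ~200 detections x 4 floats x 20 Hz -> ~64 KB/s
    ultrasonic_bps = 12 * 2 * 40           # 12 sensors x 2 bytes x 40 Hz       -> ~1 KB/s
    print(camera_bps, radar_bps, ultrasonic_bps)
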
01100011
Yes, and each time you process the incoming camera data in the neural network you have extra calculations due to the sensor fusion with another data source, regardless of its bitrate. Have you worked with deep learning much?
JaggerJo
This sucks for parking. It is simply (physically) not possible for the existing cameras to see the area directly in front of the car.

So how would this work for parking?

A: Add more cameras so there are no dead areas in front of the car

B: build a model in vector space when driving towards a parking spot and assume blind spots don't change. (still sucks)

oxplot
> It is simply (physically) not possible for the existing cameras to see the area directly in front of the car.

Think about how a human driver does it, given his/her even worse vantage point. They model what's in front/behind the car from afar and remember what's where as they approach it. There are other signals as well, such as continuation of a kerb, etc.

I think people keep forgetting that Teslas run hundreds of ML prediction tasks all the time. Watch recent AI day and their talks about "occupancy network" to get a sense of the car's ability to:

1. Construct a 3D model of its surroundings in real time; 2. Remember occluded sections based on what it's seen previously.
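
A toy sketch of the "remember what you can no longer see" idea (not Tesla's actual occupancy network; the grid size, update rule and visibility mask below are all made-up assumptions):

    import numpy as np

    # 2-D occupancy grid with memory: cells the camera can currently see are
    # updated from the latest per-frame estimate, occluded cells keep their
    # last belief instead of being reset.
    GRID = (200, 200)              # e.g. 20 m x 20 m at 10 cm cells (illustrative)
    belief = np.full(GRID, 0.5)    # 0.5 = unknown

    def update(belief, observed_occupancy, visible_mask, alpha=0.7):
        """Blend new observations into visible cells; leave occluded cells alone."""
        blended = alpha * observed_occupancy + (1 - alpha) * belief
        return np.where(visible_mask, blended, belief)

    # One fake frame: an obstacle seen ahead before it drops below the camera's view.
    obs = np.zeros(GRID); obs[110, 90:110] = 1.0
    vis = np.zeros(GRID, dtype=bool); vis[105:, :] = True
    belief = update(belief, obs, vis)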

watwut
A human driver constantly turns their head toward where they are most likely to hit something.
oxplot
Well, the car has a 360 degree camera view, with far wider coverage than a turning head in the driver's seat.

And more importantly, it sees in all directions at all times.

yreg
They are pretending as if the USS were there only for self driving.

I use them as well!

georgeg23
Indeed the ultrasonic sensors are pretty critical for (human) parking and backing up.
throwaway4good
Let me reframe the answer:

"We removed them because they cost money. And we are trying to make money ... at least right now.

Listen, this pure autonomous self-driving car stuff is never going to work, so who cares if we have these gadgets or not ..."

Animats
From my DARPA Grand Challenge days, I used to have an Eaton VORAD automotive radar. This was an early design - 24 GHz, 1 scanning axis. It could see cars, but not bicycles, at least not reliably. For several months, I had one pointed out the window of my house, looking at an intersection. So I had a V-shaped wedge on screen, and could watch the cars go by.

It's a Doppler radar, so you don't get any info from things stationary relative to the radar, but you do get range and range rate. And the quality of that data is independent of distance. We used it mainly as a backup system for the world model built with LIDAR and (to a very limited extent) vision. The VORAD data could lower the speed limit for the rest of the system, and if a collision was about to happen, it would slam on the brakes independently of the world model.
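
For reference, the range-rate part comes straight from the Doppler relation; a quick sketch using the 24 GHz carrier mentioned above (the example shift is made up):

    # Range rate from Doppler shift: f_doppler = 2 * v_radial * f_carrier / c
    C = 3.0e8            # speed of light, m/s
    F_CARRIER = 24e9     # 24 GHz carrier, as in the unit described above

    def radial_velocity(f_doppler_hz):
        return f_doppler_hz * C / (2 * F_CARRIER)

    # A 1.6 kHz Doppler shift at 24 GHz corresponds to ~10 m/s closing speed.
    print(radial_velocity(1600))  # -> ~10.0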

The big problem with coarse automotive radar is that it can detect targets, but doesn't tell you much about them. Cars, trash cans, and metal road debris all look about the same. There's also a lot of trouble from big flat metal surfaces being mirrors for radar. We were willing to accept slowing down for ambiguous cases until the other sensors could get a good look. Drivers hate that if road-oriented systems do it.

Modern units are up around 70-80 GHz and often have 2D scanning, which is a big help. I haven't seen the output from a modern automotive radar. I was expecting that by now, low-cost millimeter microwave systems (200-300 GHz) would be available, providing detailed images somewhat coarser than you can get with light. You get range and range rate, and you can usually steer the beam electronically rather than mechanically. The technology exists to get high-resolution radar images, but is mostly used for scanning people for weapons at checkpoints. It hasn't become cheap yet.

dane-pgp
I think there's an interesting general optimisation problem here of balancing the accuracy/performance of a software/hardware system, against the goal of making that system easier to iterate on and develop.

Presumably this is a matter of working out if you are at a local maximum or not, and thinking about what properties the ideal solution will have. It also matters if you have other competitors that might be racing towards the ideal solution faster than you, potentially patenting their progress along the way.

friend_and_foe
I remember watching an interview with George Hotz when Comma.ai was young, where he essentially said this as a critique of Tesla. He's a bit of a showman and likes to invite a little controversy when he says things, but I found myself agreeing with his point. It's not surprising to see such a practical company like Tesla face the facts about all these sensors eventually.
P_I_Staker
> such a practical company like Tesla

Where are you getting that from? Tesla has always seemed pie in the sky, and hardly a down-to-earth company at all throughout its history.

I'm basing this on both their public record and their reputation within the auto industry.

friend_and_foe
Tesla created the modern electric vehicle industry. They're innovators, sure, and they push limits, but their priority has always been to actually build. And they do.
60Vhipx7b4JL
From an engineering perspective I would ask: Can your sensor package understand the environment to the required (low) failure rate?

Radar/Lidar/Ultrasonic is going to give you information that your camera systems will not give you. It does not matter if the delta of information is small. If that little bit is required because you can't obtain it otherwise, you still need it.

If you just rely on the fleet, you rely on the things you have seen. What about the objects that you have not yet seen?

post_break
I think this is the real reason: https://www.youtube.com/watch?v=LS3Vk0NPFDE

Cost cutting.

taf2
Probably a benefit, but also imagine the difference in software. You get Boolean logic like: radar says we are gonna hit, vision sensor says nothing there, sonar says nothing there… so the idea of having just a really good single source of truth probably makes a lot of code a lot less complex… I have no way of knowing either way, but from a less-that-can-break point of view this seems kinda good… like many things only time can tell, and at least we have different groups of people pushing on different potentially viable paths forward so that we can soon hopefully know if self driving is possible (wide scale) one way or the other
nova22033
At 2:05. Suddenly you need a column in your sqlite telling you what type of sensor it is....

Seriously? This is a major technical challenge?

danpalmer
The challenge isn't the storing of the flag that says which sensor it has, it's testing the combinations, training for the different scenarios, treating the incoming data differently, and so on.
gnicholas
How does it make sense to not even have sensors for parking? If you think they don't help during normal-speed driving, that's one thing. But they obviously help during parking, since (IIRC) they've had to disable certain autonomous features until they get their vision-based systems upgraded to be able to fill in this gap.
eachro
So the key question is how much of an improvement does radar/sensors/etc give you over just using computer vision?
throwntoday
If we're to trust what Elon and the team said during the last few AI Days, none. They stated that the ultrasonic and radar sensors were actually performing worse than their pure vision stack.
quonn
I'm ready to be convinced that this will be true at some point for the ultrasonic sensors. But by design the radar can see things that vision can never see. It seems like a bad idea to take that away.
throwntoday
Right, but I think the signal-to-noise ratio was eating too much compute for little payoff. And either the ultrasonic or the radar doesn't even work above 10 mph, I forget which. They are used purely for parking.
justapassenger
Real-life performance of the vision-only stack doesn’t agree with that.
kevin_thibedeau
Vision systems don't work at all in fog or heavy rain/snow.
nicbou
Any more or less than the human equivalent?

I'm not following the news, but I haven't seen any videos set in what Canada looks like 4 months per year.

Dunedan
Up to a certain degree they work, as humans can drive in fog or heavy rain/snow as well. If visibility is so bad that a human wouldn't be able to drive, I wouldn't want to sit in a self-driving car either, no matter if it does use vision only or has additional sensors.
m463
A better "answer" might be to make them an option and let the market decide.

For many (MANY) years airbags were fought by the auto industry even though people wanted them.

diskzero
As someone working in the field, I would never choose to eliminate the information provided by radar, lidar and any other sensor technology. Depending only on camera information would be too limiting.
bpanon
You haven't solved the problem though.
xnx
Sensor fusion seems to be another thing that Tesla is not good at.
EVa5I7bHFq9mnYK
From a first-principles point of view, it comes down to radar and ultrasonic having much longer wavelengths than optical. This results in a much lower amount of incoming information, worse resolution, and higher interference if many cars radiate the same signals on a busy street.
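
A rough way to see the resolution point is the diffraction limit, angular resolution ≈ wavelength / aperture. The apertures below are arbitrary illustrative numbers, not any specific sensor's specs:

    import math

    def ang_res_deg(wavelength_m, aperture_m):
        # Diffraction-limited angular resolution, in degrees (rough rule of thumb).
        return math.degrees(wavelength_m / aperture_m)

    print(ang_res_deg(3.9e-3, 0.08))   # ~77 GHz radar wave, 8 cm antenna -> ~2.8 deg
    print(ang_res_deg(550e-9, 0.005))  # visible light, 5 mm camera lens  -> ~0.006 deg
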
0xfffafaCrash
Seems like a very political answer from Andrej. Of course he’s not going to outright say “yeah, we’re prioritizing the profit margin over accuracy and safety considerations” if he wants to keep his job, but that seems to be the short of it. Others may choose to follow, at least in the short term, but it won’t be because of “entropy” making the system worse (you can always build a model without a data source and then refine the results based on the added value of that data source); doing so just because it would save lives doesn’t cut it when the goal is to cut corners and costs to maximize profit. I can believe that some types of sensors aren’t worth the trouble in terms of additional signal-to-noise ratio, but I can’t believe this is one of them.
Nomentatus
The entropy he's talking about comes from many sources, in particular opening yourself up to maintenance or supply-side errors, and just overloading team attention (always at a premium) for no net return. It's not just CPU cycles (although that's part of it; that hardware could be doing something else useful).
jakeogh
Giving human[1][2] drivers better situational awareness[3] is the future. Specifically open[4]:

a. Windshields that clean the inside as well as the outside.

b. Better eyeglasses[5].

c. User controllable hi-res HUD thermal IR overlay.

d. Headlights with adaptive notch filters so the oncoming vehicle can pick an empty spectral range... without the source being monochromatic (with required adaptive filters on the receiving end)... and/or really good coronagraphs.

e. Brake control[6].

Any entity capable of driving[7] in a population of humans (including adversarial humans) is sentient[8], and has real skin in the game. It would be unethical to lock one in a car:

[1] https://news.ycombinator.com/item?id=33213860 (analog FPGA)

[2] https://news.ycombinator.com/item?id=21106367 (general AI)

[3] https://news.ycombinator.com/item?id=16646112 (2018)

[4] https://www.tesla.com/blog/all-our-patent-are-belong-you (2014)

[5] https://patents.google.com/patent/US7744217 (2007)

[6] https://news.ycombinator.com/item?id=18013388 (2018)

[7] no human behind the wheel, no human to correct impending mistakes, but (critically) with one or more humans in the car.

[8] The idea that non-biological machines can have 'self' is a window into modern mass transformation. Please check out the analog FPGA experiments linked above.

danbmil99
Musk is also famously against using lidar. He doesn't understand/accept that an autonomous vehicle needs any sensors that humans do not possess.
sidcool
I feel that was more an operational answer than an engineering one... I still feel that depth perception from vision alone is unreliable.
dncornholio
Just remember folks, we will have full self driving vehicles by the end of this year!
ra7
If I were a Tesla fan/investor/FSD customer, I’d be very concerned that the former (effective) tech lead of FSD doesn’t know about sensor fusion, or that it’s a solved problem for the majority of the companies in this space.
lawrenceyan
I can see a path where with only cameras, Tesla might be able to reach level 4 autonomy in perfect conditions.

But the biggest thing that comes to mind is what happens at night. Are they only going to enable self-driving during the day?

speedgoose
Wouldn’t turning on the headlights fix the problem at night?

Snow and ice may be another challenge but night sounds easy.

lawrenceyan
When you drive at night with headlights, tell me honestly how confident you feel driving versus during the daytime.
ornel
Video summary:

https://www.summarize.tech/www.youtube.com/watch?v=_W1JBAfV4...

cainxinth
“Sensors aren’t an asset because you need to source them and install them and they can break, and it slows things down and adds cost and complexity.”

Uh, that’s true of every part on a vehicle.

superkuh
Humans don't use radar or ultrasound to drive. If we want cars that drive like humans drive, they should use the same senses. For example, in the northern parts of the USA there is snow cover for much of the year, and lanes are emergent from flocking without any absolute reference to the actual location of the lanes. The reason everyone chooses the same places to drive is that they see the same environment with the same senses. Even if autonomous driving with radar and ultrasound were made to work, if it picks the correct lane position and all the humans pick the wrong new lane position, then the car is wrong, not the humans.
mavili
Did anyone else catch Andrej's "sqlite" comment? If that is not just a simple analogy, Tesla may be using sqlite in their cars? :D
sgjohnson
Does this mean that now when someone smashes one of their bumpers on a Tesla, the insurance will no longer have to total the entire vehicle?
mongol
What is it that makes Lidar so expensive? Is it something intrinsic to the technology that prevents costs from coming down?
diskzero
A LIDAR sensor is a complex device with spinning motors, mirrors, lasers and more. Costs are coming down and less-expensive and more capable devices are coming to market. Once the price-points come way down, I wouldn't be surprised to see Tesla reconsider their decision to exclude them from their sensor platform.
frxx
There are also solid state LIDARs these days which involve no moving parts. Still use more power than radar though.
Nasrudith
They are also fairly power-hungry from what I heard.
fooblaster
Lidar is coming down significantly in cost over the next few years and is available from multiple tier 1 automotive suppliers. Technologies like high-power VCSEL arrays and highly integrated, photosensitive SPAD arrays/detector logic are making this possible. Prior lidar devices used non-automotive-grade discrete components like edge-emitting laser diodes, high-speed ADCs, and APDs. These were expensive and hard to integrate, and aren't present in the mass-market, inexpensive AM lidar coming to the market.
bekantan
He explains it quite well: all necessary information is already in the pixel-space, and adding more sensors slows the team down more than it improves system performance. My understanding is that the major blockers are not in the perception area anyway; it would be great if someone with relevant experience could comment on whether this is indeed the case.
6stringmerc
I have driven in extreme rain and flash flood conditions in north Texas, and I consider this a specific, natural challenge that would defeat his system.
pclmulqdq
Any amount of snow would do this too. It severely reduces the color space of road features.
jiggawatts
Tesla cameras are not RGB, they're WWWR (white-white-white-red). Essentially they have a Bayer array, but with only 1 red pixel out of four; the other three are black & white. I believe the W pixels aren't homogeneous either; there is some design aspect that enables them to cover a wider range of intensities so that they can handle both darkness and full sunlight.
diskzero
I am a principal engineer for a major autonomous vehicle company. You can break this statement down into two components:

Adding more sensors slows his team down more than it improves system performance

I'll take his word on this. It is a lot of work to incorporate multiple sensors.

All necessary information is already in the pixel-space.

I hate to disagree with someone as distinguished as Karpathy, but this is simply not what I have observed from all of that data that we have access to. Given my knowledge of the various stacks deployed today, I would never ever ever get into a vehicle using a vision only stack and expect it to perform in some of the challenging environments encountered during testing.

kfarr
Full-on agreement. There are literally videos of Teslas smashing into stationary vehicles on the highway at night using only vision cameras for FSD. No way any rational actor could claim the visible pixel space is sufficient in that scenario compared to LIDAR, radar, etc.
alsodumb
It's funny you use radar as an example of a 'good sensor' while it is well known that most (or maybe almost all?) of the stationary vehicle accidents you're talking about happened because of radar's inability to detect a stationary obstacle.

On the other hand, RGB data does have that information; we use it every day to avoid obstacles, even under foggy and rainy conditions (I'm no LIDAR expert but I know it sucks in rainy conditions).

I am not saying I support a vision only stack, but all I am saying is it is certainly possible to deploy a vision only stack in the future.

threeseed
> Therefore, for this simple ADAS algorithm using roof mounted LIDAR, heavy rain does not prove to be a particularly important factor in the system performance.

https://www.mdpi.com/2079-9292/8/1/89/htm

krapht
You mean some radars' inability to detect stationary obstacles. Clutter rejection has a lot of more sophisticated algorithms that can be applied with greater compute power thrown at the problem.
Nomentatus
These were flat truck/ambulance surfaces encountered at an angle - exactly the conditions the first stealth fighter, with its angular surfaces, used to evade some of the best radar in the world: from most angles, no radio waves are reflected back to the radar sensing device. It's hard to get the job done with nothing.
cma
Compare their occupancy map with what you get out of the latest LIDAR Waymo is using and it is scary (occupancy is harder as it fills in what is occluded, but Tesla's looks like Minecraft-style 1x1x1m resolution).
Dunedan
Out of curiosity: Could you please elaborate what such challenging environments can be?
jbverschoor
It’s good enough for people, so all the info is there.

Doesn’t mean it’s better or easier

alsodumb
I think one should distinguish between 'all necessary information is already in the pixel-space' vs 'we already know how to extract all the information needed from pixel-space'

The fact that (most) humans manage to drive around safely and successfully on current roads proves that the information needed exists in the pixel-space (not just the current image, but say current + history). We don't yet have stacks that can successfully map everything needed from this information, but I don't think Dr. Karpathy ever claimed that.

(I am not a principal engineer but a mere PhD student who argues daily with people about how RGB information is underappreciated and underutilized)

diskzero
I'll agree with you that there are still techniques to be discovered.

I also agree that most humans manage to drive in challenging conditions, but their margins for error become slimmer and slimmer. I personally want my autonomous robot vehicle to be way more efficient and safer than the best human operator, and also able to deal with conditions in which any sane human would pull to the side of the road.

bumby
>their margins for error become slimmer and slimmer.

Can you elaborate on this? I've always felt like the margins of error are getting wider because the automotive tech (particularly safety features) are so vastly improved. I doubt people would be able to text and drive as much, for example, if they were driving a 1950s era Willys jeep just because it requires so much more attention to keep on the road by comparison to modern vehicles.

alsodumb
Definitely agree with your second point! In theory, the reaction time and complete environment awareness should itself make an autonomous system way safer than human drivers.

In some ways, I am against the philosophy of using HD maps + LIDAR data for highly accurate localization, which most companies seem to be using these days. I believe this approach is inherently brittle and is an 'easy way out' of the hard localization problem. I think more resources should be put into developing more natural techniques with no HD-map dependency.

PS: It is my understanding that most of the major players were using HD maps, not sure if it is still true.

m463
bleh. auto accidents are the #1 preventable cause of death for kids:

https://en.wikipedia.org/wiki/Preventable_causes_of_death#Am...

9991
Stop calling them accidents.

https://capitolfax.com/2021/01/26/aaa-wants-us-to-stop-calli...

threeseed
> The fact that (most) humans manage to drive around safely and successfully in current roads proves that the information needed exists in the pixel-space

But that doesn't mean that it translates to a car.

We constantly move our 576MP resolution eyes in multiple orientations in order to visualise a scene and focus on the most important areas. Cars have fixed, low-quality cameras.

We then interpret this data using the most advanced pattern recognition system the world has ever seen that is trained for at least 20+ years to fully comprehend the behaviour of everything this planet has to offer. Cars don't have anything close to this.

robocat
> 576MP

Actually our eyes are more like 8MP: https://www.picturecorrect.com/what-is-the-resolution-of-the...

Perhaps higher synthetic resolution from moving our eyes about, or perhaps that is meaningless.

enragedcacti
It could be reframed as saying we have a peak acuity equivalent to a 576MP camera of the same FOV with a theoretical max of 20 samples per second (50 ms to move targets, realistically probably more like single digits). The 8MP comparison is only relevant if there are so many targets that need constant full resolution that you can't focus on all of them or the targets are so large that they are larger than the peak acuity FOV. In practice this is not the case because we can identify something once and keep tracking it in the periphery without issues and something that large will likely be extremely easy to identify.
robocat
That doesn't make sense: a camera doesn't get more pixels just because the camera is taking a video tracking something. Neither if it had zoom and a controlled gimbal.
enragedcacti
If you turn that tracked video into a panorama it would. Or if you took 10 zoomed photos and stitched them over top of an unzoomed photo. The point is that unless the task demands more focus areas than the eye can focus on in a given window then the visual acuity (for the parts of the scene that matter) is higher than an 8MP shot of the entire scene.
alsodumb
You kind of want to make it seem like a 576 MP resolution (where did you even get this number from while people still argue about a fair comparison between human eye and a camera?) or having to move your head/eyes to visualize your surroundings rather than actually having multiple fixed cameras covering the entire surroundings all the time is a good thing? If the resolution mattered that much, every car would have ultra-high resolution cameras on it.

Humans certainly have a stronger and more general prior to make sense of the information, and that's exactly why I left it as a possibility. Cars don't * yet * have anything close to it, just like they didn't have a way to accurately detect objects a few years ago and just like they didn't have a way to capture RGB information a few decades ago.

I am an optimistic guy, and I certainly believe in the power of learning at scale.

nielsbot
All I heard was "cost savings, cost savings, cost savings"
oxplot
Well, watch it again and again and again. He talks about the detrimental effect of lo-fi sensors in conjunction with vision, among other things.
solardev
Cuz Muskdaddy wanted mo money. There, mystery solved.
julienreszka
Geohot said something similar years ago already
smrtinsert
Hm I'd rather have someone from Twitter audit this decision
bigtex
Did Lex ask him why Teslas love to crash into emergency vehicles?
sidibe
No chance, Lex is in thrall to Karpathy and Elon. Sometimes when he is interviewing unaffiliated AI experts he has basically asked them to gush about them.
KVFinn
TLDR: Tesla thinks LIDAR hardware is more expensive than the performance improvement it provides.

I didn't like his line of logic about how vision is necessary and sufficient, because that's how humans drive. Okay sure, but if some combinations of non-human sensors could drive better and/or cheaper than a vision only driving system, surely he would not argue for sticking with vision only? Maybe adding non-vision sensors lets you save hardware and software resources on the vision part of the system.

justapassenger
TL;DW.

Tesla doesn’t know how to do change management.

Nomentatus
Given Andrej's explanation this verges on mere gainsaying. Could you expand on what you think they should be doing? Other firms would also encounter "entropy", they always do; what's your way of reducing that severely?
justapassenger
Andrej refused to answer the question about why, and instead gave a whole speech about how change management and supply chains are hard. What else is there to expand on?

And of course other companies encounter it. But this has very strong vibes of “no one knew health care can be so complicated”.

And what should they do? Change management. Yeah, it’s hard, and costs money, but it’s a hard industry to be in. What they should not do? Cripple products they already sold (like disabling radar in existing cars), because they need to milk their profit margins.

Andrej is a great researcher. But safety critical systems are very different from research projects.

Nomentatus
Andrej did answer: he said that the delta was tested and found wanting, then detailed a number of risks that came with keeping Lidar. Positive but small; risks larger. That exactly answers "don't you want all the help you can get" (paraphrase, from memory).

Securing supply chains is hard period. Tesla and SpaceX have gone to heavy vertical integration because ensuring quality seemed to insist upon that.

What specifically would the new management team do that would secure supply chains trivially, etc? What's not being done?

justapassenger
Change management has nothing to do with people management.

It’s about managing changes to your product. It’s an extremely important part of any safety-critical system. And something that most new companies ignore, as it’s costly.

Nomentatus
And maintenance etc. But if you don't gain significant benefit, it's reasonable not to take on the risks of managing changes to a product you can do without because it doesn't, itself, significantly reduce risk.
Veserv
They said “change management” as in the management of changes, not the replacement of the management team.
Nomentatus
I think they've actually said both at different times.
CharlesW
I thought it was telling that Andrej immediately "reframed" the question because Lex asked the "wrong question". This is a classic evasion technique one learns from experience and/or media training. Lex's comment immediately after was a clever and gentle dig at Andrej's response.

It seemed like all the "full cost" negatives Andrej mentioned were related to Tesla's ability to execute, and not what would actually produce better results. Tesla would have to be able to reliably procure parts, write reliable firmware, create designs and processes that won't increase unexpected assembly line stops, etc.

Regarding results, the best Andrej can do is, "In this case, we looked at using it and not using it, and the delta was not massive." In other words, the results are better, but not enough to make up for the fact that Tesla can't support additional sensors without incurring a prohibitive amount of additional risk to Tesla. Risk to passengers doesn't appear to be a consideration.

FreakLegion
> In other words, the results are better, but not enough to make up for the fact that Telsa can't support additional sensors without incurring a prohibitive amount of additional risk to Tesla. Risk to passengers doesn't appear to be a consideration.

You may be right about the actual decision process Tesla went through, but Karpathy is right in principle. One of the first things he says is "there can be problems with [the sensors]", and a lot of what he mentions increases the risk of run-time failure, not just cost.

It's easy to cast this as an optimization problem where you're trading off asymptotically improved sensing for linearly or superlinearly increased failure rates. There's certainly a point where the complexity of more sensors or certain types of sensors outweighs any marginal benefit they provide.
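
A toy version of that optimization, with entirely made-up curves, just to show how a finite sensor count can come out ahead once complexity risk grows with every added sensor while sensing benefit saturates:

    import math

    def net_value(n, benefit_scale=10.0, risk_per_sensor=1.2):
        benefit = benefit_scale * (1 - math.exp(-0.5 * n))  # diminishing returns
        risk = risk_per_sensor * n                          # roughly linear complexity cost
        return benefit - risk

    best = max(range(0, 13), key=net_value)
    print(best, net_value(best))  # peaks at a small, finite sensor count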

bumby
There are other ways of optimizing for reliability, though, like redundancy in parallel or higher spec’d sensors. But that still gets back to the same issue where they are going to be concerned about cost.
YZF
Taking his point to the extreme, why use 8 cameras? Just use 4? 1? One photodiode?

Cameras can also fail at run-time; there can be (and is) variability in how they're mounted, in the lenses, in the sensors. They can get blinded or not get enough light. Their cabling can fail; random components can fail.

Tesla has claimed that vision outperforms vision+radar but anecdotal reports don't seem to support that conclusion. IMHO these technologies are not directly replaceable, but are complementary. It's like you can't replace your ears with your eyes (yeah, you can read lips, if they're visible).

But sure, there is a sweet spot. Is Tesla really optimizing for best performance at any cost or are they optimizing making more money and selling that to us as an improvement? That's really the question and I don't think we got a frank answer there.

FreakLegion
> Is Tesla really optimizing for best performance at any cost or are they optimizing making more money and selling that to us as an improvement?

More likely they had a fixed budget and optimized with that constraint, if they made a rigorous decision at all. But this is guesswork.

I'm not speculating about how Tesla made the decision, just commenting on Karpathy's answer. His answer is correct even if it isn't true, i.e. even if it isn't what Tesla actually did.

There are plenty of well-known analogs, like the mythical man-month. We all know that throwing more x at a problem is routinely counterproductive, even without cost as a constraint.

YZF
It's like the joke about the mathematician in the hot air balloon... His answer is correct but it's not useful. It is correct that there is some optimal solution short of an infinite number of sensors/technologies and larger than no sensors. The argument that Tesla is converging on the optimal solution, vs. the more or less known reality that they couldn't get the components they needed to build enough cars, is weaselly. But hey, necessity is the mother of invention. Also, he can't actually share anything from his work at Tesla because presumably he's under NDA, but he's gotta say something.
FreakLegion
The original comment I replied to:

> It seemed like all the "full cost" negatives Andrej mentioned were related to Tesla's ability to execute, and not what would actually produce better results.

This is objectively wrong, and it's the only substantive part of the discussion. The rest is fantasizing about things nobody actually knows ("It's media training!") and imputing questionable motives to someone who hasn't done anything to deserve that ("He only cares about Tesla's bottom line!").

dmix
You could take any one single point in a complex multifaceted argument to the extreme and basically strawman it to death. But that’s not helpful.

I believe his point was to provide a new perspective on the problem, not to reduce the problem to a single reason. I highly doubt Tesla's choice to use vision only in the short term was motivated by a single data point.

Even if it was the most important point... in this one person (on a large team’s) mind... it doesn’t necessarily mean it was the most important in the sum of the complex process it took to get to the decision.

So I don’t really see the value in taking it to the logical maximum because it’s not only illogical that they would be evaluating this one idea in isolation but even on its own they would still be balancing the optimal performance they got from x vs the optimized value they got from y, then compare it to the teams ability to work with both x+y(+z) at the same time.

For ex: You’d probably need 8 cameras pointing different directions vs one highly capable rapidly spinning LiDAR to even compete with it, so why even ask? These problems a) always have context and b) can't be so easily simplified and broken down.

Although you might make a good point that Tesla used this same poor logical-maximum reasoning to determine why not get rid of ALL sensors besides vision.

quonn
> why use 8 cameras?

At least 6 are needed to get a 360 degree view around the car, which obviously is necessary. Think of the 8 cameras as a single better sensor. It's a question of having one very good sensor then or many to fuse.

BeefWellington
I would also add that Tesla's sensor systems, while perhaps higher quality, are not exactly new ideas. In one form or other laser/radar-based systems have been in cars going back to the 90s for early collision avoidance, automatic cruise control, etc.[1] Longer in other applications.

At least one study seems to suggest those sensors when deployed in automatic emergency braking systems do have a measurable impact on collisions.[2]

Let's say the failure rate on the sensors was 1 in 100 (I'd be shocked if that many were defective). That means 99 other Teslas are using multi-sensor systems and not driving with degraded capabilities. It's an asinine claim that doesn't pass basic logic tests. The only way they weren't a substantial improvement is if Tesla's measurements were conducted only in the absolute most ideal conditions for cameras and no other scenarios.

[1]: https://en.wikipedia.org/wiki/Adaptive_cruise_control#Histor...

[2]: https://www.forbes.com/advisor/car-insurance/vehicle-safety-...

latchkey
Not just risk to passengers; risk to anything in proximity to the vehicle while it is in motion.
threeseed
It was an ominous answer.

We really should be focusing on what is the best solution and trying to solve price issues through existing techniques e.g. economies of scale, competition, miniaturisation. Instead they are trying to build whatever solution they can that fits in a pre-defined cost window.

Except this isn't a new phone or sneakers we are trying to take to market it's something that will directly impact people's lives.

throwawaylinux
Everything in engineering has a cost tradeoff, and always has. And people's lives are improved by things they can afford. There is no "best solution" you can talk seriously about without talking about cost.

Why not have a thousand sensors if more is better?

mensetmanusman
It can’t be solved without a few tens of billions in infrastructure investment.
taneq
Whose money should “we” be spending on this grail quest?

This mindset is something I see a lot, that “best” means the technically optimal (or sometimes just personally most convenient) solution to the specific problem that they personally are working on. If they take a step back and look at the bigger picture, the technical merits are usually only a tiny part of the whole decision.

minhazm
There's an opportunity cost to trying to get to the best solution. What about all of the people that die in the meantime while we delay rolling out something because it's not perfect? Just doing some googling, the Waymo stack is estimated to cost somewhere in the range of $50-100k, not including the car. A better solution that no one can afford is no solution at all.

Ultimately the only requirement is that the system is safer than humans by some margin that makes people comfortable buying such a system. If that is even as little as 2x safer than humans, we still have a moral obligation to roll it out, even if we could be 5x safer with another $50k worth of sensors and processors on the car.

wonnage
Wonder if this is a strong argument for public transit too. While self-driving cars are developed, let's make every effort to figure out if we can get people to take the existing self-driving trains and subways…
AlexandrB
These kinds of moral arguments are silly because they hit a brick wall as soon as they encounter how society operates in practice. If Tesla has a moral obligation to roll out an FSD that's just a little safer than humans, then do they not also have an obligation to make it available to all their competitors? If not, does every individual have a moral obligation to buy only Tesla cars? Do governments now have an obligation to subsidize Tesla cars so anyone can afford them? Etc.

And all this only considers first-order effects. If a 2x-safer FSD feels more dangerous than normal driving and thus reduces FSD uptake for a decade, doesn't Tesla have a moral obligation not to release it, to preserve the perception of safety of self-driving technology?

Nomentatus
You can get to this conclusion if you're sure Andrej is lying, and that the risks cited are smoke; but only then. BTW, I've upgraded my sneakers after a couple falls on a rough beach with tangled driftwood (drift trees, really) proved their cheap too-slick surface had real world consequences. I was lucky not to break a bone. I'm going to bet he isn't lying, but I can understand someone making the opposite bet, market competition being market competition.
cbsmith
I don't think he's lying. I think he's looking at things from a particular perspective, and that perspective is being critiqued, as it should be.
Nomentatus
True, he could be wildly misled but he's been around doing this for a while, so that seems unlikely. He could be truly delusional in either case but it's still kinda necessary to knock down his arguments or logic; and that's the critique or analysis I'm not seeing. Just assertions again and again that the real reason is economics triumphing over safety. I'm beginning to think that the idea that there are genuine trade-offs in life is just ungraspable or offensive to many.
cbsmith
There's a lack of evidence either way, which really should tell you all you need to know. I don't think they're delusional, but they are constrained by their context.

Yes, with ML models, you often can be better off trimming down your sensor data. Usually, though, you don't remove entire categories of sensor data. Even when you do, to be confident in such a move, you need to first achieve a working model with whatever data you have, and then through refinement you can prove that whole categories of data are more hindrance than help. They haven't done that.
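
The kind of refinement being described is essentially a feature-ablation study: train the same model with and without one sensor's feature columns and compare held-out error. A minimal sketch on synthetic data (sklearn and the fake "vision"/"radar" features are purely illustrative):

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    rng = np.random.default_rng(0)
    vision = rng.normal(size=(2000, 8))                              # stand-in "vision" features
    radar = vision[:, :2] + rng.normal(scale=2.0, size=(2000, 2))    # noisy, partly redundant extra sensor
    y = vision @ rng.normal(size=8) + rng.normal(scale=0.1, size=2000)

    def heldout_mae(X):
        Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
        return mean_absolute_error(yte, Ridge().fit(Xtr, ytr).predict(Xte))

    print("vision + radar:", heldout_mae(np.hstack([vision, radar])))
    print("vision only:   ", heldout_mae(vision))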

It seems quite clear that the reality is a full sensor array is just economically non-viable with their business model, and that's framing their whole thought process.

Because they promised FSD back in 2017, they can't acknowledge that going without those sensors means it's going to take them much longer to achieve FSD. Because of safety/regulatory oversight, they also can't acknowledge that going without those sensors means there will be additional safety risk.

So they're stuck making these rationalizations that everyone in the industry knows are at best half-truths. No doubt, at some point we'll figure out how to do self driving with a much more limited sensor package, and when we do, we'll achieve a significant improvement in the cost effectiveness of self-driving.

In the meantime, there's a lot of "rationalization" going on.

3apo
I feel the other reason is that Tesla has not figured out a way to put radar into their ML pipeline. If you take the range-Doppler map from the radar as the 'pixel' map, that data is inherently very dependent on the scenario and the radar sensor's intrinsic parameters. This variability in what the radar sees in the RD space is what makes it a challenge for ML/AI pipelines. If Tesla were to 'fuse' information from these sensors at the object-track level, I believe they would be less susceptible to this variability.
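
For readers unfamiliar with the term, a range-Doppler map is essentially a 2-D FFT over samples within a chirp (range) and across chirps (Doppler). A toy sketch with a single synthetic target; the shapes and signal are made-up assumptions, not any particular radar's parameters:

    import numpy as np

    n_samples, n_chirps = 256, 64
    t = np.arange(n_samples)
    chirp = np.arange(n_chirps)[:, None]

    # Fake beat signal from one target: a fixed fast-time frequency (range bin)
    # plus a phase progression across chirps (Doppler bin).
    cube = np.exp(2j * np.pi * (0.12 * t + 0.05 * chirp))            # (chirps, samples)

    range_fft = np.fft.fft(cube, axis=1)                             # range per chirp
    rd_map = np.fft.fftshift(np.fft.fft(range_fft, axis=0), axes=0)  # Doppler across chirps
    print(np.unravel_index(np.abs(rd_map).argmax(), rd_map.shape))   # peak = (doppler, range) bin
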
touch_abs
It's interesting: that kind of object-level fusion is a fairly different problem to training visual perception, following some of the less in-fashion robotics techniques. I wonder if it's a case of the Tesla engineers focusing on the fad technologies (or just their strengths) more than it's a hardware cost thing.
dreamcompiler
Exactly. Radar gives you direct range data; camera pixels need to be processed by ML to infer range data, and the latter is never going to be as close to ground truth as the former, so the former should be prioritized.
Nomentatus
Not quite. Light waves are so short you'll get some return from almost any surface, because the surface is rough at the scale of such a small wavelength. This isn't true of radar, and it's not just what substance the outbound radar hits but how flat it is, too. You may get no return. Or almost none. Even smooth, round steel posts give very little return IIRC. There's also an echo problem with long waves such as sound and radar, particularly in urban areas, in which case what you think is a firm direct return may be a very indirect return that happens to be in sync with the signal you were expecting.
steve_adams_86
That’s one way to read it, but in my own experience the “do one thing really well” approach can yield far better results. Meaning, if vision is truly sufficient and you do it really well, rather than doing a bunch of sensors just “okay”, that may actually be safer overall. You might get far more focused and practical results from your efforts.

I’m not saying this is definitely true, and at the moment we probably can’t verify it either. I’m just “steel manning” his case, as Lex loves to say.

I think you’re probably correct that the business aspect was a significant factor, but perhaps it wasn’t everything.

hammock
Devil's advocate: if the cost of working to improve cameras to the point where they eliminate that delta is lower than the cost of using the sensors instead, then it is a net benefit
itsoktocry
>If the cost of working to improve cameras to the point where they eliminate that delta is lower than the cost of using the sensors instead

In a vacuum, how can cameras ever be better than cameras + other sensors?

kennyloginz
All forms of transportation could be made safer, if inside a vacuum and ignoring the economic reality (cough, Boeing).
hammock
When you factor in costs, which was Andrej's point, if you watched the video
tophi
Yes, the current delta was not massive and will shrink over time.

By getting rid of the extra sensors they eliminate a temporary crutch and focus resources on the simple solution.

Not a new concept by the way. Henry Ford was obsessed with simplifying and eliminating every part that wasn’t necessary on the model T for virtually all the same reasons.

7e
What crutch? What simplification? These sensors are widely deployed and have already been perfected. Systems which use only one modality are the crutch. Sensor fused systems will always be safer, and are the future.

This move is purely about screwing passenger safety for cost and sales.

GuB-42
The difference is that Ford started with something that worked. The Ford T is noteworthy because of the way it was made, not for its abilities as an automobile.

Tesla is starting with something that doesn't work. No one has been able to achieve full autonomy yet, not even Waymo on its own turf, despite Waymo being well ahead of Tesla. I trust Tesla will be able to close the gap and perform to its current standards without radar and ultrasound, and it would be fine if the current standards weren't terrible in the first place. What I mean is that Tesla is currently at the awkward spot where it is good enough for cruise control, but not good enough to safely take a nap in the driver's seat.

As for the "simple solution", you may know the saying "For every complex problem there is an answer that is clear, simple, and wrong". I think it applies here.

adolph
> Regarding results, the best Andrej can do is, "In this case, we looked at using it and not using it, and the delta was not massive." In other words, the results are better, but not enough to make up for the fact that Tesla can't support additional sensors without incurring a prohibitive amount of additional risk to Tesla. Risk to passengers doesn't appear to be a consideration.

I think this mischaracterizes Andrej's response. If anything he is referring to a holistic view of the vehicle, which includes but doesn't entirely consist of Tesla. For example, 5-10 years down the road, when sensors start going bad, consumers will appreciate fewer things to go wrong with a vehicle--that is one of the advantages of electric over ICE after all.

If anything this is an acknowledgement that George Hotz was right in focusing on optical sensors with Comma.ai.

pmarreck
Two thoughts:

1) He's not touching on the software cost of integrating different sensor data into the same trained machine learning model; it is likely far simpler to just stick to stereoscopic vision data (the same thing the human genome decided!)

2) That said, it seems at least theoretically advantageous to have a sensory system that exceeds that which humans are limited to; things like LIDAR can work in complete darkness and potentially spot, for example, pedestrians crossing a dark road without any reflective clothing on, where a vision-based system would fail (perhaps add infrared sensing?)

Anyway, doesn't AEB (automatic emergency braking) have to be installed in every car, by law, in the US, around now? And wouldn't that be less reliable if done via vision?

sfifs
> likely far simpler to just stick to stereoscopic vision data (the same thing the human genome decided!)

Yeah, and till we had reliable and powerful artificial lighting, it was highly unsafe to journey in low visibility/darkness. We used to finish journeys when darkness fell.

Animals that do require precise movement in low-visibility conditions (bats, dolphins) often evolved ultrasound solutions.

So should we license Tesla vehicles to only operate when visibility and weather forecast is good and not drive in the dark at all?

jquery
Excellent points, I didn't think about the fact that even evolution couldn't come up with a vision system that works as well in the dark as it works in the daytime.
pmarreck
Well actually, the human vision system at night, while not as good as that of cats and perhaps dogs, is still much better than any camera we've come up with thus far. I read something that claimed we can actually detect individual photons hitting our retina once we are adapted to the darkness.

https://www.science.org/content/article/human-eye-can-detect...

n0tth3dro1ds
>it is likely far simpler to just stick to stereoscopic vision data (the same thing the human genome decided!)

There’s a lot more to perception while driving than just stereoscopic vision.

First, your stereoscopic “cameras” (eyes) are mounted in free-rotating sockets, which are themselves mounted in a rotating and swiveling base (your head/neck). Your eyes can do rapid single-point autofocus better than any existing camera. They also have built-in glare mitigations: squinting, sunglasses, and sun visors. This system is way more advanced than fixed cameras. Yes, even an array of fixed cameras with a 360-degree field of view.

Then you have your sense of touch, your hearing, and your sense of equilibrium. You feel motion in the car. You feel vibrations in the pedals. You hear road noise, other cars, sirens, and the engine (not much in EVs). You smell weird smells and know when you’re driving with your e-brake on or when there’s a skunk nearby. There’s a lot getting fused with the vision to make it all happen, and I think you’d be surprised how “broken” your driving capabilities would be if you took one of these “background” senses out of the equation.

My anecdote: I drive a manual transmission car. A few months back, I woke up with no hearing in my right ear. Spooked, I drove to urgent care. I could not drive well at all; I was holding low gears for way too long. I learned that I use hearing almost exclusively to know when to shift. If you had asked me beforehand, I probably would have said that I’m visually monitoring the tachometer to know when to shift. Not the case. Also, I had a TERRIBLE sense of my surroundings. As I drive, I’m definitely building a model of the environment around me based on road noise, sound from other cars, sirens, and the like. Without hearing in just one ear, I felt very disconnected and unsafe. Living in California where lanesplitting is legal, I had several motorcycles catch me completely off guard. I had my hearing restored at urgent care and everything went back to normal immediately on the drive home.

I think Andrej and Tesla massively overestimate vision’s sole ability to solve the problem. Humans are fusing lots of sensation to drive well.

pmarreck
Good points. What did your temporary deafness end up being, if I may ask?
n0tth3dro1ds
Water and ear wax. I went swimming the day before.
kennyloginz
Glad to hear it was temporary!
quonn
I think the key point he’s trying to make is that the size of the fleet is more important than the quality of the sensor. The risk would be reduced by a better system, and he seems to be convinced that rolling out vision to more and cheaper cars would get you there.
L0stLink
There is a great argument for having ultrasonic sensors and radar in a recent video by Ryan from FortNine discussing two fatal accidents involving Tesla Autopilot: https://www.youtube.com/watch?v=yRdzIs4FJJg
ion_fury
lex's comment did not strike me as a dig. i am actually concerned by your comment because it makes me wonder if i am missing other things too? it just doesnt seem like a dig. it seems like he thought of something funny and wanted to share it. am i alone in this?

and also i dont understand your assertion that it was some kind of cynical maneuver to re-frame the question. he could have also said "yes, more sensors are always better but you can add an arbitrary number of sensors and so we had to decide where to draw the line. the cameras we use are capable of meeting our goal of full self driving that is significantly safer than a human driver. and this also streamlines the production and software which has a material impact on our ability to actually produce the cars which is of course necessary to meet the goal of making self driving cars. bloat could actually kill tesla."

this is logically the same thing that he said in the interview, so whats cynical about it? how is it underhanded?

also is there some intrinsic limitation of the dynamic range of cameras? people are talking about problems with dynamic range being intrinsic to cameras but im pretty sure that cameras and especially camera suites that do not have more problems with dynamic range than a human eye are possible to make and probably already on the market.

JumpCrisscross
> did not strike me as a dig

It wasn't a dig. It was calling out a bullshit move that, in my opinion, Andrej deployed out of panic more than strategically. (My evidence for this being Andrej eventually gave a good answer.)

ion_fury
i just dont get it. lex says "lets re-frame the question: can a language model drive a car?" this doesnt have any insinuations about andrej's intentions or motives. if it were calling out bullshit it would be "lets re-frame the question: why is elon musk never wrong?"

lex kisses elon musks ass, last time i checked hes on musks side in the lidar debate and also lex has a record of listening patiently no matter what and i have never known him to check people or "call out their bullshit." lastly, what andrej did wasnt a bullshit move. re-framing a question has never been known as something people do only when they are being deceptive and it is very common in intellectually honest answers/explanations in my experience.

im still shocked because usually i see what other people see. but people calling this a dig/bullshit came out of left field for me. i hope i dont miss stuff like that more than i realize...

edit: i read your comment again and i didnt read it right. he panics, gives himself some more room, lex acknowledges this but in a prickly way. it makes sense but im still upset because when i watch it all i see is lex making a joke. i guess i have severe autism.

jholman
See my sibling comment, but I agree with you, especially about Lex's track record, and I think the other commenters are projecting.

And you're right, reframing a question is like THE MOVE for academics, so it's completely not evidence that Andrej was making "a bullshit move" (although it's also a classic move for PR flaks, so it's possible he was making a bullshit move).

ion_fury
i looked at your account and, were you ever involved in real estate?
jholman
I've read the fine-print terms of a mortgage perhaps half a dozen times.
jholman
I don't think it was calling out a bullshit move (and definitely not a dig)

a) Saying "I wonder when language models will do this" is a total Lex thing to say. That's what he's into.

b) Lex is almost always a softball interviewer, though one with interestingly deep knowledge. He interviews experts, and he errs on the side of respect. If you're looking for hardball, don't listen to Lex, it's not what he does. He almost never calls out bullshit, and he especially doesn't call out evasion.

Now, it's still possible that it was a panic-deployed evasion on Andrej's part, and whether or not that was his mentality, I agree that he gained nothing by it, and did not in fact reframe the question at all.

tsimionescu
> also is there some intrinsic limitation of the dynamic range of cameras? people are talking about problems with dynamic range being intrinsic to cameras but im pretty sure that cameras and especially camera suites that do not have more problems with dynamic range than a human eye are possible to make and probably already on the market.

I think it's possible that professional movie cameras (with the appropriate lenses) may have higher dynamic range than human vision. Good luck getting those cheaper than a lidar.

ion_fury
just take some cameras, set each one at a different exposure via something like smoked glass and composite the images in real time. i dont know, it just seems pretty easy.
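
(The "composite different exposures" idea is essentially exposure fusion / HDR merging. A rough sketch, assuming a linear sensor response and already-aligned frames; real pipelines also need response-curve calibration, alignment, and deghosting for moving scenes.)

  import numpy as np

  def merge_exposures(short, long_, t_short, t_long):
      # short/long_: float images scaled to [0, 1]; t_*: exposure times (s).
      # Scale each frame to a common radiance estimate (value per unit time),
      # then weight pixels by how well exposed they are (low weight near
      # clipping or the noise floor).
      rad_short, rad_long = short / t_short, long_ / t_long
      w_short = 1.0 - np.abs(short - 0.5) * 2.0
      w_long = 1.0 - np.abs(long_ - 0.5) * 2.0
      w_sum = np.clip(w_short + w_long, 1e-6, None)
      return (w_short * rad_short + w_long * rad_long) / w_sum

  # A pixel blown out in the long exposure is still recovered from the short one.
  print(merge_exposures(np.array([[0.4]]), np.array([[1.0]]), 0.001, 0.008))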
tsimionescu
Well, that increases the expenses from the sensors several fold (at least 2x of course, but I would guess closer to 3x-4x, depending on how many you need to cover the whole range).
Nomentatus
Andrej did get around to answering the original question, he just wanted to say more, to put it into a bigger frame with more context. I had the same "weaseling" concern at first; but his answer was more or less "You lose more than you gain, but yes there was a small delta; in exchange for which any organization would take on not just an economic hit but a lot of additional opportunities for process and maintenance errors; plus distracting the team." So he'll agree that in an ideal world you'd want 'em, just not want 'em that much; but in the real world, more geegaws that aren't really pulling their weight are a terrible idea.

Although he didn't explicitly say so, neither his answer nor Elon's "take it out 'cause you can always put it back in if it turns out you really need it" philosophy absolutely rule out lidar coming back in the future if some remaining edge case just requires it. Clearly he thinks this is quite unlikely, however.

strangescript
You are making a lot of potentially faulty assumptions. 1) The "delta" was wide enough to save/harm people, you have no idea. 2) The extra information provided would always be valuable and/or not be overcome with better AI models using the visual sensors in the future. 3) The amount of technical overhead generated by the extra sensors was not prohibitive long term. When working with AI there are often times where it would seem logical that extra relevant data will always improve a model, but that turns out not to always be the case, or provide so little value that managing another dataset is just not worth it.
asah
Risk to PEDESTRIANS even less.
patrec
> I thought it was telling that Andrej immediately "reframed" the question because Lex asked the "wrong question". This is a classic evasion technique

I agree with this assessment. However:

> Tesla can't support additional sensors without incurring a prohibitive amount of additional risk to Tesla. Risk to passengers doesn't appear to be a consideration.

This is a stupidifying take. Of course when you work in a line of business producing gadgets that, as an unintended side-effect, kill a lot of people (napkin math suggests above 2 milli-kills per car in the US), you will need to pick a point at which you say further fatality reduction is no longer justified given the economic cost of achieving it. Even if you are a pure altruist (if you go out of business, less safe cars will replace yours). Conversely, even if you are the embodiment of capitalist evil, risks to passengers will absolutely affect your bottom line and if you are rational you will take them into consideration. Any meaningful criticism needs to be about the trade-offs they make, not that they make them or are loath to explicitly say so on camera.

ClumsyPilot
> risks to passengers will absolutely affect your bottom line and if are rational you will take them into consideration

Perceived, not real, risks to customers. PR matters more than reality.

patrec
It is kind of hard to control perceived risks to passengers without taking actual risks to them into account.
bumby
I don’t necessarily disagree that the tradeoffs are the important part. But it raises the question of who sets the thresholds for acceptable risk. With public safety, this is usually the govt regulatory body but I don’t think we’re there yet. Meaning your conclusion may be premature.
CharlesW
> …you will need to pick a point at which you say further fatality reduction is no longer justified given the economic cost of achieving it.

You're right — the sad truth is that corporations put costs on human lives every day. Where I think we disagree is that you believe they made the decision based primarily on costs. After watching this video, I believe they made the decision because they didn't think they could reliably implement and support a sensor fusion approach.

(BTW, I enjoyed "stupidifying"! I'm sorry I made people stupider.)

snotrockets
Tesla hasn't proven itself to be a capable major car manufacturer (they probably lead the minor category, at least in deliveries) in anything but one respect: their de-prioritizing of human life.
kotlin2
Is there hard data on how deadly they are vs. other auto manufacturers? There is definitely a narrative that the cars are dangerous, but I'd like to see that quantified.
xodjmk
The majority of comments surrounding stereo-camera/lidar questions have a ridiculously simplified idea of the problem. It's "obviously" the case that 'more sensor good, less sensor bad'. This is Frankenstein's-monster-level technical analysis. Why don't the majority of large-brained animals have many eyes, and many different antennae appendages processing an array of diverse sensory input? You don't just automatically gain by adding lots of sensors. The signals have to be fused together and reliably cooperate and come to agreement in real time for any decision.

Any sensor is only providing raw crude data. The majority of the work involved is done by processing this crude data and inferring a much more sophisticated approximation of the real environment from prior knowledge, hence using neural nets with pre-trained data. It is a good debate whether the approximation can be done better by adding more sensor input and diverting R&D and processing resources towards fusion, as opposed to improving the results that can be obtained from a stereo image sensor. It's not obvious to anyone. And nature seems to inform us that most large-brained animals evolve to rely heavily on two eyes instead of 16 eyes + lasers.

This is an interesting discussion, but the issue isn't 'Tesla could just bolt a Lidar box to the roof and magic, but they want to scam you out of a few extra bucks'. That is a moronic idea.
wonnage
The issue here is trying to infer distance based on complex image processing or just… measuring the damn distances.
xodjmk
No, that is an oversimplification. It's not simply distances; you have to detect objects, motion, shadows, corners and all sorts of phenomena.
JumpCrisscross
> Why don't the majority of large-brained animals have many eyes

Because of the cost of additional eyes. If Tesla is optimizing for cost against safety, that's sort of the point.

I don't believe that's totally the case. Andrej later makes a better argument regarding limited R&D bandwidth, noise and entropy. But the "I would almost reframe the question" evasion was disconcerting. It's a textbook media trained tactic for avoiding a question to which you have no good answer. That it was deployed here badly against a skilled interviewer such that it backfired is a valid observation.

xodjmk
Everyone has limited R&D and processing bandwidth. It's not just Andrej saying this, but anyone working on engineering autonomous vehicles. It has nothing to do with the cost of additional eyes. This is over-simplifying the problem. Our eyes don't work that way. The data coming from eyes and image sensors is very crude and relies on either your brain or very sophisticated post-sensor processing to construct a 3D approximation of the actual environment. The sensors themselves don't provide this information. They don't distinguish distinct objects, corners, shadows vs. changes in color, or all sorts of phenomena that don't actually exist in the sensor data. This has to be inferred later by a brain or processing that relies on prior assumptions and 'training' by previous experiences with 3D environments.

I don't really care what Tesla is doing vs. what other companies are doing, but these 'cost cutting' arguments don't matter. I would suspect that the R&D invested into the machine learning infrastructure, the custom IC, and the software engineering outweighs whatever amount they could save by removing a small sensor. And I don't believe that this guy Andrej is conspiring to squeeze a few bucks out of his customers at the expense of degrading his life's work. He is not trying to sell mops.
flashgordon
That was hilarious. Basically (unless this needs a reframing/realignment/repositioning/reorienting):

Q: "are less sensors less safe/effective?"

A: "well more sensors are costly to the organization and add more tech debt so safety is orthogonal and not worth answering".

judge2020
Munro’s cost breakdown is much more informative in just how much it’ll save in terms of parts/labor. https://youtu.be/LS3Vk0NPFDE

In general the ‘harm to consumers’ is really just making it more likely they damage the car in a parking lot or their garage, which tells you where their priorities are (sales, Automotive gross profit). Assuming occupancy network works, the only real blind spot left is if something in front of the car changes in between it turning off and on (assuming occupancy will 'remember' the map around it when it goes to sleep).

Also, Tesla’s strategy for safety is seemingly “excel in industry standard tests, ie. IIHS and EuroNCAP”, so this might be a case of the measure becoming a target.

stefan_
This thread is unhelpfully mixing radar and ultrasonic sensors. Ultrasonic sensors, as your video explains, are primarily used as a parking aid; they are tuned for too low a distance to be helpful in just about any kind of driving scenario at speed.

Meanwhile, radar is the principal sensor used in systems like automatic emergency braking across the industry. It has no intersection with any of the parking stuff because it generally has to ignore stationary objects to be useful (hence the whole "Teslas crashing full speed into stopped vehicles" thing).

abracadaniel
The kicker for me is that the area covered by the ultrasonic sensors is essentially all blind spots for the cameras. The sensors currently are able to tell you when something too low to see is getting within a few inches of the car. It also gives an exact distance when parking, so I can know that I'm parking exactly 2ft from the wall every time. As much as they claim otherwise, it simply cannot be a matter of fixing it in software. The cameras can't tell you what they can't see. They simply don't have the coverage to do this, and clearly don't even have the coverage to hit parity with radar enabled autopilot either.
RC_ITR
The first famous autopilot crash was because a white semi-truck was washed out by the sun and confused for an overhead sign.

That's literally trivial for a car with radar to detect.

Amazing how people talk about stuff they have no idea about when it comes to Tesla.

eightysixfour
Not a fan of Tesla removing the sensors but a vehicle on a highway that isn’t moving the same direction as the car is not “trivial” with radar. No AEBs that use radar look for completely stopped objects after a certain speed because the number of false positives is so high.
LightG
Which is another reason that "full" FSD is at least a decade away and likely from another supplier, if at all. Cannot believe people are funding this.
RC_ITR
So, yes, cars that are programmed to have AEB: perform well at AEB and not other tasks. We are in agreement here. (I even agree with you that those cars use Radar for AEB).

Now, where we disagree is you implying that cars with AEB-level radar (literally $10 off-the-shelf parts with whatever sensor fusion some Mobileye intern dreams up) are somehow the same as self-driving cars (the goal of Tesla Autopilot).

Every serious self-driving car/tractor-trailer out there uses radar as a component of its sensor stack because Lidar and simple imaging is not sufficient.

And that's the point I was trying to make - we agree it's trivial for radar to find things; they just need sensor fusion to confirm the finding and begin motion planning. This is why a real driverless car is hard, despite what Elon would like you to believe. There is no one sensor that will do it. Full stop.

And this cuts to the core of why Tesla is so dangerous. They are making a car with AEB and lane-keeping and moving the goal posts to make people (you included) think that's somehow a sane approach to driverless cars.

kalleboo
But if the radar just sees a static object and can't tell if it's an overhead sign or a car, and the camera vision is too washed out, how would sensor fusion help in your example?
pcl
A human driver slows down and moves their head around to get a better view when the glare from the sun is too strong to see well. I’d expect a self-driving car to similarly compromise on speed for the sake of safety, when presented with uncertainty.
cbsmith
Lidar would make it pretty obvious whether it's a sign or a car, even if the camera didn't tell you. The part where the lidar doesn't bounce back at vehicle level would be a dead give away.
robertlagrant
An approaching incline has entered the chat.
cbsmith
Approaching inclines aren't exactly hard to interpret from lidar either.
sangnoir
Perhaps stop cheaping out on the cameras and procure those with high dynamic range. Then again those may be "expensive and complicate the supply chain for a small delta"
kalleboo
But then, again, what is the point of the radar?
Reason077
> ”There is no one sensor that will do it. Full stop.”

Yet somehow, humans can drive cars with just a pair of optical sensors (mounted on a swivelling gimbal, of sorts).

In theory, a sufficiently capable AI should be able to drive a car at least as well as a human can using the same input: vision.

tsimionescu
The optical sensors are just a small part of the human (and animal in general) vision system. A much bigger component is our innate (evolutionarily acquired) understanding of basic mechanics, simple agent theory, and object recognition.

When we look at the road, we recognize stuff in the images we get as objects, and then most of the work is done by us applying basic logic in terms of those objects - that car is off the side of the road so it's stationary; that color change is due to a police light, not a change in the composition of objects; that small blob is a normal-size far-away car, not a small and near car; that thing on the road is a shadow, not a car, since I can tell that the overpass is casting it and it aligns with other shadows.

All of these things are not relying on optics for interpreting the received image (though effects such as parallax do play a role as well, it is actually quite minimal), they are interpreting the image at a slightly higher level of abstraction by applying some assumptions and heuristics that evolution has "found".

Without these assumptions, there simply isn't enough information in an image, even with the best possible camera, to interpret the needed details.

Reason077
> "A much bigger component is our innate (evolutionarily acquired) understanding of basic mechanics, simple agent theory, and object recognition. ... they are interpreting the image at a slightly higher level of abstraction by applying some assumptions and heuristics that evolution has "found"."

Of course, and all this is exactly what self-driving AIs are attempting to implement. Things like object recognition and understanding basic physics are already well-solved problems. Higher-level problem-solving and reasoning about / predicting behaviour of the objects you can see is harder, but (presumably) AI will get there some day.

tsimionescu
Putting all of these together amounts to building AGI. While I do believe that we will have that one day, I have a very hard time imagining it as the quickest path to self-driving.

Basically my contention is that vision-only is being touted as the more focused path to self-driving, when in fact vision-only clearly requires at least a big portion of an AGI. I think it's pretty clear that this is currently not a realistic path to self-driving, while other paths using more specialized sensors seem more likely to bear fruit in the near term.

CyanBird
> sufficiently capable AI

And Tesla lacks that, so they ought not simply rely on cameras and ought to use extra auxiliary systems to avoid danger to their consumers. They are not doing this because it reduces their profit margins. Alas, this HN thread.

cbsmith
> Yet somehow, humans can drive cars with just a pair of optical sensors (mounted on a swivelling gimbal, of sorts).

In fairness, humans have a lot more than just optical sensors at their disposal, and are pretty terrible drivers. We've added all kinds of safety features to cars and roads to try to compensate for their weaknesses, and it certainly helps, but they still make mistakes with alarming regularity, and they crash all the time.

When you have a human driver, conversations about safety and sensor information seem so straightforward. The idea of a car maker saving a buck by foregoing some tool or technology at the expense of safety is largely a non-starter.

What's weird is, with a computer driver, (which has unique advantages and disadvantages as compared to a human driver) the conversation is somehow entirely different.

tibbydudeza
My kid took 12 hour-long driving lessons before she could drive stick shift on a road, albeit very slowly.
camgunz
> We've added all kinds of safety features to cars and roads to try to compensate for their weaknesses

This is a super important point. Whenever self-driving cars comes up in conversation it's like, "we're spending billions of dollars on self-driving cars tech, but what if we just, idk, had rails instead of roads". We're putting all the complexity on the self-driving tech, but it seems pretty clear that if we helped a little on the other end (made driving easier for computers), everything would get better a lot faster.

codeflo
> In theory, a sufficiently capable AI should be able to drive a car at least as well as a human can using the same input: vision.

In theory, cars should use mechanical legs instead of wheels for transportation; that's how animals do it. In theory, plane wings should flap around; that's the way birds do it. My point being: the way biology solved something may not always be the best way to do it with technology.

dotancohen
GP was stating that "two cameras mounted 15cm apart on a swivel slightly left of the vehicle center of geometry" has proven to be a _sufficient_ solution, not necessarily the best solution.
Reason077
> ”In theory, cars should be use mechanical legs instead of wheels for transportation, that's how animals do it.”

Wheels and legs solve different problems. Wheels aren’t very useful without perfectly smooth surfaces to run them on. If roads were a natural phenomenon that had existed millions of years ago, then isn’t it plausible that some animals might have evolved wheels to move around faster and more efficiently?

2muchcoffeeman
>Yet somehow, humans can drive cars with just a pair of optical sensors (mounted on a swivelling gimbal, of sorts).

This is wrong and I was surprised to hear them say it was enough in the video.

We don't have car horns and sirens for your eyes. You will often hear something long before you see it. This is important for emergency vehicles. Once you hear it, a good driver will immediately slow down and pull to the side, or delay movement to give space for the vehicle.

Does this mean self driving vehicles can't detect emergency vehicles until they appear on camera? That's not encouraging.

gizmo
Deaf people (or those who blast music) can drive. People who are blind in one eye can drive.
mrguyorama
That might have something to do with the general intelligence prediction supercomputer sitting between the ears. If Tesla is saying they won't have real (not just an 80 percent solution that they then lie and say is complete) self driving until they develop an AGI, I agree
dsfyu404ed
>Once you hear it, a good driver will immediately slow down and pull to the side, or delay movement to give space for the vehicle.

Robotically performing an action in response to single/few stimuli with little consideration for the rest of the setting and whether other responses could yield more optimal results precludes one from ever being a "good" driver IMO.

"See lights, pull over" is not going to cut it. See any low effort "idiot drivers and emergency vehicles" type youtube compilation for examples of why these sorts of approaches fall short.

naijaboiler
Optical sensors, an innate understanding of how the world they are perceiving works. And most importantly, a social understanding of what other humans around them are likely to do.
mrguyorama
Our two eyeballs plus brain is SO MUCH MORE than just two mediocre CCDs.

Our eyes provide distance sensing through focusing, the difference in angle of your two eyes looking at a distant object, and other inputs, as well as having incredible range of sensitivity, including a special high contrast mode just for night driving. This incredibly, literally unmatched camera subsystem is then fed into the single best future prediction machine that has ever existed. This machine has a powerful understanding of what things are (classification) and how the world works (simulation) and even physics. This system works to predict and respond to future, currently unseen dangers, and also pick out fast moving objects.

Two off the shelf digital image sensors WILL NEVER REPLACE ALL OF THAT. There's literally not enough input. Binocular "vision" with shitty digital image sensors is not enough.

Humans are stupidly good at driving. Pretty much the only serious accidents nowadays are ones where people turn off some of their sensors (look away from the road at something else, or drugs and alcohol) or turn off their brain (distractions, drugs and alcohol, and sleeping at the wheel).

bonestamp2
Yes, a "pair" of optical sensors. Tesla is at a disadvantage compared to humans -- they do not do stereoscopic imaging, which makes distance of objects less reliable -- they try to infer distance from a single flat image. Humans having two sensors pointed in the same direction gives us a very reliable way of determining distance (up to a relevant distance for driving at least).
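
(A back-of-envelope sketch of what a stereo pair buys you, with illustrative numbers rather than any real camera spec: depth falls straight out of disparity, but the signal shrinks quickly with range.)

  def stereo_depth(focal_px, baseline_m, disparity_px):
      # Classic pinhole stereo relation: depth = focal length * baseline / disparity.
      return focal_px * baseline_m / disparity_px

  # Assumed numbers: ~1000 px focal length, 15 cm baseline (roughly eye spacing).
  for d in (30.0, 3.0, 1.0):
      print(f"disparity {d:4.1f} px -> depth {stereo_depth(1000, 0.15, d):6.1f} m")
  # 5 m, 50 m, 150 m: small disparities map to large depths, which is why
  # stereo depth error grows quickly with distance.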
lolc
Interestingly, even people with missing stereoscopic vision are allowed to drive. We don't require depth perception to drive. The assumption is that they can compensate.
mrguyorama
Binocular vision isn't even the only source of depth information available to humans. That's why someone missing an eye can still make reasonable depth estimations.
cbsmith
Yes, and a sufficiently smart compiler should make optimization unnecessary. ;-)
sangnoir
Isn't this a bit like saying we can do better than fixed-wing aircraft, because birds can flap their wings? With sufficiently advanced material science, flapping-wing human flight too, is possible. But that doesn't mean Boeing and Cessna are misguided.
michael1999
But that's not how people drive. They use their ears, they move their head around to generate parallax, they read the body-language of other drivers, they make eye-contact at intersections, they shift position to look around pillars, or stoop to see an inconveniently placed stop light. Fixed forward cameras do none of that.
freejazz
Yes, in theory, something sufficient will suffice. I think we are trying to bring some clarity to the topic, however.
fennecfoxy
I believe in the sentiment, but it's true that we humans also crash our cars a LOT. From minor bumps and scrapes to multiple fatalities.
tibbydudeza
Running neural networks tweaked by millions of years of evolution.
RC_ITR
You seem to be fine with the 30k Americans per year who die in car crashes.

That number is probably too high for robots to do though.

Humans are weird like that.

numpad0
> humans can drive cars

Needs emphasis on can

TulliusCicero
> Yet somehow, humans can drive cars with just a pair of optical sensors

A pair of optical sensors and a compute engine vastly superior to anything that we will have in the near future for self-driving cars.

Humans can do fine with driving on just a couple of cameras because we have an excellent mental model (at least when not distracted, tired, drunk, etc.). Cars won't have that solid of a mental model for a long, long time, so sensor superiority is a way to compensate for that.

kalleboo
This is a common failure mode for radar-based systems, since they can't tell the difference between a stopped car and say, an overhead road sign on a hill crest

https://www.wired.com/story/tesla-autopilot-why-crash-radar/

> Volvo's semiautonomous system, Pilot Assist, has the same shortcoming. Say the car in front of the Volvo changes lanes or turns off the road, leaving nothing between the Volvo and a stopped car. "Pilot Assist will ignore the stationary vehicle and instead accelerate to the stored speed," Volvo's manual reads, meaning the cruise speed the driver punched in. "The driver must then intervene and apply the brakes.” In other words, your Volvo won't brake to avoid hitting a stopped car that suddenly appears up ahead. It might even accelerate towards it.

stillworks

  That's literally trivial for a car with radar to detect.
In principle that is correct… but radars in automotive applications are unable (or rather, not used) to detect non-moving targets?

Asking this because I know first hand that the adaptive cruise function in my car must have a moving vehicle in front of it for the adaptive aspect to work. It will not detect a vehicle that is already stopped.

The resolution of the radar is pretty good though; even if the vehicle in front is merely creeping off the brakes… it does get detected if it is at or more than the “cruising distance” set up initially.

The AEB function on my car depends on the camera.

cbsmith
Adaptive cruise control is solving a totally different problem. It is specifically looking for moving objects to match pace with. That's very different from autonomous driving systems.

Radar is quite good at finding stationary metal objects, particularly. Putting it in a car, if anything, helps, because the stationary objects are more likely to be moving relative to the car...

ra7
The newer “4D imaging” radars can track stationary objects too. This is what the self driving companies have turned to in recent times.
tim-fan
My understanding is that your typical automotive radar will have insufficient angular resolution to reliably distinguish, say, an overpass from a semi blocking the road, or a pedestrian standing in the middle of the road from one on the footpath.

Radar does however have the advantage of measuring object speed directly via the doppler effect, so you can filter out all stationary objects reliably, then assume that all moving objects are on the road in front of you and need to be reacted/responded to.

So I think it's the case that radar can detect stationary objects easily, but cannot determine their position precisely enough to be useful, hence in practice stationary objects are ignored.
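
(A toy version of that filtering logic, with made-up thresholds, just to show why the stopped-car case falls out of the data rather than being a quirk of any one implementation.)

  import math

  def is_moving(range_rate, bearing_deg, v_ego, tol=0.5):
      # range_rate: measured radial velocity (m/s, negative = closing);
      # bearing_deg: detection angle off the car's heading; v_ego: own speed.
      # A stationary object should show roughly -v_ego * cos(bearing), so
      # anything matching that prediction is discarded as "static world".
      expected_static = -v_ego * math.cos(math.radians(bearing_deg))
      return abs(range_rate - expected_static) > tol

  v_ego = 25.0  # ~90 km/h
  print(is_moving(-25.0, 0.0, v_ego))  # False: overhead sign or bridge girder
  print(is_moving(-5.0, 0.0, v_ego))   # True: slower car ahead
  # The catch: a stopped car dead ahead also reads -25 m/s, so this filter
  # throws it away along with the bridges and signs.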

leetharris
What a smug post from someone who is completely wrong.

The car that crashed had radar, vision, USS, AND it was based on another company's technology.

Reason077
> ”The first famous autopilot crash was because a white semi-truck was washed out by the sun and confused for an overhead sign.

That's literally trivial for a car with radar to detect.”

That crash occurred on a car which was using radar. Automotive radar generally doesn’t help to detect stationary objects.

Further, that crash occurred on a vehicle with the original autopilot version (AP1), which was based on Mobileye technology with Tesla’s autopilot software layered on top. Detection capabilities would have been similar to any vehicle using Mobileye for AEB at the time.

touisteur
I find very strange the claim that a moving doppler (pulsed doppler?) radar 'generally doesn't help to detect stationary objects'. I mean if the car is moving, it generates a doppler shift on all objects moving at a different speed, right?

Maybe it's difficult for reasons of false alarm detection (too many stationary objects that are not of interest) but you can get very good results with tracking (curious about these radars' refresh rate), STAP, and classification/identification algorithms, especially if you have a somewhat modern beamformed signal (so, some kind of instant spatial information). Active-tracking can also be of help here if you can beamsteer (put more energy, more waveform diversity on the target, increase the refresh rate). Can't these radars do any of this 'state of the art 20 years ago' stuff?

There's something I don't get here and I feel I need some education...

michaelt
I've heard people on the internet claim that, in automotive radar the first thing they do when processing the signal is discard any stationary objects. Apparently this is because the vast majority of the time it's a sign or overhead gantry or guard rail - any of which could plausibly be very close to the lane of travel thousands of times per journey - and radar doesn't provide enough angular resolution to tell the difference.

Personally I've never seen these claims come from the mouth of an automotive radar expert, and many cars do use radar in their adaptive cruise control, so I present it as a rumour, not a fact :)

mrguyorama
Indeed, my VW which uses a forward looking radar has signaled several times for stationary objects. In fact, the one time it literally stopped an accident was for a highway that suddenly turned into a parking lot. People keep repeating BS said by tesla and tesla apologists for why their cars run into stopped things and others seem to have less of a problem with it.
Someone
> I find very strange the claim that a moving doppler (pulsed doppler?) radar 'generally doesn't help to detect stationary objects'. I mean if the car is moving, it generates a doppler shift on all objects moving at a different speed, right?

I’m in the same boat as to not understanding why, but from what I have read the problem indeed isn’t that it doesn’t detect them, it’s that there are too many of them, and nobody has figured out how to filter out the 99+% of signals you have to ignore from the ones that may pose a risk, if it’s doable at all.

I think that at least part of the reason is that the spatial resolution of radar isn’t great, making it hard to discriminate between stationary objects in your path and those close to it (parked cars, traffic signs, etc). Also, some small objects in your path that should be ignored, such as soda cans with just the ‘right’ orientation, can have large radar reflections.

RF_Savage
Especially when most car radars are FMCW radars. They not only know the speed, they also know the distance.

Some of the newest car radars can do some beamforming, but not all.

Most models have multiple radars pointing in multiple directions as that's cheaper than AESA.

Only just recently have "affordable" beamformers come to the market. And those target 5G base stations.

So the spec in most K/Ka-band models starts at 24.250 GHz, where the 5G band starts, while the licence-free 24 GHz band that the radars use is 24.000-24.250 GHz.

If this was not bad enough, there has been a consistent push from regulators to get the car radars onto the less congested 77 GHz band. And there are even fewer affordable beamformers for that band.

touisteur
Might be time for some state sponsorship to have the beamforming ASICs and FPGA designs for these bands. Although I might be missing something: once you're back down in your demodulated sampling frequency, your old beamformer should suffice? Or are we talking 'ADC + demodulator + filter + beamforming' ASIC?
snovv_crash
Source: have worked with some of the (admittedly last-gen) automotive RADAR chips, NXP in particular.

The issue is the number of false positives; stationary objects need to be filtered out. Something like a drainage grill on the street generates extremely strong returns. RADAR isn't high enough resolution to differentiate the size of something; you only have ~10 degree resolution, and after that you need to go by strength of the returned signal. So there's no way to differentiate a bridge girder or a railing or a handful of loose change on the road from a stationary vehicle. On the other hand, if you have a moving object, RADAR is really good at identifying it and doing adaptive cruise control etc.

Edit: it looks like some of the latest Bosch systems have much better performance in terms of resolution and separability: https://www.bosch-mobility-solutions.com/media/global/produc...
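
(Rough arithmetic behind the ~10 degree figure above, with round numbers: at highway sight distances a single radar resolution cell is wider than the road, which is why size and position alone can't separate a girder from a stopped car.)

  import math

  def cross_range_m(range_m, beamwidth_deg):
      # Width of one angular resolution cell at a given range.
      return 2 * range_m * math.tan(math.radians(beamwidth_deg) / 2)

  for r in (30, 60, 100):
      print(f"{r:3d} m: one 10-degree cell spans ~{cross_range_m(r, 10):.1f} m across")
  # ~5 m, ~10 m, ~17 m: a guard rail, an overhead gantry and a stopped car
  # in your lane can all land in the same cell.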

touisteur
Hi, thanks. Thought it might be so. Still...

RADAR can have high(er) angular resolution with (e.g.) phased arrays (linear or not) and digital beamforming. I guess it's the way the industry works and it wants small cheap composable parts, but using the full width of the car for a sensor array you could get amazing angular accuracy, even with cheap simple antennas. MIMO is also supposed to give somewhat better angular accuracy, since you can perform actual monopulse angular measurement (as if you had several independent antennas). There's even recent work on instant angular speed measurement through interferometry if you have the original signals from your array.

And with the wavelengths used in car RADARs you could get far down on range resolution, especially with the recent progress on ADCs and antenna tech.

I'm not saying you're wrong, you're describing what's available today (thanks for that).

Wondering when all this (not so new) tech might trickle down to the automotive industry... And whether there's interest (looking at big fancy manufacturers forgoing radar isn't encouraging there).
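
(A back-of-envelope for the car-width-aperture idea above, with illustrative numbers only: angular resolution scales roughly as wavelength over aperture, which is why spreading antennas across the whole bumper is attractive on paper.)

  import math

  def beamwidth_deg(freq_ghz, aperture_m):
      # Diffraction-limited angular resolution ~ wavelength / aperture.
      wavelength_m = 0.3 / freq_ghz  # c / f, with c ~ 3e8 m/s
      return math.degrees(wavelength_m / aperture_m)

  print(round(beamwidth_deg(77, 0.1), 2))   # ~2.23 deg: a small patch-array module
  print(round(beamwidth_deg(77, 1.5), 2))   # ~0.15 deg: antennas spread across the car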

MichaelZuo
There's also a legitimate harm to consumers with such a large radar array in the front bumper. Because even a minor fender bender could total a $50k car.

So the car would be very difficult to sell since few people are willing to pay much higher insurance premiums just for that.

touisteur
Ah thanks, didn't take that into account.
snovv_crash
In theory a big phased array of cheap antennas is cheap; in practice not, because you need to have equal impedance routing to all of the antennas, which means you need them all to be roughly equidistant to the amplifier. You could probably get away with blowing it up to the size of a large dinner plate, but then you also need a super stiff substrate to avoid flexing, and you need to convince manufacturers that they should make space for this in their design language without any metallic paint or chromed elements in front.

Which car brand do you think would take up these restrictions, and which customer is then going to buy the car with the big ugly patch on the front?

touisteur
Thanks for the thoughtful reply.

Modern phased arrays can have independent transmitters (synchronized digitally or with digital signal distribution) or you can have one 'cheap and stupid' transmitter and many receivers, doing rx beamforming, and as for complexity you mostly 'just' need to synchronize them (precisely). The receivers can then be made on the very cheap and you need some signal distribution for a central signal processor.

Non-linear or sparse arrays are also now doable (if a bit tricky to calibrate) and remove the need for a complete array or a rigid substrate or structure.

If you imagine the car as a multistatic many-small-antennas system there's lots that could be done. Exploding the RADAR 'box' into its parts might make it all far more interesting.

I'll admit I'm way over my head on the industrial aspects, so thanks for the reality check. Just enthusiastic, the underlying radar tech has really matured but it's not easy to use if you still think of the radar as one box.

snovv_crash
I know even for the small patch antennas we were looking at, the design of the waveguides was insanely complicated. I can't imagine blowing it up to something larger with many more elements.

If you wanted separated components to group together many antennas I suspect the difficulty would be accurate clock synchronization what with automotive standards for wiring. I'm still not sure I understand how they can get away without having rigid structures for the antennas, but this would be a critical requirement because automotive frames flex during normal operation.

Cars are also quite noisy RF environments due to spark plugs.

I guess what you're speaking of will be the next 10-20 years of progress for RADAR systems as the engineering problems get chipped away at one at a time.

touisteur
Ah I'm probably oversimplifying and working in an industry with a far higher price per sold unit, so have a very distorted view of 'easy' or 'recent'.

Thanks for humouring me. RADAR is a very fun and interesting topic.

Retric
The video has a more reasonable answer.

The sensors are unreliable and expensive in terms of R&D. Having marginal parts which take money from a finite R&D budget can easily result in a worse product. “They contribute noise and entropy into everything.” … “you’re investing fully into that [vision] and you can make that extremely good. You only have a finite amount of spend of focus across different facets of the system.”

His standpoint can be summed up as “I think some of the other companies are going to drop it.” Which would be really interesting if true.

lambdasquirrel
Didn't they used to talk about how the Tesla radar could actually see the reflections of the car ahead of the one just in front of you? i.e. the radar reflection bouncing underneath the car just in front of you?

This is what doesn't add up to me. Either a lot of that previous wonder-talk was actually a lie, or there's something else going on here.

hello639
> Either a lot of that previous wonder-talk was actually a lie, or there's something else going on here.

Tesla dropped radar and ultrasonic due to supply shortages. Nothing to do with their AI being smart.

There are many first-hand reports on Tesla fanboy forums about how no-radar and no-ultrasonic autopilot is far worse than with the sensors.

a1369209993
> Having marginal parts which takes money from a finite R&D budget can easily result in a worse product.

"Less sensors can be more safe/effective if that allows us to focus on making effective use of the sensor information we do have, which is the result we're aiming for with this descision." would be a reasonable answer (if true), but that doesn't seem like a fair interpretation of what he actually said.

dmix
Isn’t that his point though? Where exactly is he not saying that?

I listened to his answer three times and I’m not able to come up with a different interpretation than that.

JumpCrisscross
> Isn’t that his point though? Where exactly is he not saying that?

Andrej eventually gets to it. But his first response was to evade. Lex is a skilled interviewer. By not letting him wriggle out of a difficult question we eventually got a substantive answer. But Andrej's first instinct was to evade. That's notable.

dmix
I don't agree Lex is a skilled interviewer, he's great at creating interesting conversations in the aw-shucks way Joe Rogan is, but he mostly plays a fanboy role. I still love a Lex interview.

Otherwise I agree.

JumpCrisscross
> don't agree Lex is a skilled interviewer

Fair enough. Seemingly practiced may be a better description. (I'm not super familiar with his work.)

1123581321
That was exactly his point. I listened earlier today. (It being his point doesn’t mean he is necessarily right.)
Retric
It’s fairly rambling but he touches on that exact point several times most specifically here at the 2 minute mark:

“Organizationally it can be very distracting. If all you want to get to work is vision resources are on it and you’re actually making forward progress. That is the sensor with the most bandwidth the most constraints and you’re investing fully into that and you can make that extremely good. You only have a finite amount of spend of focus across different facets of the system.”

Which was from this section: Q: “Is it more bloat in the data engine?”

“100%” (Q:“is it a distraction?”) “These sensors can change over time.” “Suddenly you need to worry about it. And they will have different distributions. They contribute noise and entropy into everything and they bloat stuff”.

Even earlier he says:

“These sensors aren’t free…” list of reasons including “you have to fuse them into the system in some way. So that like bloats the organization” “The cost is high and you’re not particularly seeing it if you’re just a computer vision engineer and I am just trying to improve my network.”

seanmcdirmid
It is a hard question to answer. It’s like asking if more programmers on a project will allow it to be completed faster with higher quality. Ya, theoretically they could, in practice not likely. More sensors are like more programmers, theoretically they can be safer and more effective, but in practice they won’t be. Sensor fusion is as hard a problem as scaling up a software team.
ClumsyPilot
> More sensors are like more programmers

More programmers are like having more testicles: theoretically they should enable you to have more kids, but in practice the bottlenecks are elsewhere.

Both your and my reasoning by comparison are equally valid

seanmcdirmid
That isn't it though. It isn't like pumping a baby out in 1 month using 9 women. No, the problem is the fusion of too much information that varies substantially. They have completely different views of the world and you can't just lerp them together.

I bring up the programmers working on a project example just to illustrate how more isn't always better even if it theoretically can be.

atoav
I mean yeah, but it is a friggin ton heavy object moving at high speed controlled by a computer. Having another kind of sensor system to cross-check might be the reasonable thing to have, even if you happen to make it work well in 99% of the cases just with optics — the issue is that the other 1% kill people.

Your optical system can be good as heck till a bug hits it directly on the lens, covering an important frontal area and making it behave weirdly.

underwater
Not more sensors, different sensors.

In your metaphor it's like asking if you should have project managers as well as engineers on your project. And Tesla has decided that having only engineers allows them to focus on having the best engineers. And they avoid the distraction of having to manage different types of employees.

seanmcdirmid
Different sensors are even worse for sensor fusion. Actually, sensor fusion only applies to different sensors: incorporating different signals with different strengths and weaknesses into a model that is actually better, and not worse, is difficult.
oxfordmale
It is not a hard question to answer at all. LIDAR will make a self driving car safer, period. There is a lot of research to back this up.
seanmcdirmid
LIDAR can be safer than an optical system, I can believe that. LIDAR and an optical system being safer than either alone without a lot of extra complexity: maybe not.
aeternum
Lack of focus is a major problem for companies and we all know that tech debt leads to increased bug counts.

Team focus on vision, which is by far the highest-accuracy and highest-bandwidth sensor, allows for a faster rate of safety innovation given a constant team size.

syntaxing
Uhh, highest accuracy and bandwidth for what? You can have a camera that can see a piece of steak at 100K resolution at 1000 FPS, but that doesn't mean you can use a camera to replace a thermometer. Blows my mind how people eat up that cameras can replace every sensor in existence without even entertaining basic physics. ML is not omnipotent.
et2o
For the specific task of (for example) cooking a steak, it's not hard to envision a computer vision algorithm coupled with a model with some basic knowledge of the system (ambient temperature, oven/stove temperature, time cooking, etc.) doing an excellent job.
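
For illustration only, a minimal sketch of the kind of crude model being gestured at here: a lumped "Newton's law of heating" estimate of core temperature from pan temperature, starting temperature, and time. Every constant below is made up, and a real steak has strong internal gradients, so this is a sketch of the idea rather than a working doneness estimator.

    import math

    def core_temp_c(t_min, pan_c=200.0, start_c=5.0, k_per_min=0.08):
        # Lumped-capacitance model: the core relaxes exponentially toward the pan temperature.
        return pan_c + (start_c - pan_c) * math.exp(-k_per_min * t_min)

    for t in (2, 4, 6, 8):
        print(f"{t} min -> ~{core_temp_c(t):.0f} C")
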
katbyte
You cannot use vision to see the state of the side of a steak touching the pan, nor the internal temperature.
krapht
No, I can't envision this. Surface texture alone will not tell you if meat is cooked. There is no getting around the temperature probe.

Now, simple color matching models are used in some fancy toasters on white bread to determine brownness. That's the most I've ever seen in appliances...

jeffbee
Tesla engineers are currently doing post-commit review of Twitter source code. Focus is the last thing I would credit them with.
mensetmanusman
All of them?
jfabre
do you have data to back this claim?
mynameisvlad
Have you been keeping up with the Twitter deal? This was covered Friday.

https://www.bloomberg.com/news/articles/2022-10-27/tesla-eng...

And the corresponding HN thread: https://news.ycombinator.com/item?id=33365065

jfabre
I have not been following the whole thing super closely, no. Thank you for the links.
eigenrick
I don't think it was your intent, but your statement makes it seem like all Tesla engineers are looking at Twitter code. I bet this number is closer to 4.

Tesla has ca. 1000 software engineers working in various capacities. The ca. 300 that work on car firmware and autonomous driving are probably not participating in the Twitter drama.

ClumsyPilot
4 people is enough to review source that was written by thousands of engineers at Twitter? That's not even enough for an architecture review.
eigenrick
I don't think the goal is to review all Twitter source. That should be the job of the (new?) development team. I think the goal was to look at the last 6 months of code, especially the last few weeks, for anything devious.
freejazz
> "Team focus on vision which is by far the highest accuracy and bandwidth sensor allows for a faster rate of safety innovation given a constant team size."

By hiding the ball that you are starting from a much more unsafe position

ClumsyPilot
> vision which is by far the highest accuracy and bandwidth

They are literally the least accurate of all sensors.

Radar tells you distance and velocity of each object. Lidar tells you size and distance of each object. Ultrasonic tells you distance. Cameras? They tell you nothing!

Everything has to be inferred. Have you tried image recognition algorithms? I can recognise a dog from 6 pixels; image recognition needs hundreds, and has colossal failures.

We have no grip on the results AI will produce and no grasp on its spectacular failures.

Driving will have to be solved without AI
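
To make the "everything has to be inferred" point concrete: a stereo camera measures pixel disparity, and depth has to be computed from it, so a small disparity error grows into a large range error with distance, whereas a radar or ultrasonic return is a direct range measurement. A minimal sketch, assuming a simple pinhole stereo model with made-up focal length and baseline:

    def stereo_depth(disparity_px, focal_px=1000.0, baseline_m=0.12):
        # Depth is inferred from disparity: Z = f * B / d
        return focal_px * baseline_m / disparity_px

    def depth_error(depth_m, disparity_err_px=0.5, focal_px=1000.0, baseline_m=0.12):
        # First-order error propagation: dZ ~= Z^2 / (f * B) * dd
        return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px

    for z in (5, 20, 50):
        print(f"at {z} m, a 0.5 px disparity error means ~{depth_error(z):.1f} m of depth error")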

YZF
Tesla's cameras often get blocked by rain, blinded by the sun, or just don't see that well in the dark. It's really hard to imagine those cameras replacing the ultrasonic sensors, which do a pretty good job at telling you where you are when you're parking etc. I can't see how the camera is going to detect an object in pitch dark and estimate the distance to it better than an ultrasonic sensor. But hey, if people ding their cars it's more revenue.

The bottom line seems to be that part shortages would have slowed production, plus cost cutting. The rest of the story seems like a fable to me. It was pretty clear Tesla removed the radar because it couldn't get enough radars.

The interview didn't really impress me. I'm sure Andrej is bound by NDA and not wanting to sour his relationship with Tesla/Elon but a lot of the answers were weak. (On Tesla and some of the other topics, like AGI).

sumedh
> I can't see how the camera is going to detect an object at pitch dark

Lights?

Eric_WVGG
I just assumed they used something similar to iPhone FaceID and Xbox Kinect dot emitters. https://www.theverge.com/circuitbreaker/2017/9/17/16315510/i...
YZF
A car typically doesn't have lights shining in all directions. My Tesla doesn't at any rate. At night, backing into my driveway, I can barely see anything on the back-up camera unless the brake lights come on. If it's raining heavily it's much worse. But the ultrasonic sensors are really good at detecting obstacles pretty much all around.
throwaway09223
Reverse lights are literally mandated by law. Your Tesla has them, and if they're not bright enough that's a fairly cheap and easy problem to fix relative to the alternatives.

Ultrasonics struggle in the rain, btw.

YZF
The sensors also detect obstacles on the side of the car where there's no lighting. Every problem has some sort of solution, but removing the ultrasonic sensors on the Tesla is going to result in poorer obstacle detection performance. Sure, if they add 360 lighting and more cameras they can make up for that.

EDIT: Also I'm not quite positive why the image is so dark when I reverse at night. But it still is. The slope and surface of the driveway might have something to do with that... Still I wouldn't trust that camera. The ultrasonic sensors otoh seem to do a pretty good job. That's just my experience.

EDIT2: I love the Tesla btw. The ultrasonic sensors seem to work pretty reliably, they're pretty much their own system, the argument about complexity doesn't really seem to hold water and on the face of it the cameras won't easily replace them...

somat
Just curious, do teslas have a reverse light?
kwhitefoot
Yes. Only one on my 2015 Model S 70D. Not sure how many on other Teslas.
cbsmith
Interesting. I find the rear camera in my Tesla is outright amazing in the dark. I can see objects so much more clearly with it than with the rear view mirror. It feels like I'm cheating... almost driving in the day.
xodjmk
You are greatly overestimating the functionality of the sensors, and underestimating the importance of the rest of the system. Sensors are important, but the majority of the work, effort and expense is involved with post-sensor processing. You can't just bolt a 'Lidar' on to the car and improve quality of results. Andrej and other engineers working on these problems are telling everyone the same story. The perfect solution is not obvious to anyone, and they have chosen one path. Engineers aren't trying to scam people out of a few dollars so they can weasel out of making high quality technology. This has Nothing to do with cost-cutting.
freejazz
"The perfect solution is not obvious to anyone, and they have chosen one path. Engineers aren't trying to scam people out of a few dollars so they can weasel out of making high quality technology. This has Nothing to do with cost-cutting."

It has everything to do with cost cutting?

xodjmk
Lidar vs. Stereo camera vs. multiple cameras vs. ultrasound is a separate problem that engineers are trying to solve, not how can we sell cheaper mops. The decision not to use Lidar, as he says, reflects the common debate being explored by people working on autonomous driving: whether it makes more sense to focus on stereo image sensors with highly integrated machine learning, or to use Lidar or other sensors and include data fusion processing. Both methods have trade-offs.
freejazz
"Lidar vs. Stereo camera vs. multiple cameras vs. ultrasound is a separate problem that engineers are trying to solve, not how can we sell cheaper fucking mops."

Okay? Tesla is a car company and they are absolutely trying to sell a cheaper car. That's obvious to anyone that's been in one.

"Both methods have trade-offs."

Right, isn't that why most other systems use both?

xodjmk
Both methods have trade-offs, as in there are positive and negative merits to both approaches. Using both systems requires the sensor data to be fused together to make real-time decisions. This is the whole point that people are trivializing, and why it is easy to believe that they are just trying to scam people by going cheap on using multiple sensors. If you want to argue that it is better to use Lidar, then explain why, apart from 'others do it'.

The podcast, and previous explanations by this guy and others who agree with him (which occurred way before some shortage issues), is about what is the best way to solve autonomous driving. You don't solve it by simply adding more sensors. There are multiple hours of technical information about why this guy Andrej thinks this way is best. Others make arguments for why multiple sensors and fusion make more sense. No one knows the correct answer; it will be played out in the future.

Maybe what some people care about is cheaper cars. That is not what the podcast was about, and that is not how the Lidar + stereo camera vs. stereo-camera-only decision was made. And in terms of the advancement of human civilization, it is not interesting to me whether Tesla has good or bad quarterly results, compared to what is the best way to solve the engineering problems, the advancement of AI, etc. I don't really care very much, but it is slightly offensive when many people just dismiss engineers who are putting in tons of effort to legitimately solve complicated problems as if they are just scam artists trying to lie to make quick money. That is also a stupid argument. No company is going to invest billions(?) of dollars and tons of engineering hours into an idea they secretly know is inferior and will eventually lose out, just so they can have a good quarter. That is not a serious argument.
freejazz
I'm not sure why you'd assume all of that. You keep saying engineers, but it's a business decision. Seems like you are getting caught up in marketing.
xodjmk
I am an engineer working on autonomous vehicles. Nothing personal, just responding to the thread as a whole. I don't believe this guy is conspiring to trick anyone. Business decisions, of course. I think they are in good faith gambling on this one approach. So I am interested to see if their idea will win, or if someone else figures out a better way.
freejazz
Okay so they are "good faith" gambling? I don't want to drive in a car that has any gambling... I don't get how it being in good faith (generous on your part) makes it less of a gamble?
YZF
The problem is not that he was wrong; the problem is that he's made a motherhood statement in response to a very specific question.

He's not conspiring to trick people per se but he's also not being super clear. His position obviously makes it difficult to answer this question. It's possible he really believes this is better but if he didn't he wouldn't exactly tell us something that makes him and his previous employer look bad. Also his belief here may or may not be correct.

Is it a coincidence that the technical stance changed at the same time that radar shortages meant cars could not be built and shipped?

More likely there was some brainstorming as a result of the shortages, and the decision was made at that point to pursue the idea of removing the additional sensors and shipping vehicles without them. This external constraint makes it a little difficult to believe the claims that this is actually all-around better, while hearing some (anecdotal) reports of increases in ghost braking. It's not clear if there was enough data at that time to prove this, and even Andrej himself sort of acknowledges that it's worse by some small delta (but has other advantages; well, shipping cars comes to mind).

So yes, sensors have to be fused, it's complicated, it's not clear what the best combination of sensors is, the software might be larger with more moving parts, the ML model might not fit, a larger team is hard to manage, entropy - whatever. Still seems suspicious. Not sure what Tesla can do at this point to erase that; they can say whatever they want, we have no way of validating it.

xodjmk
Maybe you're right, I don't care about Tesla drama.

Here is one possible perspective from an engineering standpoint:

Same amount of $$, same amount of software complexity, same size of engineering teams, same amount of engineering hours, same amount of moving parts. One company focuses on multiple different sensors and complex fusion with some reliance on AI. Another company focuses on limited sensors and more reliance on AI. Which is better? I don't think the answer is clear.

The other point is that I am arguing that many people are overstating the importance of the sensors. They are important, but far more important is the post-processing. Any raw sensor data is a poor actual representation of the real environment. It is not about the sensors, but about everything else. The brain or the post-sensor processing is responsible for reconstructing an approximation of the environment. We have to infer from previously learned experiences of the 3D world to successfully navigate. There is no 3D information coming in from sensors, no objects, no motion, no corners, no shadows, no faces, etc. That is all constructed later. So whoever does a better job at the post-processing will probably outperform regardless of the choice of sensors.

freejazz
People absolutely get that. Their issue is that Tesla is relying only on visual data and then, on what is a disingenuous basis, insists that this is okay because humans "only need eyes", or some similar sort of strawman argument.
galangalalgol
One interesting side effect of only using visual sensors is that the failure modes will be more likely to resemble human ones. So people will say "yeah, I would have crashed in that situation too!". With ultrasonic and radar and ladar it may make far fewer mistakes but it is possible they might not be the same ones people make, so people will say "how did it mess that up?"
mbreese
Sadly, that’s the worst way to actually design the system. I’d rather have two different technologies working together, with different failure modes. Not using radar (especially in cars that are already equipped) might make economic sense to Tesla, but I’d feel safer if visual processing was used WITH radar as opposed to instead of radar.

I also expect an automated system to be better than the poor human in the driver's seat.

aeternum
It's far from the worst way, because if humans are visually blinded by the sun or snow or rain they will generally slow down and expect the cars around them to do the same.

Predictability especially around failure cases is a very important feature. Most human drivers have no idea about the failure modes of lidar/radar.

xodjmk
You have to eventually decide to trust one or the other, in real-time. So having multiple failure modes doesn't solve the problem entirely. This is called 'Fusion', meaning you have to fuse information coming from multiple sensors together. There are trade offs because while you gain different views of the environment from different sensors, the fusion becomes more complicated and has to be sorted out in software reliably in real-time.
Beldin
> You have to eventually decide to trust one or the other, in real-time.

More or less. You can take that decision on other grounds - e.g. "what would be safest to do if one of them is wrong and I don't know which one?"

The system is not making a choice between two sensors, but determining a way to act given unreliable/contradictory information. If both sensors allow for going to the emergency lane and stopping, maybe that's the best thing to do.
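
A minimal sketch of that idea (the sensor names and threshold below are invented for illustration): rather than picking a winner, act on the most pessimistic plausible reading, so the action is safe even if the more optimistic sensor turns out to be the correct one.

    def plan(camera_clearance_m, radar_clearance_m, brake_threshold_m=3.0):
        # Act on the worst case rather than choosing which sensor to trust.
        worst_case = min(camera_clearance_m, radar_clearance_m)
        return "brake" if worst_case < brake_threshold_m else "proceed"

    print(plan(camera_clearance_m=12.0, radar_clearance_m=2.0))  # -> brake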

cbsmith
> There are trade offs because while you gain different views of the environment from different sensors, the fusion becomes more complicated and has to be sorted out in software reliably in real-time.

If you're against having multiple sensors though, the rational conclusion would be to just have one sensor, but Tesla would be the first to tell you that one of the advantages their cars have over human drivers is they have multiple cameras looking at the scene already.

You already have a sensor fusion problem. Certainly more sensors add some complexity to the problem. However, if you have one sensor that is uncertain about what it is seeing, having multiple other sensors, particularly ones with different modalities that might not have problems in the same circumstance, sure makes it a lot easier to reliably get to a good answer in real time. Sure, in unique circumstances, you could have increased confusion, but you're far more likely to have increased clarity.
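
One textbook way to make "different modalities increase clarity" concrete is inverse-variance weighting of independent estimates: the fused estimate is never noisier than the best single sensor. This is only a toy sketch with made-up numbers; a real stack would use tracking and filtering (e.g. a Kalman filter) rather than a one-shot combination.

    def fuse(est_a, var_a, est_b, var_b):
        # Inverse-variance weighting of two independent range estimates.
        w_a, w_b = 1.0 / var_a, 1.0 / var_b
        fused = (w_a * est_a + w_b * est_b) / (w_a + w_b)
        fused_var = 1.0 / (w_a + w_b)
        return fused, fused_var

    # Camera says 21 m but is unsure (glare); radar says 19.5 m and is confident.
    print(fuse(21.0, 4.0, 19.5, 0.25))  # ~ (19.6, 0.24): closer to radar, variance below either alone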

xodjmk
This is one side of the argument. The other side of the argument is that what matters more than the raw sensor data is constructing an accurate representation of the actual 3D environment. So an argument could be made (which is what this guy and Tesla are gambling on and have designed the company around) that the construction & training of the neural net outweighs the importance of the actual sensor inputs. In the sense that even with only two eyes (for example) this is enough when combined with the ability of the brain to infer the actual position and significance of real objects for successful navigation. So as a company with limited R&D & processing bandwidth, you might want to devote more resources to machine learning rather than sensor processing. I personally don't know what the answer is, just saying there is this view.
cbsmith
The whole point of the sensor data is to construct an accurate representation of the actual environment, so yes, if you can do that, you don't need any sensors at all. ;-)

Yes, in machine learning, pruning down to higher signal data is important, but good models are absolutely amazing at extracting meaningful information from noisy and diffuse data; it's highly unusual to find that you want to dismiss a whole domain of sensor data. In the cases where one might do that, it tends to be only AFTER achieving a successful model that you can be confident that is the right choice.

Tesla's goal is self-driving that consumers can afford, and I think in that sense they may well be making the right trade-offs, because a full sensor package would substantially add to the costs of a car. Even if you get it working, most people wouldn't be able to afford it, which means they're no closer to their goal.

However, I think for the rest of the world, the priority is something that is deemed "safe enough", and in that sense, it seems very unlikely (more specifically, we're lacking the telltale evidence you'd want) that we're at all close to the point where you wouldn't be safer if you had a better sensor package. That means they're effectively sacrificing lives (both in terms of risk and time) in order to cut costs. Generally when companies do that, it ends in lawsuits.

jholman
Uh, that's not at all a good paraphrase.

Q: "Does [removing some sensors] make the perception problem harder, or easier?"

(note, this is literally what Lex asked, your restatement is misleading)

A: [paraphrasing] "Well more sensor diversity makes it harder to focus on the thing that I believe really moves the needle, so by narrowing the space of consideration, I think we'll get better results"

Karpathy might not be telling the truth, I don't know. But it's a much more credible pitch than you make it sound, because it's often true that you can deliver better by focusing on a smaller number of things. Engineering has always been about tradeoffs. Nobody is offering Karpathy infinite money plus infinite resources plus infinite time to do the job.

Again, I'm not saying Karpathy is honest or correct. I'm saying that the rephrasings in this comment and this thread are hilariously unfair.

flashgordon
Actually that's a fair point. My goal was only simplification, not to malign.
funstuff007
> because it's often true that you can deliver better by focusing on a smaller number of things.

This is true / dogma in linear / non-linear regression world, but of no real import in deep learning or Bayesian methods.

Kerbonut
I think they're talking about the number of different systems doing the same thing: have one system doing it that is sufficiently abstracted away from a common set of hardware, vs. various systems competing for various aspects of control.
jholman
Sorry, it's your opinion that researchers and/or engineers working on DL or Bayesian methods work better when they're distracted by many diverse tasks? What?
funstuff007
No, it's my opinion that in linear regression an inordinate amount of time is spent on feature selection and ensuring there are no correlations among the features. When data is cheap in both X and Y, winnowing down X is a lot of work.
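
As a small illustration of the chore being described (data and threshold invented for the example): checking pairwise correlations among candidate features before a linear fit, the kind of winnowing the comment says deep learning largely spares you.

    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.normal(size=200)
    x2 = 0.9 * x1 + rng.normal(scale=0.2, size=200)  # nearly collinear with x1
    x3 = rng.normal(size=200)
    X = np.column_stack([x1, x2, x3])

    corr = np.corrcoef(X, rowvar=False)
    for i in range(corr.shape[0]):
        for j in range(i + 1, corr.shape[1]):
            if abs(corr[i, j]) > 0.8:
                print(f"features {i} and {j} are highly correlated ({corr[i, j]:.2f}); consider dropping one")
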
plumefar
There seem to be 2 points of view here. One technical (sensors and the algorithms), one organisational (people and teams working on the problem).

My understanding is that by focusing on fewer things (vision only), they bet to make progress faster because of the simplified organisational aspect.

oxfordmale
It is definitely a clever marketing pitch, as there is plenty of evidence that LIDAR makes self-driving cars significantly safer. However, despite the hype, Teslas aren't really self-driving cars at the moment, so it seems an acceptable commercial decision wrapped up in a clever sales pitch.
gizmo
That's also true for high resolution maps. The question is whether you're solving for self-driving on highways or a handful of mapped city centers or whether you want to solve for the real thing. Tesla is all-in on FULL self driving, and most other companies are betting on driver assistance or gps-locked self-driving. If Tesla can get FSD to work in the next couple of years then they're vindicated. If FSD requires a weak form of generalized intelligence (plausible) then FSD isn't happening anytime soon and investing in more sensors and GPS maps is correct.
voidwtf
High resolution maps do not give you an accurate 3D representation of nearby objects.

Our brains do an amazing job interpreting high resolution visual data and analyzing it both spatially and temporally. Our brains then take that first analysis and apply a secondary, experiential, analysis to further interpret it into various categories relevant to the current activity.

What I've seen from Tesla so far indicates to me that FSD shouldn't be enabled regardless of what sensor package they're using, let alone based on camera data only. They need to solve their ability to accurately observe their surroundings first, especially temporally. Things shouldn't be flashing in and out that have been clearly visible to the human eye the entire time. Additionally, this all ignores the experiential portion of driving. When most people approach something like a blind driveway or a crosswalk obscured by a vehicle (a dynamic, unmapped situation), they pay special attention and sometimes change their driving behavior.

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.