HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Google's DeepMind AI Just Taught Itself To Walk

Tech Insider · YouTube · 30 HN points · 4 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Tech Insider's video "Google's DeepMind AI Just Taught Itself To Walk".
YouTube Summary
Google's artificial intelligence company, DeepMind, has developed an AI that has managed to learn how to walk, run, jump, and climb without any prior guidance. The result is as impressive as it is goofy.

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Hardware is hard.

Making an AI that learns to control a purely virtual body appears to be a fairly standard project.

1) https://youtu.be/XrOTgZ14fJg

2) https://youtu.be/882O_7hsAms

3) https://youtu.be/1kV-rZZw50Q

4) https://youtu.be/gn4nRCC9TwQ

ambicapter
A virtual body in a virtual world. I would say "real world is hard" is the more likely conclusion than "hardware is hard".
Apr 20, 2022 · 1 point, 0 comments · submitted by ColinWright
I'm not sure about doing that with physical robots, but I've seen a couple of interesting iterations using simulated robots:

https://www.youtube.com/watch?v=gn4nRCC9TwQ

Jul 09, 2018 · mockingbirdy on Logic Theorist
If you watch videos of AIs learning to run [1] or playing Mario [2], you see that there's definitely "try random shit until your reward function gives you positive feedback" as a method for training AIs.

The difficult part is developing the reward function. There's a lot of intelligence in our hormone-based incentive system: that we feel pain, that we feel sad, etc. Many of those emotions are pre-programmed. If we find a way to design reinforcement systems with the right goals, we can do it like mother nature has done it for several billion years. But that still requires a ton of computational power.

If we look at nature as a massively parallelized computational system that defined "just live" as the reward (not because it "wants" it, but because it just emerged), it shows us how much power we would need if we would try to build a completely random process that gains intelligence.

I think your proposed method works ("just assemble random solutions and test them") because we've already seen it in nature. But it takes a lot of time and energy, regardless of our computational power. I think we're faster when we find the right boundaries and reward functions instead of randomly trying stuff.

[1]: https://youtu.be/gn4nRCC9TwQ

[2]: https://youtu.be/qv6UVOQ0F44
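To make the "try random stuff until the reward function gives positive feedback" idea concrete, here is a minimal sketch of random-perturbation search. Everything in it is an assumption for illustration: the `reward` function, its `target` values, and the parameter names are made up, and this is not how the walkers in the linked videos were trained.

```python
import random


def reward(params):
    # Hypothetical reward: highest (zero) when params match an arbitrary target.
    target = [0.5, -0.2, 0.8]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def random_search(n_iters=2000, step=0.1, seed=0):
    """Perturb the current best guess at random; keep a candidate only if it scores higher."""
    rng = random.Random(seed)
    best = [0.0, 0.0, 0.0]
    best_reward = reward(best)
    for _ in range(n_iters):
        # Propose a random nearby candidate.
        candidate = [p + rng.gauss(0.0, step) for p in best]
        candidate_reward = reward(candidate)
        # Positive feedback from the reward function is the only learning signal.
        if candidate_reward > best_reward:
            best, best_reward = candidate, candidate_reward
    return best, best_reward


if __name__ == "__main__":
    params, score = random_search()
    print(params, score)
```

Even on this toy problem most candidates are thrown away, which hints at why the approach gets expensive once the parameter space is as large as a walking controller.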

s-shellfish
> If we look at nature as a massively parallelized computational system that defined "just live" as the reward (not because it "wants" it, but because it just emerged), it shows us how much power we would need if we would try to build a completely random process that gains intelligence.

With that definition, I can't see how the behavior of created artificial life would be much different from that of a computer virus.

> I think we're faster when we find the right boundaries and reward functions instead of randomly trying stuff.

Boundaries and reward functions in human society are part of the 'human social organism' that allows each individual human to function with autonomy in a fashion that is collaborative, aligned with our developed value systems, and allows us to live with stability, security, faith (be it in some sense of wonder, divinity, in each other - doesn't matter), independence - and these things are base needs, no matter what variation they manifest with. Boundaries are redefined when there is conflict, and the less violent and destructive the conflict, the better the chances are for these boundaries and reward functions to continue functioning as they have been developed (rather than being obliterated in entirety, requiring them to be built from scratch).

It makes us faster, but it's also us standing on the shoulders of giants. And I think it's important to question what things are worth applying random solutions to, and what things require deep contemplation. It seems very paradoxical, to try to define something that both is a function of the system it exists in, and also something that could potentially break the whole system. But, creation, destruction, clearly an oversimplification.

I do know that what looks random to one person is not necessarily random to another. This goes back to how the context is defined, how society is defined, boundaries and reward functions established a priori.

Emotions can be simple. The problem is that a computer already has them. Computer produces wrong solution, algorithm dies. Computer produces right solution, algorithm survives. We don't give the computer words to express itself about this because we never taught it to do that. What would happen if we did?

mockingbirdy
> try to define something that both is a function of the system it exists in, and also something that could potentially break the whole system

That is what's so damn hard about shaping societies (and the market). The recursive feedback loop and the self-interference.

It's also on individual levels: We're able to dynamically adapt our reinforcement system using meta-cognition (based in the prefrontal neocortex).

I understand the other parts of the answer, but I can't really see what you are trying to say with them (e.g. boundaries, society and deciding if we apply randomness or not).

s-shellfish
I tend to be of the mindset that over the long term, you are going to get a good perspective if you approach the problem 50/50. Apply randomness half the time, apply the knowledge you know the other half. Divide and conquer, sort of.

Randomness opens up the space in which you can identify errors and, consequently, find mistakes and errors in reasoning (Monte Carlo simulations are traditionally used for this sort of thing). The other good thing about randomness is that it lets you see errors as 'not errors', e.g. as a tool that can be developed and structured, a new way to see the problem, a creative approach. Some melding between the two seems to be something of significance for an AI. So you don't want it to be all random, because you need stability, structure, you need a 'language' or an 'awareness' you understand that isn't so chaotic and constantly changing that you can't even find a place to put your feet on the ground.

It doesn't have to be a perfect 50/50 balance, because that has its own set of problems - divide and conquer all the divisions and you still wind up with 2^n newly defined problem spaces to interpret, possibly losing sight of the bigger picture or maintaining independent direction in one focused lineage of all those spaces. Just very generally, maintain balance, because the world is chaos sometimes.
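One common way to express "randomness part of the time, what you already know the rest of the time" is an epsilon-greedy rule on a multi-armed bandit. The sketch below is only an illustration under assumptions: the arm payoffs in `true_means` are made up, and `epsilon=0.5` stands in for the 50/50 split described above.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon=0.5, n_steps=5000, seed=0):
    """With probability epsilon pull a random arm; otherwise pull the best-looking arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick at random
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit: use knowledge
        payoff = rng.gauss(true_means[arm], 1.0)  # noisy payoff from the chosen arm
        counts[arm] += 1
        # Incremental running mean of the payoffs observed for this arm.
        estimates[arm] += (payoff - estimates[arm]) / counts[arm]
        total += payoff
    return estimates, counts, total / n_steps


if __name__ == "__main__":
    estimates, counts, average = epsilon_greedy_bandit([0.1, 0.5, 0.9])
    print(estimates, counts, average)
```

Lowering `epsilon` leans on existing knowledge more, raising it leans on randomness more, so the balance does not have to sit at exactly one half.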

It's like a stream of information. All the analysis to all of that space is meaningless if the context changes enough. So, adapt to a new context.

Honestly though, I don't know what I'm trying to say much of the time, aside, 'help', lol. Life is terrifying. :)

mockingbirdy
> Honestly though, I don't know what I'm trying to say much of the time, aside, 'help', lol. Life is terrifying. :)

from another comment made by you:

> Evolution does not care if an asteroid hits the earth and wipes out all life as it presently exists.

My advice: Stop worrying. Enjoy the randomness :) Maybe book a flight to Asia. Life is short. Embrace uncertainty. Sell luxurious sanitary pads to rich women in their 40s. Dress well and be funny. Then suddenly change your mood to sexy, ask a stranger for a kiss. I now write random love letters to my ex-girlfriends. Let's see where randomness will lead us to. See you on the other side of existence.

mockingbirdy
Found the boundaries pretty early. Wearing nothing but socks and singing "Why does it hurt when I pee?" while trying to cross a freeway is not considered "appropriate in the public". shmocks everywhere.

I'm in jail now. I'm free now. But a little cold.

s-shellfish
Meh.
Oct 12, 2017 · abledon on Competitive Self-Play
It didn't annoy me, but it definitely gave off a Disney Trailer vibe.

I can just hear the deep announcer voice over the video coming in:

"Coming Winter 2017, A movie about adventure, friendship, AND simulated self-play! Join Burp, Jenny and Bop on a fantastic adventure .... blah blah ..." etc.. hah

Here's an example of good, unobtrusive but fun music for showing AI bots: https://www.youtube.com/watch?v=gn4nRCC9TwQ

Jul 14, 2017 · 1 point, 0 comments · submitted by michaelmwangi
Jul 13, 2017 · 27 points, 5 comments · submitted by Huhty
bko
If anyone is interested in learning more about how reinforcement learning works, I would highly recommend OpenAI Gym [0]. It provides a controlled environment where you can develop learning algorithms to accomplish an objective goal. It also has walkers similar to the one in the video [1]. For learning more about reinforcement learning, I would recommend Richard S. Sutton's book Reinforcement Learning: An Introduction. He has a new draft available free online [2]. Or the course Practical RL [3].

[0] https://gym.openai.com/envs

[1] https://gym.openai.com/envs#mujoco

[2] http://ufal.mff.cuni.cz/~straka/courses/npfl114/2016/sutton-...

[3] https://github.com/yandexdataschool/Practical_RL

honestoHeminway
We are pretty good at walking and moving forward fast.

I wonder how fast an AI could get at climbing-walking alterations when given a challenging terrain.

pepon
one more sensationalist headline for deepmind

https://www.google.es/search?q=mujoco+learning+to+walk#q=muj...

lunlelo
Thank you for the vid

throwaway198411
silly deepmind, toes are for walking heels are for balance

anotheryou
Clearly needs some energy constraint factored in to look less like monty python :) I also wonder if the gravity is realistic.

A similar video from 2013: https://www.youtube.com/watch?v=pgaEE27nsQw Looks much smoother, because of muscle and nerve simulations. At 0:55 you can see the generations learning how to walk.
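As a rough sketch of what "factoring in an energy constraint" could mean, one option is to subtract a penalty proportional to the squared joint torques from the forward-progress reward. The function name, the `energy_weight` value, and the torque numbers here are assumptions for illustration, not the reward used in the video.

```python
def shaped_reward(forward_velocity, torques, energy_weight=0.01):
    """Reward forward progress, minus a penalty for the effort spent on the joints."""
    # Sum of squared torques is a simple stand-in for energy use.
    energy_cost = sum(t * t for t in torques)
    return forward_velocity - energy_weight * energy_cost


if __name__ == "__main__":
    # Same forward speed, very different effort.
    smooth = shaped_reward(1.0, [1.0, -1.0, 0.5])
    flailing = shaped_reward(1.0, [10.0, -10.0, 8.0])
    print(smooth, flailing)
```

With a penalty like this, two gaits that cover the same ground are no longer scored equally, so the flailing one loses out.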

Jul 13, 2017 · 1 point, 0 comments · submitted by pathompong
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.