HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning

DeepMind · Youtube · 26 HN points · 12 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention DeepMind's video "RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning".
Youtube Summary
Reinforcement Learning Course by David Silver, Lecture 1: Introduction to Reinforcement Learning

Slides and more info about the course: http://goo.gl/vUiyjq
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
I would start with David Silver's (DeepMind) YouTube series to get an idea of what's possible and what isn't.

Running an already trained reinforcement learning agent is relatively cheap (unless your model is massive).

I suspect people aren't using it yet because it's (a) really difficult to get right in training (even basic convergence is not guaranteed without careful tuning) and (b) really difficult to guarantee reasonable behavior outside of the scenarios you're able to reach in QA.

edit: Link to lecture series https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PLqYmG7hTra...

I almost knew exactly who it was going to be when they mentioned AlphaGo Developer.

For those who aren't that well versed in RL, I recommend watching his lectures at UCL (https://www.youtube.com/watch?v=2pWv7GOvuf0). Really clear explanations that went hand in hand with Sutton and Barto's introductory book, which I was reading at the time.

manthideaal
Another link, with PDFs of the lectures, exam and assignment, plus a link to the lecture videos, is at (1).

(1) https://www.davidsilver.uk/teaching/

I also recommend that interested people watch David Silver's RL lectures at UCL on YouTube. He covers material from the book.

https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r...

I came across this a few days ago: For Reinforcement Learning specifically, the standard text is Reinforcement Learning: An Introduction[1]. Dave's UCL Course on RL[2] is great too (playlist of all lectures)[3].

Source: Julian Schrittwieser, who works at DeepMind (Google): http://www.furidamu.org/

[1]http://incompleteideas.net/book/the-book-2nd.html

[2]http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html

[3]https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r...

kejaed
Thanks for this list. I’d like to get up to speed on RL and see how we can apply it for path planning and control of our gliding parachute UAVs.
jcims
Ooh! I want one of those for high altitude balloon payload recovery. Got any small ones or hobby grade projects doing this that you’re aware of?
kejaed
No hobby ones I know of, although I haven't looked into the hobby space too much. All I can say about smaller systems is give our BD guys a call... =D
mark_l_watson
thanks!!
MasterScrat
Beyond those, my favorite resources are:

- "An Introduction to Deep Reinforcement Learning" by Vincent François-Lavet et al (https://arxiv.org/pdf/1811.12560.pdf)

- "A (Long) Peek into Reinforcement Learning" by Lilian Weng (https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-...)

- "Deep Reinforcement Learning: Pong from Pixels" from Andrej Karpathy (https://karpathy.github.io/2016/05/31/rl/)

Those are the basics. Some more resources listed on this post: https://news.ycombinator.com/item?id=18219620

The notions of agent, environment and reward are encountered in Reinforcement Learning which is a sub-field of ML but also relevant to biological agents. There is a great RL course on YouTube by David Silver (one of the creators of AlphaGo, which is probably the most famous application of RL so far).

https://www.youtube.com/watch?v=2pWv7GOvuf0
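
To make the agent, environment and reward notions concrete, here is a minimal sketch of the interaction loop. The `Corridor` environment, its reward of 1.0 at the right end, and the two policies are all made up for illustration; none of it is taken from the course.

```python
import random


class Corridor:
    """Hypothetical toy environment: positions 0..length-1, reward 1.0 at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one agent-environment loop and return the total reward collected."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # the agent picks an action
        state, reward, done = env.step(action)  # the environment responds
        total += reward
        if done:
            break
    return total


rng = random.Random(0)


def random_policy(state):
    # ignores the state and moves left or right at random
    return rng.choice([-1, 1])


def always_right(state):
    return 1


print(run_episode(Corridor(), always_right))
print(run_episode(Corridor(), random_policy))
```

The agent only ever sees states and rewards; everything else about the corridor is hidden inside `step`, which is the separation the RL framing relies on.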

The course can properly set the perspective of RL and shed new light on the philosophy of mind, if you take what it says and then extrapolate to humans and other agents.

What I find fascinating about RL is that it can be defined concretely. Consciousness can only be defined by reference to other words, in a less exact and concrete way. RL can also explain how meaning appears, based on future reward prediction. The rich sensations we have can be explained by an encoding-decoding architecture based on reconstruction error. Many difficulties in RL map back to difficulties humans have in choosing how to act - the exploration-exploitation tradeoff, instinct vs reason (two different ways to perform RL, one based purely on rewards and the other based on an internal model of the world and rewards). Some of the problems related to multi-agent RL are also covered in Game Theory, such as the prisoner's dilemma.
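
The exploration-exploitation tradeoff mentioned above can be sketched with an epsilon-greedy multi-armed bandit. This is one simple textbook strategy, chosen here as an assumption for illustration; the arm probabilities and the function name are hypothetical.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=5000, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy action selection.

    true_means: hypothetical success probability of each arm (unknown to the agent).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random arm
        else:
            # exploit: pick the arm with the best estimate (ties go to the first arm)
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the running average reward for this arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward


estimates, counts, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(estimates, counts, total)
```

Setting `epsilon=0.0` shows why some exploration is needed: with arms `[0.0, 1.0]` this sketch keeps pulling the first arm forever, because its estimate never drops below the untried second arm's initial estimate of 0.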

Regarding philosophy of language: we have today numerical representations of meaning in words and phrases. They are usually represented as vectors in high dimensional space or sets of vectors. For example it is customary to use 300-1000 dimensions for representing words. These vectors have a nice property - the closer two words are in meaning, the smaller the angle between the two vectors. They are derived by trying to predict a word from its context, or vice versa, on a corpus of text several GB in size.
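
As a rough illustration of the "smaller angle means closer meaning" property, the angle can be computed from the cosine similarity of two vectors. The 3-dimensional vectors below are invented for the example; real embeddings have hundreds of dimensions and are learned from text.

```python
import math


def angle_between(u, v):
    """Return the angle in degrees between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = dot / (norm_u * norm_v)
    cos = max(-1.0, min(1.0, cos))  # guard against floating point drift
    return math.degrees(math.acos(cos))


# Made-up toy "embeddings", not taken from any trained model
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print(angle_between(vectors["cat"], vectors["dog"]))
print(angle_between(vectors["cat"], vectors["car"]))
```

With these toy numbers the cat/dog angle comes out much smaller than the cat/car angle, which is the behaviour the comment describes for learned word vectors.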

Many ideas from the philosophy of language, such as the meaning of words being related to the 'game' being played (activity with a purpose), emerge naturally from successful AI models. I'd say that where philosophy had a glimpse, AI has a testable implementation that can solve real world problems. Where philosophy uses mere words, AI uses probability distributions and datasets to define such models. The brain is probably doing something similar.

Other things that AI has managed to do so far: to encode images into latent representations and back, to synthesise images. Same for speech - we have speech recognition and text to speech. Some modules used in neural networks are analogous to imagination, attention, memory, emotion, intuition and many other aspects of the mind. On narrow domains computers can already best humans at perception.

The piece that is missing in AI to match human level is the prior knowledge encoded in our genes. We have been optimised to learn and function well in our environment by evolution. That means our verbal areas in the brain have a notion of invariance to time translation, visual areas have invariance to space translation, and conceptual areas have an invariance to permutation. There might be more invariances but we just don't know yet and that's why AI models are not yet up to par with humans. But we can still learn a lot about ourselves by analogy with AI agents. And that's where I think philosophy should listen.

As someone who is doing his bachelor thesis on Reinforcement Learning, I find this information useful.

OT (but not really) question: does anyone here use Reinforcement Learning techniques at work? For the thesis I am working on black-box optimization of 2 variable functions with Reinforcement Learning (and comparing it with Bayesian Optimization techniques).

As someone else suggested, the Sutton & Barto book is a really great source of knowledge, but I would also like to suggest these lessons (https://www.youtube.com/watch?v=2pWv7GOvuf0) by David Silver (who worked on AlphaGo).

Q6T46nT668w6i3m
I regularly use reinforcement methods for simple robotics problems. TensorFlow, et al. have made their use practical.
chudi
We used it to discover the best picture of a product in terms of click-through rate. We also found that end users really don't want you to change the order of the pictures of their products.
http://www.fast.ai/

https://www.udacity.com/course/intro-to-machine-learning--ud...

https://agi.mit.edu/

https://www.youtube.com/watch?v=eLbMPyrw4rw&list=PL6EE0CD029...

https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r...

https://www.youtube.com/watch?v=NfnWJUyUJYU&list=PLkt2uSq6rB...

igravious
fast.ai Making neural nets uncool again. fast.ai is dedicated to making the power of deep learning accessible to all.[0]

Udacity. Intro to Machine Learning: Pattern Recognition for Fun and Profit[1]

MIT 6.S099: Artificial General Intelligence[2]

Lecture Series on Artificial Intelligence by Prof. P. Dasgupta, Department of Computer Science & Engineering, Indian Institute of Technology, Kharagpur[3]

DeepMind. Reinforcement Learning Course by David Silver[4]

Stanford Winter Quarter 2016 class: CS231n: Convolutional Neural Networks for Visual Recognition.[5]

[0] http://www.fast.ai/

[1] https://eu.udacity.com/course/intro-to-machine-learning--ud1...

[2] https://agi.mit.edu/

[3] https://www.youtube.com/watch?v=eLbMPyrw4rw&list=PL6EE0CD029...

[4] https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r...

[5] https://www.youtube.com/watch?v=NfnWJUyUJYU&list=PLkt2uSq6rB...

You can check this Reinforcement Learning Course by David Silver on YouTube: https://www.youtube.com/watch?v=2pWv7GOvuf0&t=836s

By the way, I believe David Silver was the lead programmer for AlphaZero.

Mar 18, 2017 · deepnet on Ask HN: Best books on AI?
Sutton & Barto's Reinforcement Learning completes this triumvirate

https://webdocs.cs.ualberta.ca/~sutton/book/the-book-2nd.htm...

David Silver's Reinforcement Learning Course is based on Sutton & Barto

https://www.youtube.com/watch?v=2pWv7GOvuf0

thedailymail
For whatever reason, Sutton's was the first serious book I read in any area of AI. The balance between explaining the history, the concepts and the code is handled really well.
jumpCastle
I liked all three of them. I also feel Murphy's MLAPP provides additional useful material and is well written.
Jun 17, 2016 · deepnet on Generative Models
David Silver's Reinforcement Learning Course teaches from Sutton & Barto's book.

https://www.youtube.com/watch?v=2pWv7GOvuf0

Mar 13, 2016 · 25 points, 1 comment · submitted by magoghm
phodo
For reference, here are the books mentioned in the course:

Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto

https://webdocs.cs.ualberta.ca/~sutton/book/ebook/the-book.h...

Algorithms for Reinforcement Learning, by Csaba Szepesvari

http://www.ualberta.ca/~szepesva/papers/RLAlgsInMDPs.pdf

edit: added other book

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.