HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Computer program that learns to play classic NES games

suckerpinch · Youtube · 29 HN points · 23 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention suckerpinch's video "Computer program that learns to play classic NES games".
Youtube Summary
This is an explanation and demo of software I wrote that learns how to play a Nintendo Entertainment System game and then automatically plays it. This is real.

Research paper published in SIGBOVIK 2013: "The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel ...after that it gets a little tricky."
http://tom7.org/mario/mario.pdf

Follow-up video with more games: http://www.youtube.com/watch?v=YGJHR9Ovszs
And episode 3: http://www.youtube.com/watch?v=Q-WgQcnessA

For more info and source code, see: http://tom7.org/mario/

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
tom7 is awesome. He made a hilarious video years ago about his SIGBOVIK paper/research on AI playing NES: https://www.youtube.com/watch?v=xOCurBYI_gY
There was an AI, created by the absolutely brilliant Tom7 and posted on his YouTube channel 'SuckerPinch' on April 1, 2013, that paused a game of Tetris instead of losing. In my opinion, Tom doesn't get enough credit for his work. He released his AI a full 8+ months before DeepMind released their Atari-playing AI.

Watch the whole thing, it's delightful. The AI pauses Tetris right before it loses, around 16 minutes into the video.

https://www.youtube.com/watch?v=xOCurBYI_gY

I also recommend his other videos. Brilliant guy.

Edit: Here's the paper http://tom7.org/mario/mario.pdf

etrain
Tom7’s sigbovik submissions are true works of art. https://www.cs.cmu.edu/~tom7/mario/mario.pdf
UncleMeat
My personal favorite is the compiler that uses only the printable bytes.
Jun 03, 2018 · onli on Reverse-emulating the NES
I'm not sure many people can pull this off. The author here was also great in his presentation about automated NES playing, https://www.youtube.com/watch?v=xOCurBYI_gY. You don't often see academic & hacker work combined with the right amount of humor. He even 3Dified the original Zelda!
This reminds me of AI research using NES Games. The AI eventually became proficient at completing Mario levels, and along the way it discovered novel strategies for survival, obtaining points, and finishing levels.

> Check out this timestamp to watch the machine "cheat": https://youtu.be/xOCurBYI_gY?t=9m55s

> Researcher's site about the project: http://www.cs.cmu.edu/~tom7/mario/

> The Paper: The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel...after that it gets a little tricky.: http://www.cs.cmu.edu/~tom7/mario/mario.pdf

iforgotpassword
This guy. One of my favorite YouTube channels. Releases something like once a year but oh boy, worth the wait. Check it out if you're a nerd and like creative/useless stuff. ;)
atldev
Well worth the watch. His quote when the AI pauses the game is gold!
ballenf
Speaking of easy, I spent many an hour playing that Qbert version on Atari, and a decent number of quarters on the arcade version.

The Atari version, even on the hard setting, was dumbed down almost to the point of being mindless. The enemies were just way dumber than in the arcade version. The game really didn't even feel like Qbert.

With just a little practice, one could play on a single life for as long as desired. Similar to Asteroids on Atari.

pbhjpbhj
Lol, that YouTube video: at the end the AI pauses the game of Tetris forever so as not to lose.
Sohcahtoa82
I'd call it the first example of AI rage-quitting.
dpflan
Agreed, that may be the cleverest thing the AI did.
maxander
Another case where “the winning move is not to play.”
gkilmain
I have a friend who I play chess against, and every time he's about to lose he offers me a draw. And he will continue offering me a draw until I win. Not happy to see the AI performing like that bitch MukyMuky.
gowld
Does that cause an interruption while you are playing?

Do you play on chess.com? That seems to attract unsporting players.

https://www.chess.com/forum/view/general/request-for-disabli...

gkilmain
Yes, it is on chess.com
notahacker
In repeated games your friend's strategy might actually be counterproductive, assuming he's playing against people who are rational enough to figure out that he always starts offering a draw round about the time he expects to lose, but not so good at chess that they don't sometimes miss opportunities for a quick checkmate...

I'd be disappointed if an AI chess computer invariably let me know that I had a better path to winning the game than it did.

gkilmain
That is exactly what happened. He played a friend of mine who's much lower rated. Eventually the lower-rated player found himself in a winning position, but when the draw was offered he assumed he must be missing something and accepted it. So it worked once.
Thanks for pointing that out. I wouldn't have noticed who that was otherwise.

His video on learnfun/playfun is both hilarious and amazing.

https://www.youtube.com/watch?v=xOCurBYI_gY

Here is an actual attempt at implementing a general AI to play all NES games. It's pretty interesting, but his approach does involve giving the algorithm access to the raw NES memory map:

https://www.youtube.com/watch?v=xOCurBYI_gY

Once again, reminds me of Tom Murphy's work more than 4 years ago: https://www.youtube.com/watch?v=xOCurBYI_gY
To reference: https://youtu.be/xOCurBYI_gY?t=15m57s

It's fun how the program learning to play Tetris pauses the game just before losing.

pitaj
Source of the reference is War Games (1983). https://www.youtube.com/watch?v=6DGNZnfKYnU
reminds me a bit of Tom7's Playfun: https://www.youtube.com/watch?v=xOCurBYI_gY
mngrwl
This was fucking amazing! Thanks!
dalant979
this is one of my favorite videos on the internet. always brings a smile to my face.
Why not use game emulators? With popular NES emulators you can advance the game frame by frame. You can read the raw memory addresses that correspond to the score. You can dump the memory at any time and reload the game to a specific game state. You can even manipulate the games in many fun ways by messing around with the game memory. Or give an AI algorithm access to memory addresses as additional information, instead of relying on pure machine vision, if you want to do that.

Here's an example of a guy who made a general game playing algorithm that brute-forces its way through any NES game: https://www.youtube.com/watch?v=xOCurBYI_gY This isn't necessarily interesting from an AI perspective - the playing algorithm is just brute force. But it shows what can be done with the platform, easily reloading to previous states and exploring counterfactual futures (which is exactly the sort of thing RL algorithms do). He also has a cool algorithm for finding the objective function of an arbitrary game, by watching a human play and seeing what memory addresses increment. Which is a lot easier than writing OCR code to read the score and game-over states from the screen.
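
A minimal sketch (in Python, under assumptions) of that memory-watching idea: record a RAM snapshot every frame while a human plays, then keep only the addresses whose values never decrease and go up at least once. Tom7's actual learnfun learns lexicographic orderings over memory bytes, so this is just the simplified gist; snapshots is assumed to be a list of per-frame NES RAM dumps.

    # Simplified sketch: find RAM addresses that behave like score/progress
    # counters in a recorded human playthrough. An address survives only if it
    # never decreases between frames and increases at least once.
    def find_objective_addresses(snapshots):
        candidates = set(range(len(snapshots[0])))   # start with every address
        increased = set()                            # addresses seen going up
        for prev, curr in zip(snapshots, snapshots[1:]):
            for addr in list(candidates):
                if curr[addr] < prev[addr]:
                    candidates.discard(addr)         # ever decreases -> not a counter
                elif curr[addr] > prev[addr]:
                    increased.add(addr)
        return sorted(candidates & increased)

The playing stage can then score a candidate input sequence by how much those bytes increase.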

You should look at some other things the author did. They fit in the same area of craziness, and are absolutely great. For example https://www.youtube.com/watch?v=QVn2PZGZxaI, Portmantout: A portmanteau of every English word, or the first one, https://www.youtube.com/watch?v=xOCurBYI_gY, Computer program that learns to play classic NES games. I also liked the humor in What, if anything, is epsilon?, https://www.cs.cmu.edu/~tom7/papers/epsilon.pdf.
missing the best video about teaching computers to play video games: https://www.youtube.com/watch?v=xOCurBYI_gY
In the context of AI and gaming, I definitely recommend this series of three youtube videos:

https://www.youtube.com/watch?v=xOCurBYI_gY

Some games are played better than a human would play them.

symmetricsaurus
These videos by Tom7 are indeed great. They don't display the use of deep learning though; the method is something quite different [1].

Maybe you could use the objective functions of Tom7 to determine which moves are good or bad in order to train a policy network? (A rough sketch of that idea follows below.)

[1]: http://www.cs.cmu.edu/~tom7/mario/mario.pdf
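
A rough, purely hypothetical sketch of that suggestion: treat the frame-to-frame change in a learned objective (evaluated over RAM) as the reward when collecting episodes for policy training. The env, policy, and objective interfaces below are assumptions, not anything from Tom7's code or any particular RL library.

    # Hypothetical sketch: reward = change in a learned objective over RAM.
    # `env`, `policy`, and `objective` are assumed interfaces.
    def collect_episode(env, policy, objective, max_steps=1000):
        trajectory = []                      # (observation, action, reward)
        obs = env.reset()
        prev_value = objective(env.ram())
        for _ in range(max_steps):
            action = policy.sample(obs)
            obs, done = env.step(action)
            value = objective(env.ram())
            trajectory.append((obs, action, value - prev_value))
            prev_value = value
            if done:
                break
        return trajectory

The resulting (state, action, reward) triples could then feed a standard policy-gradient update.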

If you're interested in a "dumber" approach to machines "learning" to play classic video games (including Super Mario Bros), tom7's video [1] and paper [2] on Playfun are both fascinating.

[1] https://www.youtube.com/watch?v=xOCurBYI_gY

[2] https://www.cs.cmu.edu/~tom7/mario/mario.pdf

Apr 06, 2016 · 3 points, 1 comments · submitted by d33
d33
There are three episodes of that - basically, the algorithm tries to maximize the X position in scrolling platform games, while being able to try any combination of controller buttons and see what would happen in the future. Absolutely amazing.
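
A toy sketch of that lookahead, under assumptions: save the emulator state, hold each button combination for a short horizon, score each future by the player's X position, and keep the best. The emu wrapper (save_state/load_state/step/player_x) is hypothetical; the real playfun uses learned weighted objectives and a much more elaborate search over input futures.

    import itertools

    BUTTONS = ["A", "B", "LEFT", "RIGHT"]            # simplified button set

    def best_next_inputs(emu, horizon=10):
        start = emu.save_state()
        best_score, best_buttons = float("-inf"), []
        # hold each constant button combination for `horizon` frames and see
        # which one gets the player furthest to the right
        for combo in itertools.product([False, True], repeat=len(BUTTONS)):
            emu.load_state(start)
            pressed = [b for b, on in zip(BUTTONS, combo) if on]
            for _ in range(horizon):
                emu.step(pressed)                    # advance one frame
            score = emu.player_x()                   # read X position from RAM
            if score > best_score:
                best_score, best_buttons = score, pressed
        emu.load_state(start)
        return best_buttons
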
Tom 7's work on machine learning and NES emulation is very impressive. This isn't the first time he's attacked that space; you can also see his earlier work on an automatic game-playing AI he built called "LearnFun / PlayFun" on that YouTube channel:

https://www.youtube.com/watch?v=xOCurBYI_gY

https://www.youtube.com/watch?v=YGJHR9Ovszs

https://www.youtube.com/watch?v=Q-WgQcnessA

Mar 13, 2016 · 1 points, 0 comments · submitted by doener
So as someone with only a lay understanding of what NP-Hard means - does anyone remember that neural network that someone trained to play the game[1]?

So if you can train a neural network to solve an NP-Hard problem, does that mean by definition (since NP refers to a whole class of problems) that a neural network could eventually be trained to solve any given NPH problem given the time and compute resources to do so?

[1]: https://www.youtube.com/watch?v=xOCurBYI_gY

yoavz
If a problem is NP-Hard, it does not mean it cannot be solved. Rather, it means it cannot be solved efficiently (in polynomial time). Most NPH problems, now including Mario and 3SAT, can be solved given the time and compute resources to do so.
lorenzhs
Well, to be exact, it's an open question whether they can be solved efficiently or not. The general assumption is that they can't, but it hasn't been proven yet.
gohrt
The actual games are not NP-hard problem generators. The rules of games can be generalized to create NP-hard problems.
jerf
Amplifying on the other posts, just because a problem is an instance of a class that is NP-hard doesn't mean the specific problem is that hard. As mentioned elsewhere in the conversation, subset sum is NP-complete [1], but, metaphorically, if Mario is capable of expressing the generalized subset sum problem, the human-played levels all amount to "Is there a submultiset of the numbers {1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1} that adds to 4?" Not only are all the levels completable, they're basically trivially so (see the quick check below), because what Mario game is there that involves solving sizable logic problems to progress? (Even the Ghost houses are just searches of a very small graph by computer science standards.)

[1]: https://en.wikipedia.org/wiki/Subset_sum_problem
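
For a sense of scale, the toy instance above can be checked by a few lines of brute force in Python:

    from itertools import combinations

    # Brute-force subset-sum check for the toy instance from the comment above.
    # Exponential in general, but trivial at this size (2^12 subsets).
    def has_subset_with_sum(numbers, target):
        return any(sum(c) == target
                   for r in range(len(numbers) + 1)
                   for c in combinations(numbers, r))

    print(has_subset_with_sum([1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1], 4))  # True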

runholm
A neural network does not give any guarantee of solving the problem within any number of iterations. NP-hard problems can be solved, but the solution can take a lot of time to find, and a slightly larger problem is even harder to solve.

Neural networks are an AI method of searching the solution space in a more intelligent way than "trying everything" (true for most AI methods, actually).

A similar work was done for Super Mario on NES (along with other NES games): https://www.youtube.com/watch?v=xOCurBYI_gY
Haha, I love it.

There was a SIGBOVIK submission a couple years ago that looked at byte patterns within video game RAM (specifically NES) to automate the playing of games. Pretty entertaining stuff.

https://www.youtube.com/watch?v=xOCurBYI_gY

Mar 08, 2015 · 1 points, 0 comments · submitted by Kortaggio
Mar 08, 2015 · 2 points, 0 comments · submitted by thewizardofmys
Feb 23, 2015 · 1 points, 0 comments · submitted by charlieegan3
Apr 13, 2013 · 4 points, 1 comments · submitted by s0rce
sp332
This is pretty awesome. I love the bit at the end where it pauses the game to avoid losing!
Apr 12, 2013 · 2 points, 0 comments · submitted by crabasa
Apr 11, 2013 · 1 points, 1 comments · submitted by jashmenn
jashmenn
Algorithm starts working at 7:38 - http://www.youtube.com/watch?v=xOCurBYI_gY&t=7m38s
Apr 11, 2013 · 5 points, 2 comments · submitted by macmac
acchow
"The only winning move is not to play."
macmac
The full title of the paper on the program is:

"The first level of Super Mario Bros. is easy with lexicographic ordering a and time travel ...after that it gets a little tricky."

The title alone merits reading the paper.

Apr 10, 2013 · 1 points, 0 comments · submitted by mike_esspe
Apr 10, 2013 · 6 points, 0 comments · submitted by myleshenderson
Apr 03, 2013 · scott_s on None
Also, video version: http://www.youtube.com/watch?v=xOCurBYI_gY
Apr 02, 2013 · 2 points, 0 comments · submitted by rntz
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.