Hacker News Comments on
National Research University Higher School of Economics
How to Win a Data Science Competition: Learn from Top Kagglers
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this URL.
I quite liked "How to win a data science competition" (https://www.coursera.org/learn/competitive-data-science) where I learned a lot about validation strategies and machine learning on tabular data. The course has its own Kaggle competition.
I also really liked "Discrete optimization" (https://www.coursera.org/learn/discrete-optimization). When I took it, it also had a competitive element: you solved optimization problems, and a leaderboard compared all the students in the current batch. That was when courses still started in batches and were free, so the experience would probably no longer be the same, unfortunately.
⬐ light_hue_1
> I quite liked "How to win a data science competition"
As a machine learning researcher I am on the one hand glad that folks are learning more about the topic. On the other hand, this is totally the wrong approach and it will teach you the wrong lessons.
The idea that you can just treat data as a uniform dump of tables and that grinding your way to high numbers is somehow worthwhile is simply terrible. The resulting systems won't work well in the real world and they produce horrific explanations of what is going on. This class teaches you not just the wrong tools, like boosting, it teaches you the wrong mental model.
I really can't think of a worse introduction to ML than this class. Even not knowing anything would actually be better.
⬐ jackallis
ok - what is the alternative?
⬐ samvher
Interesting. I definitely would not recommend this course as an only course in machine learning or indeed as an introduction, and I see where you’re coming from with the wrong mental models. I can’t be sure that I do have the right ones but I have taken a number of other courses as well and my sense is they’re ok.
My main takeaway from the course was definitely not that just grinding away for higher numbers is the right thing to do (but it might be a necessary evil in a competition context). The key thing I learned here was much more about paying very close attention that your validation strategy and your testing strategy are compatible because there are many ways you can mess it up, making your models valid in-sample only. Most of the other things I had done before were also more around SVMs and neural networks and getting some experience with decision tree based algorithms was interesting.
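To make the point about compatible validation and testing strategies concrete, here is a minimal sketch, assuming the test set lies in the future relative to the training data. The toy `rows` data and the helper names (`random_split`, `time_based_split`, `leaks_future`) are hypothetical illustrations, not material from the course. A random holdout lets the model train on rows dated after some validation rows, so the validation score can look better than the real test score; a time-based holdout mirrors the train/test boundary instead.

```python
import random

def random_split(rows, valid_fraction, seed=0):
    """Shuffle rows and hold out a random fraction for validation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]

def time_based_split(rows, valid_fraction):
    """Hold out the most recent rows, mirroring a test set that lies in the future."""
    ordered = sorted(rows, key=lambda r: r["day"])
    n_valid = max(1, int(len(ordered) * valid_fraction))
    return ordered[:-n_valid], ordered[-n_valid:]

def leaks_future(train, valid):
    """True if any training row is dated on or after the earliest validation row."""
    first_valid_day = min(r["day"] for r in valid)
    return any(r["day"] >= first_valid_day for r in train)

# Hypothetical toy data: one row per day, with a target that drifts over time.
rows = [{"day": d, "target": d * 0.5} for d in range(100)]

train_r, valid_r = random_split(rows, 0.2)
train_t, valid_t = time_based_split(rows, 0.2)

print("random split leaks future rows into training:", leaks_future(train_r, valid_r))
print("time-based split leaks future rows into training:", leaks_future(train_t, valid_t))
```

Which split is appropriate depends on how the real test set was carved out; the time-based version only helps when the test data is genuinely later than the training data.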
Two courses taught by faculty at the Russian institute HSE (Higher School of Economics).
1. How to Win a Data Science Competition https://www.coursera.org/learn/competitive-data-science
2. Bayesian Methods for Machine Learning https://www.coursera.org/learn/bayesian-methods-in-machine-l...
Kaggle is more than enough to get started. I would hire anyone who's a Master there. Probably no need even for Master, just enough knowledge to explain why one thing works and another would not.
See this course to get into Kaggle: https://www.coursera.org/learn/competitive-data-science
⬐ rasikjain
Thank you for the inputs and course reference
Mostly Kaggle -- reading others' solutions and notebooks and integrating them into my code.
Also there's a great Coursera course on ML for Kaggle: https://www.coursera.org/learn/competitive-data-science
I think once you finish it, you're better than 60% of Silicon Valley data scientists, no kidding.