Hacker News Comments on
Learning From Data
7 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this book.
I really enjoyed the books "Learning From Data"[1] and "Programming Collective Intelligence"[2]. Both are accessible to beginners.
Learning From Data gives a more theoretical introduction to machine learning. One of the central ideas from the book that I still think about often is that machine learning is merely function approximation. There exists a function which will drive a car perfectly, but we don't know what that function is, so we try to approximate that function with machine learning.
Programming Collective Intelligence is a more hands-on introduction to machine learning. The book has examples in Python, but I believe the Python code is low quality. Ignoring the example code (and I did ignore it), the book is a very enjoyable introduction to many different machine learning algorithms. If you don't know the difference between linear regression, nearest-neighbors clustering, support vector machines, and neural networks, this book will explain how each of these works and give a good intuition about when to use each.
[1] http://www.amazon.com/gp/product/1600490069 [2] http://www.amazon.com/Programming-Collective-Intelligence-Bu...
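To make the "machine learning is function approximation" idea from the comment above concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from either book: the `target` function, the noise level, and the helper names `sample_data` and `fit_line` are all hypothetical. A learner sees only noisy samples of an unknown function and picks the best straight line by ordinary least squares.

```python
import random


def target(x):
    # Hypothetical "unknown" target function. A real learner never sees
    # this definition, only samples drawn from it.
    return 3.0 * x + 2.0


def sample_data(n, noise=0.1, seed=0):
    # Draw n noisy (x, y) examples of the target function.
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [target(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    # Ordinary least squares for the hypothesis g(x) = w * x + b.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


xs, ys = sample_data(200)
w, b = fit_line(xs, ys)
print(f"learned g(x) = {w:.2f} * x + {b:.2f}")
```

The learned `w` and `b` should land close to the hidden 3.0 and 2.0, which is the whole point: the approximation is recovered from data alone, without ever reading the target's definition.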
Prof Yaser S. Abu-Mostafa's Caltech course "Learning from Data" (http://work.caltech.edu/telecourse) is probably the best introductory course for really understanding the physics of how machine learning works.
See Prof Yaser's 1 min overview: http://www.youtube.com/watch?v=KlP0DpiM7Lw
The "Learning from Data" videos are online for free, and the book is on Amazon...
Videos: http://home.caltech.edu/lectures.html
Book: http://www.amazon.com/Learning-From-Data-Yaser-Abu-Mostafa/d...
The course is also available on EdX: https://www.edx.org/course/caltechx/cs1156x/learning-data/11...
⬐ eli_gottlieb
I've got the book. It's a great book, even though the Machine Learning course here at Technion is more Bayesian than AML's seemingly PAC and VC-focused book.
For a glimpse into machine learning, check out Professor Yaser Abu-Mostafa's "Learning From Data" course from Caltech. The videos are online for free (http://work.caltech.edu/telecourse.html, https://www.edx.org/course/caltechx/cs1156x/learning-data/11...), and its corresponding book is on Amazon (http://www.amazon.com/Learning-From-Data-Yaser-Abu-Mostafa/d...).
Also Professor Ng's course from Stanford (http://cs.stanford.edu/people/ang/?page_id=22).
Don't worry about needing to catch up. Stuff is moving so fast these days, you're always working with something new. Everyone is in a continual update mode so it's not like you have 10 years of catching up to do. Tech has turned over 10 times since then. You could say 10 years and 2 years are functionally equivalent from a new tech point of view.
And don't worry about corps and recruiters. Focus on a problem you want to solve, and update your skills in the context of learning what you need to know to solve that problem. If you can leverage your industry experience in the problem domain, even better.
Data is driving everything so developing a data analysis/machine learning skillset will put you into any industry you want. Professor Yaser Abu-Mostafa's "Learning From Data" is a gem of a course that helps you see the physics underpinning the learning (metaphorically of course -- ML is mostly vectors, matrices, linear algebra and such). The course videos are online for free (http://work.caltech.edu/telecourse.html), and you can get the corresponding book on Amazon -- it's short (http://www.amazon.com/Learning-From-Data-Yaser-Abu-Mostafa/d...).
Python is a good general purpose language for getting back in the groove. It's used for everything, from server-side scripting to Web dev to machine learning, and everywhere in between. "Coding the Matrix" (https://www.coursera.org/course/matrix, http://codingthematrix.com/) is an online course by Prof Philip Klein that teaches you linear algebra in Python so it pairs well with "Learning from Data".
Clojure (http://clojure.org/) and Go (http://golang.org/) are two emerging languages. Both are elegantly designed with good concurrency models (concurrency is becoming increasingly important in the multicore world). Rich Hickey is the author of Clojure -- watch his talks to understand the philosophy behind the design (http://www.infoq.com/author/Rich-Hickey). "Simple Made Easy" (http://www.infoq.com/presentations/Simple-Made-Easy) is one of those talks everyone should see. It will change the way you think.
Knowing your way around a cloud platform is essential these days. Amazon Web Services (AWS) has ruled the space for some time, but last year Google opened its gates (https://cloud.google.com/). Its high-performance cloud platform is based on Google search, and learning how to rev its engines will be a valuable thing. Relatively few have had time to explore its depths so it's a platform you could jump from.
Hadoop MapReduce (https://hadoop.apache.org/, http://www.cloudera.com, http://hortonworks.com/) has been the dominant data processing framework the last few years, and Hadoop has become almost synonymous with the term "Big Data". Hadoop is like the Big Data operating system, and true to its name, Hadoop is big and bulky and slow. However, there is a new framework on the scene that's true to its name. Spark (http://spark.incubator.apache.org/) is small and nimble and fast. Spark is part of the Berkeley Data Analytics Stack (BDAS - https://amplab.cs.berkeley.edu/software/), and it will likely emerge as Hadoop's successor (see last week's thread -- https://news.ycombinator.com/item?id=6466222).
ElasticSearch (http://www.elasticsearch.org/) is good to know. Paired with Kibana (http://www.elasticsearch.org/overview/kibana/) and LogStash (http://www.elasticsearch.org/overview/logstash/), it's morphed into a multipurpose analytics platform you can use in 100 different ways.
Databases abound. There's a bazillion new databases and new ones keep popping up for increasingly specialized use cases. Cassandra (https://cassandra.apache.org), Datomic (http://www.cognitect.com/), and Titan (http://thinkaurelius.github.io/titan/) to name a few (http://nosql-database.org/). Redis (http://redis.io/) is a Swiss Army knife you can apply anywhere, and it's simple to use -- you'll want it on your belt.
If you're doing Web work and front-end stuff, JavaScript is a must. AngularJS (http://angularjs.org/) and ClojureScript (https://github.com/clojure/clojurescript) are two of the most interesting developments.
Oh, and you'll need to know Git (http://git-scm.com, https://github.com). See Linus' talk at Google to get the gist (https://www.youtube.com/watch?v=4XpnKHJAok8 :-).
As you can see, the opportunities for learning emerging tech are overflowing, and what's cool is the ways you can apply it are boundless. Make something. Be creative. Follow your interests wherever they lead because you'll have no trouble catching the next wave from any path you choose.
⬐ jnardiello
Thanks for this. Quite incredibly valuable comment. This is why I love HN.
⬐ christiangenco
I'm a web developer that considers myself "up-to-date" but there was quite a bit in there that I need to read up on (notably Hadoop and ElasticSearch). Thanks for the links!
I'd also recommend, as some alternatives:
* Ruby as an alternative "general purpose language"
* Mongo as an alternative swiss army database
* Backbone + Marionette as an alternative front-end JS framework
* CoffeeScript as a better Javascript syntax
To tag onto this, I found "Learning from Data" [0] by Abu-Mostafa to be a great intro to the field. It's not heavy on the math, but it doesn't gloss over it either.
[0]: http://www.amazon.com/Learning-From-Data-Yaser-Abu-Mostafa/d...
⬐ veven
Unfortunately Amazon won't ship this book outside of the United States.
⬐ hypertext
⬐ scottedwards
Actually, according to the authors' website (http://amlbook.com/), Amazon does ship the Learning from Data book to many different countries outside the US.
⬐ winter_blue
You should be able to get (illegal) PDFs of most popular books with a simple Google search. I found a PDF of the ML book I mentioned earlier as the top result on Google for "<name of book> pdf".
Admittedly epub is a better format, because it naturally reflows on smaller screens, but "free" epubs are harder to come across. I've been thinking of converting some really good PDFs that I have to ePub myself, but just haven't gotten around to it yet.
Have to agree. And it's very inexpensive because Yaser refused to give in to academic publishers, who would've charged the typical $70-80, and self-published so he could offer it for less than half the cost.
Not only is the book great, but his lectures are PHENOMENAL. He breaks concepts down in such a careful, accessible way. It's a bit late to join the online course, but you can see all the lectures on YouTube (work.caltech.edu/telecourse.html) or iTunesU (I prefer the latter, using the app on iOS - awesome b/c you can bookmark and record notes at those marks - otherwise I notice these video types of courses are way less useful - no way to review - wish Coursera/Udacity/EdX had that feature.)
Yaser is an awesome guy btw - he's very active on the forum (see the link from the above Caltech site - on right hand side). He is very gracious with his time - I'm not a Caltech student, and yet he has answered all my questions and even helped me find a tutor for the course who was a previous student at Caltech (I live in Pasadena). He truly cares - and that comes off in the lectures as well. Enjoy!
⬐ antman
I take notes on all videos with http://videonot.es
⬐ manish_gill
Agreed with everything you said. The only thing missing from his lectures is the homework assignments, which are only available to those who signed up for the online course (signups are closed now), and I can't even make a post about it on the forums, because I don't have the book. :(
The course teacher's book costs $828 + $4 shipping: http://www.amazon.com/gp/offer-listing/1600490069/
Is this an error?
⬐ andymatuschak
His book is not yet available. That seller is likely not legitimate.
⬐ charliel
From http://amlbook.com, which is hidden within the page http://www.amazon.com/gp/product/1600490069, the book will become available on Mar 26 on Amazon for $28, not $828.