My favourite ML and stats books

Despite the advent of online courses (Coursera, Udemy, DataCamp), open courseware from big-name universities and a plethora of information now available with just a quick web search (all of which I am a big fan of and utilise quite frequently, don’t get me wrong), there’s a certain appeal to working through a physical textbook. Not to mention that there’s an element of rigour, academic integrity and basic proofreading that is often absent from the posts of many Data Science/AI blogs (this one included). Below is a list of my favourite books on Data Science and Machine Learning, in roughly descending order of how much I like them. I’ve purposely excluded textbooks prescribed to me in uni.

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd Edition), by Aurélien Géron

Density of maths: 🧮🧮🧮

Useable code: 🤖🤖🤖🤖🤖

Practical advice: 🏦🏦🏦🏦

Overall Rating: ⭐⭐⭐⭐⭐

This book is an absolute classic and perhaps my most turned-to reference guide. It’s quite thorough, with a good blend of theory/maths, code snippets you can use and practical advice for a data scientist. It has good coverage of traditional machine learning as well as deep learning (with TensorFlow). If I could only have one book on this list, I’d probably choose this one!

An Introduction to Statistical Learning with Applications in R

Density of maths: 🧮🧮🧮🧮

Useable code: 🤖🤖🤖

Practical advice: 🏦🏦🏦

Overall Rating: ⭐⭐⭐⭐⭐

Written by a group of stats professors, including Stanford’s Trevor Hastie and Robert Tibshirani, this was the text that I was obsessed with when I first got sucked into the world of machine learning. It’s heavier on theory than it is on useable code snippets or practical applications, but the explanations of the theory behind all the traditional machine learning algorithms are top notch. An amazing read, and it’s freely available online as well! This book has a lot of nostalgia value for me and holds a special place in my heart. Its companion text, The Elements of Statistical Learning, goes into much deeper mathematical detail and is also freely available online.

Automate the Boring Stuff with Python: Practical Programming for Total Beginners, by Al Sweigart

Density of maths: 🧮

Useable code: 🤖🤖🤖🤖🤖

Practical advice: 🏦🏦🏦🏦🏦

Overall Rating: ⭐⭐⭐⭐⭐

I first learned Python by following this book and its accompanying Udemy course. The book teaches you Python by having you create little programs that help automate the boring stuff in your life. This is certainly a much more engaging way to learn Python: rather than stepping through dry examples of functions, control structures, modules and so on, it gets you excited by the prospect of making cool little programs that will save you time. I still remember staying up until 6 am one day because I was so fascinated by the idea of creating a program that would automate my timesheet entries (back when I was a consultant at one of the Big Four). I still think it’s one of the better ways to pick up Python.

Deep Learning with Python, by François Chollet

Density of maths: 🧮🧮

Useable code: 🤖🤖🤖🤖🤖

Practical advice: 🏦🏦🏦🏦🏦

Overall Rating: ⭐⭐⭐⭐⭐

This is a really nice practical guide to implementing neural networks. It skimps a bit on the maths, but that’s fine as the book is focused on practical implementation. There are a fair few cool learning projects in the book to attempt as well. Of note: it’s written by the creator of Keras!

Introduction to Machine Learning with Python: A Guide for Data Scientists, by Andreas C. Müller and Sarah Guido

Density of maths: 🧮

Useable code: 🤖🤖🤖🤖

Practical advice: 🏦🏦🏦🏦

Overall Rating: ⭐⭐⭐⭐

A more beginner-oriented book than Hands-On Machine Learning, this gives a good introduction to traditional machine learning. I’d probably recommend either this or Hands-On Machine Learning for a newcomer to data science.

The Art of Statistics: Learning from Data, by David Spiegelhalter

Density of maths: 🧮

Useable code: 🤖

Practical advice: 🏦

Overall Rating: ⭐⭐⭐⭐

This is more of a popular-science book to read cover to cover than a textbook, so naturally there isn’t much maths, code or practical advice. I wanted a bit of a refresher/overview on stats (which I studied as an undergrad), so I picked this up at the bookshop and knocked it over in one weekend. It was quite a riveting read (for a book about stats). My only criticism is of the section on machine learning: I’m not sure that everything said about how ML algorithms work is entirely accurate. That being said, the author’s area of expertise is traditional statistics.

Artificial Intelligence: A Guide for Thinking Humans, by Melanie Mitchell

Density of maths: 🧮

Useable code: 🤖

Practical advice: 🏦🏦

Overall Rating: ⭐⭐⭐⭐

I’ve spent a lot of time looking at statistics, machine learning and AI from a technical practitioner’s point of view, but I don’t have the best understanding of the history of AI, or of how to explain it to someone not in the field. Plus, most of the other books in this post aren’t exactly light reading, so it’s nice to have a book on ML that I can casually unwind with before bed. It’s really interesting seeing the author explain concepts like activation functions, backpropagation and convolutional neural networks using everyday language!

The book gives a history of AI and some of the viewpoints on where it is heading, then dives into some of the main areas of AI: deep learning, computer vision, game-playing agents and NLP. It concludes with some thoughts on the meaning of consciousness, where AI is at right now (the limitations of current AI and how far we are from general AI) and where AI might head in the future.

A really nice helicopter view of AI.

Natural Language Processing Crash Course for Beginners: Theory and Applications of NLP using TensorFlow 2.0 and Keras, by M. Usman Malik

Density of maths: 🧮🧮

Useable code: 🤖🤖🤖

Practical advice: 🏦🏦🏦🏦

Overall Rating: ⭐⭐⭐⭐

I picked this up for less than $6, and what I got was a surprisingly well-structured and complete (if brief in some sections) guide to NLP. The author covers all the main areas of NLP, with plenty of code snippets and useful advice. My only criticism relates to the presentation: the English, whilst free of obvious spelling errors, is clunky and poorly phrased at times. It is still understandable, but phrases like “sentimental analysis” stick out like a sore thumb. The maths is not rendered (with LaTeX or otherwise); it is presented as pseudocode of sorts, which makes it hard to read. I guess that’s to be expected from an independent publisher. For less than $6 though, it’s pretty easy to recommend and I’m very happy with my purchase.

R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, by Hadley Wickham

Density of maths: 🧮

Useable code: 🤖🤖🤖

Practical advice: 🏦🏦🏦

Overall Rating: ⭐⭐⭐⭐

This is a quick practical guide to the Tidyverse (a collection of R packages for Data Science), from its creator himself! It’s jam-packed with code examples and quite a light read. It’s also freely available online.
