
In contrast to 2015, when the Data Science hype was rolling across Germany, more and more universities now offer specializations or entire degrees in Data Science and closely related disciplines. Still, many aspiring Data Scientists hold degrees in other fields, e.g., computer science, physics, mathematics or economics. To be honest, I highly appreciate the diverse backgrounds of the Data Scientists I have collaborated with.

Coming from a non-DS field, you'll need to do some additional homework in order to keep up with ML natives. But in the age of MOOCs (massive open online courses), there are plenty of offerings to choose from. I'll try to provide an overview and give some guidance.

This guide does not claim to be complete in any way, but hopes to be informative. If you feel like a relevant course is missing, feel free to contact me.

Introductory talks

If you need an overview of what AI and Data Science are and what they can be used for, you can start in this section. Each talk runs for less than an hour, so you should get lots of insights in a minimal amount of time. Especially if you are a business person trying to understand some aspects and implications, these might be a good choice.

Yufeng Guo (Google AI Adventures) - What is Machine Learning?

In a few brief videos, Yufeng gives a nice introduction to Machine Learning, explaining the basic concepts and the ML process.

What is Machine Learning?

The seven steps of Machine Learning

Kevin Kelly - How AI can bring on a second industrial revolution

"Only by embracing AI we can steer it". Kevin Kelly talks about cognification and the many facets of intelligence. He then describes how AI will drive a second industrial revolution.

A very inspirational talk that helps you understand the big picture.

Nick Bostrom - What happens when our computers get smarter than we are?

Swedish philosopher Nick Bostrom became famous with his book "Superintelligence: Paths, Dangers, Strategies". Here, he shares some of his thoughts on the hypothetical problems that come with the rise of superintelligent systems. He explains the associated risks and elaborates on ways to mitigate them from the beginning.

A must-view if you are interested in AI and Machine Learning. 

Kenneth Cukier - Big data is better data

Old but gold: in this 2014 talk, Kenneth Cukier, Data Editor at The Economist, gives a quick insight into how Big Data and Machine Learning are going to change our lives and into the big responsibilities that come with Big Data.

Still very true, and definitely worth watching.

Grady Booch - Don't fear superintelligent AI

IBM's Grady Booch takes a clear stance on superintelligent AI. He disagrees with great minds like Nick Bostrom, Stephen Hawking and Elon Musk. His way of presenting is very entertaining, but his thoughts are deep. He includes allusions to many movies like 2001, The Terminator and The Matrix.

Even though I do not agree with everything, it provides nice food for thought.


Full MOOCs

If you really want to make progress, you need to invest more time and learn more than you can by listening to a few introductions. I've collected some MOOCs that run over multiple weeks. Some of them are free, as long as you don't need a certificate. Others cost some bucks, but this investment is definitely worth it, as long as you really complete the course.

Deep Learning specialization

by Andrew Ng and 2 more instructors

https://www.coursera.org/specializations/deep-learning

This MOOC is probably the most famous one. It is taught by Andrew Ng, founder of deeplearning.ai and co-founder of Coursera. The specialization consists of 5 courses, ranging from a general introduction to the concepts of deep neural networks to Convolutional Neural Networks (CNNs, typically applied to images) and Sequence Models (the best choice for language and speech), including the implementation in TensorFlow.

Machine Learning Engineering for Production (MLOps) specialization

by Andrew Ng and 3 more instructors

https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops

Also taught by Andrew Ng, this specialization focuses on machine learning engineering for production (MLOps), i.e. on bringing ML models into production and operating them there.

 

Time Series Forecasting

by Toni Moses 

https://www.udacity.com/course/time-series-forecasting--ud980

Forecasting is a highly relevant topic for many companies, as it can be the foundation for a reliable and efficient planning process. Yet, handling time series data comes with some peculiarities. In this course you will learn the basics of handling such data and how to apply typical forecasting models to them, such as the Holt-Winters seasonal method or ARIMA models.

If you want to take the next step after this, you should check out Prophet, NeuralProphet or DPDHL's fcstlib (becoming open source soon).
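To give you a feeling for what a seasonal forecasting model like Holt-Winters does, here is a minimal sketch of the additive variant in plain Python. It is not taken from the course: the function name `holt_winters_additive`, the default smoothing parameters, the simple initialization and the toy data are assumptions made for illustration only. For real work you would rather rely on a tested library implementation.

```python
def holt_winters_additive(series, season_length, horizon,
                          alpha=0.3, beta=0.1, gamma=0.2):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    Illustrative sketch only: parameter defaults and the initialization
    are simple assumptions, not tuned or recommended values.
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Crude initialization from the first two seasons.
    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m                       # average of season one
    trend = (sum(second) - sum(first)) / (m * m)  # average change per step
    seasonal = [y - level for y in first]        # offset of each position

    # Update level, trend and seasonal offsets one observation at a time.
    for t in range(m, len(series)):
        y = series[t]
        pos = t % m                # position within the season
        prev_level = level
        level = alpha * (y - seasonal[pos]) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[pos] = gamma * (y - level) + (1 - gamma) * seasonal[pos]

    # Extrapolate: last level, plus h steps of trend, plus the matching
    # seasonal offset for the position of the forecasted time step.
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


if __name__ == "__main__":
    # Made-up quarterly demand with a yearly pattern (season length 4).
    demand = [12, 20, 31, 22, 14, 23, 33, 25, 17, 25, 36, 27]
    print(holt_winters_additive(demand, season_length=4, horizon=4))
```

The three smoothing parameters control how quickly the level, the trend and the seasonal pattern adapt to new observations. Values close to 1 react quickly, values close to 0 change slowly.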

 

 


Getting your hands dirty

"For the things we have to learn before we can do them, we learn by doing them."

Aristotle

You can find a lot of teaching material and explanations around Data Science, ML and AI on the web, but what you really need is practical experience. Getting in touch with real-life data and challenges will be the decisive step in your learning journey. So check out these resources and get started.

Google Colab

Based on Google Cloud Platform infrastructure, Google provides aspiring data scientists with a wonderful environment for doing Data Science. In notebooks derived from Project Jupyter, you can start coding right away, without needing big computational resources of your own or installing all the software with its complex dependencies.

Go to colab.research.google.com →

Kaggle

Kaggle is an online platform for hosting data science competitions. Many companies provide challenging machine learning problems along with data to have them solved by the community of Data Scientists. It is a fantastic opportunity to apply what you've learned.

Kaggle uses its own flavor of Jupyter Notebooks to provide an easy-to-use environment for working on the challenges. As these notebooks can be shared, you can get a glimpse of what other Data Scientists are doing and give your own experiments a warm start.

Kaggle also provides a variety of "Beginners" datasets that can be used to learn the basics.

Go to www.kaggle.com →


Proposed Learning Journeys

So what do you make of this? Where should you start? Depending on your background, you should develop your own plan for your learning journey.

 

As an algorithmic expert

If you're coming from a background like Machine Learning, Statistics or Mathematics, you probably feel comfortable explaining L2 regularization or the difference between level-wise and leaf-wise tree growth for boosted decision trees. But still, being a data scientist requires much more.

Coursera specialization in Deep Learning  

TBD

As a Business Expert / Subject Matter Expert

TBD

As a manager

If you are a manager and would like to have all those buzzwords disentangled, you won't need (nor have) an awful lot of time.

TBD