Table of Contents
- Introduction
- Is Data Science for you
- How to learn
- What to Learn and Where to Learn them from
- Level 1 Fundamentals: Python Programming, SQL, Version Control
- Level 2: Mathematics, Probability, and Statistics
- Level 3: Data Collection and Wrangling Superpowers - NumPy, Pandas, Working with Databases, Transformation
- Level 4: Exploratory Data Analysis, Visualization, Story Telling
- Level 5: Data Engineering
- Level 6: Machine Learning Deep Learning and Artificial Intelligence
- Final Section
- Python Beginner Course
Introduction
If I were to start from the beginning, I would do it differently. I am here to share how you should learn data science, drawing on my own experience of transitioning into Data Science from an Engineering background in college.
If you’d rather watch a video, watch the one below, like it, and subscribe for more content like this.

If you love reading, then this article is for you.
I’ll be taking you through a step-by-step approach to how I feel you should learn data science, and become a data scientist.
This article covers how you should learn, what to learn and where to learn it, and how everything fits into the bigger picture of landing a data science role.
Keep in mind that these opinions are my own, based on how I would do it if I were to learn data science from scratch again.
Alright! Let’s talk about learning data science.
Quite a number of people want to either start a new career as a Data Scientist or switch careers into Data Science. Chances are you are learning the wrong way, and the result is gaps in your knowledge that might hurt your future career.
Is Data Science for you
Before starting your data science career, there are some questions you’ll want to ask yourself so that you know whether a career in Data Science is actually what you need.
Ask yourself if you enjoy these things:
- Statistics and programming. This might come from prior knowledge and experience, or from a course in school that you love.
- Working in a tech role that will require constant upskilling and learning of the latest tools and technologies.
- Working in different data roles other than Data Scientist.
If your answer to the above questions is Yes, then you’re on track to learn data science.
How to learn
If you have ever played a game, you know that your first task is to complete some sort of beginner level, or stage 1, before you can play your way to a more advanced stage. It is the same for Data Science: you have to hone your skills with the fundamental concepts first.
To understand this, let’s take a look at the Pyramid of Data Science Needs, which is modeled on Maslow’s hierarchy of needs. We are going to use it to unlock how to learn data science and ultimately become a world-class data professional. The Pyramid helps you see how your job responsibilities align with your technical skills, and that is what I am going to explain in this article.
In the Pyramid of Data Science Needs, every layer depends on the ones beneath it, and you need a good understanding of the base layers to be successful as you move up the Pyramid.
The Hierarchy of needs can also show the different careers in data science, and the processes involved with each specialization.
You must know that this also depends on the company you work with. Some companies value specialization, where a data engineer is responsible for different tasks than the data scientist or the machine learning engineer.
Other companies take a generalist approach to their data needs, with perhaps two talented individuals on the team who own the whole project and wear different hats. The latter is the more common experience, especially if you find yourself working in startups.
The Pyramid of Data Science Needs will be our guide to determining the best way to learn data science, and it will put you on the path to becoming a full-stack data professional.
If you are just starting out, or coming from a different background, you can commit to a learning period of 6 to 12 months to pick up all the needed concepts. You can do it in less time, depending on your learning strategy, schedule, and motivation.
Let’s talk about what to learn and where to learn them from.
What to Learn and Where to Learn them from
Level 1 Fundamentals: Python Programming, SQL, Version Control
The first skill that you should pick up is the Python programming language. It will form the basis of your programming skills, and because Python is the most used programming language in data science, you are learning a valuable skill. It also comes in handy when you want to do other things aside from Data Science, because Python is a general-purpose language.
My advice here is to learn the language itself, not just take a "Python for Data Science" crash course that skips the foundations of the Python programming language.
You are expected to know concepts like data types, functions, control flow, data structures and algorithms, object-oriented programming, and how to work with external libraries.
Good Python programming skills will set you up nicely for your Data Science career, and you can use them to pivot into building software if that is what interests you later in your career.
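As a rough illustration of the fundamentals listed above, here is a minimal sketch that touches data types, control flow, functions, a small class, and a library import. The names `Student` and `top_student`, and the sample scores, are made up for this example.

```python
from dataclasses import dataclass, field
from statistics import mean  # working with a library (standard library here)


@dataclass
class Student:
    """A tiny class to illustrate object-oriented programming."""
    name: str                                   # str data type
    scores: list = field(default_factory=list)  # list data structure

    def add_score(self, score: float) -> None:
        # Control flow: reject values outside the valid range.
        if not 0 <= score <= 100:
            raise ValueError("score must be between 0 and 100")
        self.scores.append(score)

    def average(self) -> float:
        # Return 0.0 for a student with no scores yet.
        return mean(self.scores) if self.scores else 0.0


def top_student(students):
    """Return the student with the highest average score."""
    return max(students, key=lambda s: s.average())


ada = Student("Ada")
ben = Student("Ben")
for s in (90, 80):
    ada.add_score(s)
for s in (70, 75):
    ben.add_score(s)

best = top_student([ada, ben])
print(best.name, best.average())
```

If every line here makes sense to you, including why `field(default_factory=list)` is used instead of a plain `[]` default, you are in good shape for the next levels.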
The next skill to pick up is SQL (Structured Query Language), a language used to communicate with databases. Here you can pick up concepts like querying data with SQL statements, filtering data, joins, aggregations, subqueries, et cetera.
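To make those SQL concepts concrete, here is a small sketch that runs a filter, a join with an aggregation, and a subquery. It uses Python's built-in `sqlite3` module with an in-memory database so that it runs anywhere; the `customers` and `orders` tables and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada', 'NG'), (2, 'Ben', 'US'), (3, 'Chi', 'NG');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 30.0), (3, 2, 20.0), (4, 3, 100.0);
""")

# Filtering: customers from one country.
ng_customers = conn.execute(
    "SELECT name FROM customers WHERE country = ? ORDER BY name", ("NG",)
).fetchall()

# Join + aggregation: total spend per customer.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
""").fetchall()

# Subquery: orders larger than the average order amount.
big_orders = conn.execute("""
    SELECT id, amount FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY id
""").fetchall()

print(ng_customers, totals, big_orders)
```

The same statements carry over to other databases with minor dialect differences, so practicing on SQLite is a reasonable starting point.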
For this fundamental stage, the last skill to pick up is version control, which essentially means learning how to work with Git and GitHub. As a data scientist, you might find yourself working in a team on a large project alongside other analysts and engineers, which means you need to be able to save your changes and download changes from others. You do this with Git, and with GitHub, which hosts repositories. A repository is basically a central directory location used to store multiple versions of files.
📖 Learning Resources 📖
🏗 Project Resources 🏗
After learning, remember that the key is being able to use what you’ve learned. For the project, you can try extracting data from a website or application through its Application Programming Interface (API) and then saving the data to a CSV file or a database. If you do this, you will have put your Python, SQL, and Git skills to use.
You can also use platforms like LeetCode and HackerRank to practice your Python and SQL skills by solving daily challenges.
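The API project described above could be sketched roughly as follows. To keep the example runnable offline, a hard-coded JSON string (`SAMPLE_RESPONSE`) stands in for the body of an API response; in a real project that text would come from an HTTP request to the service you choose. The `posts` table and its fields are assumptions made for illustration.

```python
import csv
import json
import sqlite3
import tempfile
from pathlib import Path

# Stand-in for the body of an API response; the fields are invented.
SAMPLE_RESPONSE = (
    '[{"id": 1, "title": "First post", "likes": 10},'
    ' {"id": 2, "title": "Second post", "likes": 25}]'
)


def parse_records(payload):
    """Turn the JSON text into a list of dictionaries."""
    return json.loads(payload)


def save_to_csv(records, path):
    """Write the records to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "title", "likes"])
        writer.writeheader()
        writer.writerows(records)


def save_to_sqlite(records, conn):
    """Insert the records into a SQLite table, creating it if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(id INTEGER PRIMARY KEY, title TEXT, likes INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO posts VALUES (:id, :title, :likes)", records
    )
    conn.commit()


records = parse_records(SAMPLE_RESPONSE)

csv_path = Path(tempfile.mkdtemp()) / "posts.csv"
save_to_csv(records, csv_path)

conn = sqlite3.connect(":memory:")
save_to_sqlite(records, conn)
total_likes = conn.execute("SELECT SUM(likes) FROM posts").fetchone()[0]
print(csv_path, total_likes)
```

Once something like this works, committing it to a Git repository and pushing it to GitHub covers the version control part of the level.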
Let’s move to the next level.
Level 2: Mathematics, Probability, and Statistics
The skills that we are going to pick up here are mathematics, probability, statistics, and linear algebra.
You don’t have to be a guru in math; you just have to familiarize yourself with the major topics that will help during your journey: statistics, probability, and linear algebra. They will go a long way when you start delving into machine learning, which has mathematical foundations at its core. Statistics and probability form the bedrock of data science. Much of data science is about measuring the likelihood of events, and a deep knowledge of probability will help you crack problems of this kind. You might even be asked about it during interviews. As a warning, do not start machine learning if you don’t know the statistics and mathematical methods behind the common machine learning algorithms.
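As a small taste of what "measuring the likelihood of events" looks like in practice, the sketch below computes a probability exactly, estimates the same probability by simulation, and summarizes a sample with descriptive statistics. The dice problem and the sample values are made up for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Probability: chance that two fair dice sum to 7.
# Exact answer by counting outcomes: 6 favourable out of 36.
exact = sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == 7) / 36

# Estimate the same probability by simulation.
trials = 100_000
hits = sum(
    1 for _ in range(trials)
    if random.randint(1, 6) + random.randint(1, 6) == 7
)
estimate = hits / trials

# Descriptive statistics on a small, made-up sample.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sample_mean = statistics.mean(sample)
sample_stdev = statistics.pstdev(sample)  # population standard deviation

print(exact, estimate, sample_mean, sample_stdev)
```

The estimate approaches the exact value as the number of trials grows, which is the law of large numbers at work.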
📖 Learning Resources 📖
Let’s talk about the next level, which is getting your Data Collection and Wrangling superpowers.
Level 3: Data Collection and Wrangling Superpowers - NumPy, Pandas, Working with Databases, Transformation
It is important to know that data rarely comes in a format that you can easily work with. It is often collected from different sources, and you need to clean it before you can analyze it further.
Here, the skills that we are going to focus on are data analysis with the Numerical Python library, NumPy, and the Pandas library for aggregation and transformation.
NumPy is a numerical scientific computing package that serves as the foundation for almost all the other packages in the Python data science ecosystem. Pandas is built on top of NumPy. You can think of Pandas as a super version of Excel that lets you clean and analyze data much more effectively. As a data scientist, you’ll constantly find yourself using the Pandas library.
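Here is a minimal sketch of what that looks like: a vectorized NumPy calculation, followed by a Pandas cleanup and aggregation on a deliberately messy table. The column names and values are invented for illustration, and the example assumes `numpy` and `pandas` are installed.

```python
import numpy as np
import pandas as pd

# NumPy: vectorized arithmetic on whole arrays, no explicit loop.
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])
revenue = prices * quantities  # element-wise multiplication

# pandas: a small, messy table (made-up data).
raw = pd.DataFrame({
    "city": ["Lagos", "lagos ", "Abuja", "Abuja", None],
    "sales": ["100", "150", None, "200", "50"],
})

clean = (
    raw.dropna(subset=["city"])  # drop rows with no city
       .assign(
           # Normalize inconsistent text: strip spaces, fix capitalization.
           city=lambda d: d["city"].str.strip().str.title(),
           # Convert text to numbers and treat missing sales as 0.
           sales=lambda d: pd.to_numeric(d["sales"]).fillna(0),
       )
)

# Aggregation: total sales per city.
totals = clean.groupby("city")["sales"].sum()
print(revenue, totals, sep="\n")
```

Treating missing sales as 0 is a choice made for this sketch; in a real analysis you would decide per column whether to fill, drop, or flag missing values.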
📖 Learning Resources 📖
🏗 Project Resources 🏗
Data Analytics Projects
Level 4: Exploratory Data Analysis, Visualization, Story Telling
At this stage, your goal is to learn how to communicate the insights derived from your data exploration and wrangling activities. You will spend most of your time learning how to define business questions, analyze data to answer them, and communicate your findings through visualizations and dashboards.
Visualization libraries that you can use are Matplotlib, Seaborn, Plotly, and Bokeh.
Matplotlib is one of the earliest major data visualization libraries for Python, created to provide a plotting API for the language. Other data visualization libraries that you may want to adopt are Seaborn, Plotly, and Bokeh. Seaborn works well with Pandas and requires just a few lines of code to create beautiful plots. Plotly and Bokeh are mainly for creating interactive plots with Python. I suggest that you get your hands on these libraries and then decide which ones you want to stick with.
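To give a feel for the basic workflow, here is a minimal Matplotlib sketch that draws a labeled bar chart and saves it to a file. The sales figures are made up, the `Agg` backend is chosen only so that the script runs without a display, and the example assumes `matplotlib` is installed.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render to a file; no display window needed
import matplotlib.pyplot as plt

# Made-up monthly sales figures.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 90, 160]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(months, sales, color="steelblue")
ax.set_title("Monthly sales")  # a title that states what the chart shows
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")

out_path = Path(tempfile.mkdtemp()) / "monthly_sales.png"
fig.savefig(out_path, dpi=100)
plt.close(fig)
print(out_path)
```

The title and axis labels matter as much as the bars: a chart that explains itself is a large part of storytelling with data.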
Again, freeCodeCamp has great resources and projects that you can work on here.
You can also get data from Kaggle to work on some data visualization projects. You might also want to make it personal by extracting data from your LinkedIn page and trying to come up with some data visuals.
Level 5: Data Engineering
Looking at the Pyramid, you’ll notice that the base is where the collecting, moving, and storing of data happens. To do this, you need Data Engineering skills. Note that you can skip this part if you’d rather focus on statistical analysis. But if you are the curious type, adding Data Engineering skills to your arsenal will have long-term benefits. You can also choose to focus on the Data Engineering niche, which is quite interesting and lucrative. In that role, you will be responsible for Extract, Transform, and Load (ETL) processes and for writing automated scripts to move and store data. You will work mostly with the Python programming language and Structured Query Language, as you are expected to write a lot of queries.
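The Extract, Transform, and Load process mentioned above can be sketched in a few lines using only the standard library. This is a toy version under stated assumptions: the source is an in-memory CSV string (`RAW_CSV`) with made-up rows, and the destination is an in-memory SQLite table named `orders`. Real pipelines add scheduling, logging, and error handling on top of the same three steps.

```python
import csv
import io
import sqlite3

# Made-up source data with typical problems: stray spaces,
# inconsistent capitalization, and a missing amount.
RAW_CSV = """order_id,customer,amount
1, ada ,50.5
2,BEN,
3,chi,100
"""


def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, convert types, skip rows with no amount."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
        ))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping the three steps as separate functions makes each one easy to test and to swap out, for example replacing the CSV source with an API or another database.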
Learning resources for this field are:
- A great book with detailed concepts is Data Engineering with Python by Paul Crickard, published by Packt. A part of this YouTube channel will focus on teaching concepts in Data Engineering from this book, so subscribe and stay tuned.
Level 6: Machine Learning Deep Learning and Artificial Intelligence
For this video, this is the final level of learning. At this stage, you have the fundamental knowledge that you need to dive into Machine Learning: programming, math, and statistics. I’d say you are ready to start learning machine learning algorithms. You now have to get acquainted with the most popular machine learning and deep learning libraries that you’ll work with (TensorFlow, Keras, PyTorch, scikit-learn). You’ll also familiarize yourself with concepts such as classification, regression, dimensionality reduction, clustering, preprocessing, model selection, et cetera.
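To show how preprocessing, classification, and model selection fit together, here is a minimal scikit-learn sketch on the Iris dataset that ships with the library. The choice of logistic regression, the 75/25 split, and 5-fold cross-validation are assumptions made for illustration, not a recommendation for every problem; the example assumes `scikit-learn` is installed.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A small classification dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing + classifier in one pipeline, so scaling is learned
# only from the training data.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Model selection: estimate performance with 5-fold cross-validation.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Final fit and evaluation on the held-out test set.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"CV mean accuracy: {cv_scores.mean():.2f}, test accuracy: {test_accuracy:.2f}")
```

Wrapping the scaler and the classifier in a pipeline avoids leaking information from the validation folds into the preprocessing step, a common beginner mistake.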
The resources to help you at this level are:
- First is the highly recommended book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow.
- freeCodeCamp Machine Learning and Deep Learning courses on YouTube.
You want to make sure that you work on projects. Kaggle is a great repository of projects and competitions that you can work on and learn best practices from.
Final Section
So in this video, I have highlighted 6 levels of learning that I would have pursued if I were to start again. We talked about the need to make sure that your foundation in programming is solid and to follow project-based learning. I promised that I would have something for you if you stuck around to the end of the video. Well, here it is: I have put together a document that contains the whole roadmap, learning resources, and projects. It’s free, so check the description for a download link.
This is my take on how to learn data science. Let me know what you think in the comment section, subscribe for more programming videos, and like and share with your friends. My name is Blessing, and I will see you in the next one.
Python Beginner Course
Basics
Data Types