10 things you need to know to become a data scientist

If you've been browsing through job postings recently, you've probably seen the sheer number of positions available for data scientists. Demand appears to be far greater than supply, which means there is a huge opportunity here. But there is a catch: most of these positions require some experience or knowledge in data science. So how can you build the skills to become a data scientist if you want to make the move mid-career?

Well, today I will try to answer this question.

What is data science?

Before we dive into how to become a data scientist, let's first briefly review what exactly data science is.

We have all heard of the so-called data explosion. More and more data is being collected via the web, mobile apps, fitness devices, and more. Collectively, this is called Big Data. However, big data is not only about the amount of data; it also means data that arrives fast and comes in many different forms.

Data science is the set of skills and techniques needed to make sense of all this data. These include advanced analytics, data mining, machine learning, data visualization, and statistics. In short, it is the ability to derive insights from raw data to solve real-world problems.

According to the Gartner report Critical Capabilities for Operational Database Management Systems (2015):

By 2017, all leading operational DBMSs will offer multiple data models, relational and NoSQL, in a single DBMS platform.

We can already see this in SQL Server 2016 which includes:

R Services

R Services lets data scientists and analysts run statistical R code directly in the database. It uses multiple cores, processors and threads to support very fast computations.

PolyBase

PolyBase acts as a gateway between SQL Server and Hadoop or Azure Blob storage, so you can use Transact-SQL to query non-relational data the same way you query relational data in a database.

Power BI

Power BI is tightly integrated with SQL Server, making it easy to analyze and share data insights and create rich visualizations.

Cortana Intelligence Suite on Azure

The Cortana Intelligence Suite combines big data with advanced analytics to get actionable intelligence from your data. You can create models with Azure Machine Learning and analyze data in Azure Data Lake or SQL Data Warehouse using Azure Data Lake Analytics or Azure Stream Analytics. These are just a few of the powerful tools it offers alongside Cortana.

With this in mind, SQL Server professionals already have access to the tools they need to become data scientists.

Azure Machine Learning Studio is a good place to start. You can try it for free by going to the Studio site and clicking the Start button.

Numerous useful resources are available to help you get started, including interactive tutorials.

What you need to know to become a data scientist

1. You need to understand the data: how to navigate it and how to apply statistical and analytical techniques to it.

2. You should be able to use Transact-SQL to query data sets and manipulate them into the format you need.

3. You should be able to present data in a meaningful way using tools like Excel or Power BI.

4. You need to understand statistics and their role in gaining insight from your data.

5. You need to know how to use a statistical programming language like R or Python.

6. You should be able to transform your data, clean it up, and do some statistical analysis.

7. You should understand data science concepts such as machine learning, algorithms, conditional probability, etc.

8. You need to know how to create and evaluate machine learning models.

9. You should be able to use machine learning to generate predictions and solve problems.

10. You should learn how to use tools like Microsoft Azure HDInsight, Scala, Spark, etc.
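
To make points 6 to 9 a little more concrete, here is a minimal Python sketch that cleans a tiny data set, computes a couple of summary statistics and a conditional probability, then fits and evaluates a very simple model. Everything in it is an assumption made for illustration: the data (hours studied versus exam score), the function names and the thresholds are invented, and a real project would more likely rely on libraries such as pandas or scikit-learn rather than hand-written helpers.

```python
from statistics import mean, median

# Hypothetical raw records: (hours_studied, exam_score). None marks a missing value.
raw = [
    (1.0, 52.0), (2.0, 55.0), (None, 61.0), (3.0, 61.0), (4.0, 68.0),
    (5.0, None), (6.0, 74.0), (7.0, 79.0), (8.0, 85.0),
]


def clean(rows):
    """Point 6: drop every row that contains a missing value."""
    return [row for row in rows if None not in row]


def conditional_probability(rows, event, given):
    """Point 7: estimate P(event | given) as a simple relative frequency."""
    matching = [row for row in rows if given(row)]
    if not matching:
        raise ValueError("no rows satisfy the condition")
    return sum(1 for row in matching if event(row)) / len(matching)


def fit_line(points):
    """Point 8: least-squares fit of y = intercept + slope * x."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def mean_absolute_error(model, points):
    """Point 8: average absolute gap between predictions and actual values."""
    intercept, slope = model
    return mean(abs(y - (intercept + slope * x)) for x, y in points)


cleaned = clean(raw)
median_score = median(score for _, score in cleaned)
mean_score = mean(score for _, score in cleaned)

# P(score >= 75 | hours >= 5), estimated from the cleaned rows.
p_high_given_study = conditional_probability(
    cleaned,
    event=lambda row: row[1] >= 75,
    given=lambda row: row[0] >= 5,
)

# Hold out every second row so the model is evaluated on data it has not seen.
train, test = cleaned[::2], cleaned[1::2]
intercept, slope = fit_line(train)
mae = mean_absolute_error((intercept, slope), test)

# Point 9: use the fitted model to predict the score for 5 hours of study.
predicted_score = intercept + slope * 5.0

print(f"rows kept: {len(cleaned)}, mean: {mean_score:.1f}, median: {median_score}")
print(f"P(score >= 75 | hours >= 5) = {p_high_given_study:.2f}")
print(f"held-out MAE: {mae:.2f}, prediction for 5 hours: {predicted_score:.1f}")
```

The point of holding some rows back is that a model should be judged on data it was not fitted to; with only a handful of rows the numbers here mean very little, but the workflow has the same shape as on a real data set.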

I know this is quite a lot to take in. But it can be achieved with a little effort and dedication. And luckily, there are now several resources available to help you in your quest to become a data scientist.

So, how can I prove to prospective employers that I am now a data scientist?

Recognizing the extreme shortage of data scientists, Microsoft has embarked on a mission to promote data science education for those who want to embrace this new and exciting career opportunity.

So they launched their first Microsoft Professional Degree, in Data Science, on 22nd August 2016.

This program was designed in collaboration with employers and top universities such as Columbia and Harvard.

The degree program consists of 4 units.


In the first unit, you will learn the basics such as querying and visualizing data. There are 3 compulsory courses in this unit and 1 optional course where you can choose to use Excel or Power BI.

Core data science

In this unit, you will learn how to use the statistical programming language. You can choose between Python or R.

Applied data science

In this unit, you will learn advanced techniques for extracting meaningful insights from data using Python or R.

Cortana Intelligence Competition

Finally, you will complete a real-world project that is scored and graded to prove your newly acquired skills and, ultimately, to earn you the data science degree.


Microsoft estimates that there are about 1.5 million jobs available for data scientists. Looking at the list of skills you need to become a data scientist can be daunting. Fortunately, various universities and companies have recognized the skills shortage and have launched programs to fill this gap.

Microsoft offers its own degree programs developed by industry experts and academics, which will open the door to many aspiring data scientists.