Introduction to machine learning

Computers that can figure out what to do without being told have long been awaited.

Picture a car that drives itself, identifying pedestrians and potholes and reacting quickly and efficiently to changes in its environment to get its passengers safely to their destinations: this is machine learning.

How It Works

Let's start with analyzing business data.

ML is a kind of AI that can understand and learn from large amounts of data. Take Twitter as an example. According to Internet Live Stats, Twitter users send about 500 million tweets every day, which is equivalent to about 200 billion tweets per year. It is humanly impossible to analyze, classify, sort, learn from, and predict anything with that number of tweets.

Businesses need to put in a significant amount of work before machine learning yields valuable information. To get the most out of ML, you need clean data and a clear idea of the questions you want answered. You can then choose the model and algorithm that best suit your business. ML is not a simple or easy process, and success requires constant effort.

ML has a life cycle.

·         Understanding. Know the reasons for turning to ML and what you want to learn from it.

·         Data collection and cleanup. Make sure you have the required amount of data and that it is clean enough to provide insight.

·         Feature selection. This involves determining which data needs to be provided to ML to build an ML model. Depending on the type of algorithm, there are different methods available for selecting features. For example, let's say you are using a decision tree algorithm. In this case, the analyst or modeling tool can apply an "interestingness score" to columns from the database to determine whether that data should be used to build the model.

·         Model selection. A model is a file that has been trained to process data and find specific things in it. The model is given an algorithm to work with, and combining the two on test data lets you draw conclusions.

·         Training and tuning. Train and adjust the model, then determine whether its conclusions answer the question being asked.

·         Evaluation. Evaluate your models and algorithms to see if they are ready for use, or if you need to go back a few steps and refine your models, features, algorithms, or data to achieve your goals.

·         Deployment. Deploy the trained model to production.

·         Review. Review the output of the existing model in production.


What is machine learning used for? machine learning applications

Machine learning is a way for businesses to understand data and learn from it. Many subfields are available to businesses, and the right one depends on the use case, whether that is increasing revenue, providing search capabilities, integrating voice commands into products, or building self-driving cars.

machine learning subfield

ML is used a lot in business today, and its use will grow even more. Subfields of ML include social media and product recommendations, image recognition, medical diagnosis, language translation, speech recognition, data mining, and more.

Social media platforms like Facebook, Instagram, or LinkedIn use ML to suggest pages to follow or groups to join, based on posts you like. They take historical data about what other people liked, or about posts similar to the ones you liked, and use it to make suggestions or add items to your feed.

Ecommerce sites can also use ML to recommend products based on previous purchases, your searches, and the behavior of other users similar to you.

An important use of ML today is image recognition. Social media platforms use ML to suggest tagging people in photos. Police can use it to find suspects in photos or videos. The plethora of cameras installed at airports, in shops, and on doorbells can help investigators figure out who committed a crime or where the offender went.

Medical diagnosis also makes good use of ML. After an event like a heart attack, you can go back and see warning signs that were overlooked. Systems used by hospitals can learn the connections between inputs (behaviors, test results, or symptoms) and outputs (such as a heart attack) from past medical records. A doctor can then enter new notes and test results into the system, and the machine may detect the warning signs of a heart attack more reliably than a human would.

Language translation of web pages or apps for mobile platforms is another example of ML. Some apps perform more advanced tasks than others depending on the ML models, techniques, and algorithms they use.

An everyday use of ML today is fraud detection for banks and credit cards. ML can quickly detect fraud that would take humans a long time to discover. A large body of transactions that have been reviewed and labeled (fraudulent or not) lets ML learn how to identify fraud in a single future transaction. A good ML technique for this is data mining.

data mining

Data mining is a type of ML that analyzes data to make predictions or discover patterns within big data. The term is a bit misleading because it doesn't involve anyone, whether a malicious actor or an employee, digging through your data. Instead, the process involves discovering patterns in the data that will be useful for making decisions in the future.

Consider, for example, a credit card company. Your bank may alert you to suspicious activity on your card. How can a bank detect such activity so quickly and send an almost instantaneous alert? What makes this fraud prevention possible is continuous data mining. As of early 2020, there were over 1.1 billion cards issued in the United States alone. The number of transactions on those cards generates a wealth of data for mining, pattern detection, and learning to identify future suspicious transactions.

deep learning

Deep learning is a specific type of ML based on neural networks. Neural networks mimic how neurons in the human brain work together to make a decision or understand something. For example, a 6-year-old can tell her mother apart from the crossing guard by looking at a face, because she quickly analyzes many details such as hair color, facial features, scars, and more, all in the blink of an eye. Machine learning replicates this in the form of deep learning.

A neural network consists of 3-5 layers: an input layer, 1-3 hidden layers, and an output layer. The hidden layers make decisions one by one, working toward the output layer, or conclusion. What hair color? What eye color? Are there any scars? When a network grows to many layers, sometimes hundreds, it is called deep learning.
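The layered structure described above can be sketched in a few lines of Python. This is only an illustration, not a trained network: the weights in `NETWORK` are hand-picked and hypothetical, the sigmoid activation is an assumption, and the three input values stand in for whatever features (hair color, eye color, scars) a real system would encode.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def forward(inputs, network):
    """Pass the inputs through each layer in turn, toward the output layer."""
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

# Hypothetical, hand-picked weights: 3 inputs -> 2 hidden neurons -> 1 output.
NETWORK = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                             # output layer
]

output = forward([1.0, 0.0, 1.0], NETWORK)
print(output)  # a single score between 0 and 1
```

In a real network the weights are not chosen by hand. They are learned from data during training, and a deep network simply stacks many more hidden layers into the `NETWORK` list.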

Types of Machine Learning

There are basically four types of machine learning algorithms: supervised, semi-supervised, unsupervised, and reinforcement. ML experts estimate that about 70% of ML algorithms in use today are supervised. Supervised algorithms work with known or labeled datasets, such as pictures of dogs and cats. Because both types of animal are known, administrators can label the photos before providing them to the algorithm.

Unsupervised ML algorithms learn from unknown datasets. Take TikTok videos as an example. There are so many videos on so many topics that it is impossible to train an algorithm on them in a supervised way. The data is not yet labeled.

A semi-supervised ML algorithm is initially trained on a small, known and labeled dataset. It is then applied to a larger, unlabeled data set to continue training.

Reinforcement ML algorithms are not trained up front. They learn by trial and error on the go. Think of a robot that learns to navigate a pile of rocks. Each time it falls, it learns what doesn't work and changes its behavior until it succeeds. Think also of dog training and the use of treats to teach different commands. With positive reinforcement, the dog continues to carry out commands and changes behaviors that do not earn a favorable response.
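Trial-and-error learning can be illustrated with a very small sketch. The example below uses an epsilon-greedy strategy on a "multi-armed bandit" problem, which is one simple form of reinforcement learning chosen here for brevity and not something the text above prescribes. The reward probabilities, step count, and function name `train_bandit` are all hypothetical.

```python
import random

def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn which action pays off best purely from trial and error."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    values = [0.0] * len(reward_probs)  # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action that has worked best so far.
            action = max(range(len(values)), key=values.__getitem__)
        # The "treat": reward 1 with the action's hidden probability, else 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

# Hypothetical environment: action 2 is rewarded most often.
values, counts = train_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=values.__getitem__)
print(best_action, values)
```

The learner is never told which action is best. It only sees rewards, and its estimates in `values` drift toward the true payoff rates as it tries things.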

Supervised and unsupervised machine learning

supervised machine learning

Supervised ML algorithms find patterns using known, established, and classified data sets. To expand on the earlier idea of photos of dogs and cats: you can have huge data sets with thousands of different animals in millions of photos. Because the animal types are known, the photos can be grouped, labeled, and passed to a supervised ML algorithm so it can learn to tell them apart.

The supervised algorithm then compares input to output, that is, each picture to its animal-type label. It will eventually learn to recognize certain kinds of animals in new pictures it encounters.
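As a minimal sketch of learning from labeled examples, the code below uses a 1-nearest-neighbor classifier. This is just one simple supervised technique picked for illustration. Real image recognition works on pixels, not two numbers, so the features (weight and ear length) and all the values in `TRAINING` are hypothetical stand-ins.

```python
import math

# Hypothetical labeled examples: (weight_kg, ear_length_cm) -> animal type.
TRAINING = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
    ((12.0, 10.0), "dog"),
]

def predict(features, training=TRAINING):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    _, label = min(training, key=lambda pair: math.dist(pair[0], features))
    return label

print(predict((4.2, 6.4)))    # a new, unlabeled animal
print(predict((25.0, 12.0)))  # another one
```

The labels supplied up front are what make this supervised: the algorithm compares each new input against inputs whose correct output is already known.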

unsupervised machine learning

Spam filters today are an example of unsupervised ML at work. Initially, administrators could program a spam filter to recognize spam by searching for specific words in emails. That is no longer practical, and unsupervised ML handles the job well. An unsupervised ML algorithm is fed unlabeled emails and looks for patterns in them. Once it finds these patterns, it learns what spam looks like and identifies it in production.

machine learning techniques

ML techniques solve problems. Choose a specific ML technique based on the problem you are facing. Here are six common ones.

regression technique

Regression can be used to predict home market prices in Minnesota for December or to determine the optimal selling price for household goods. The idea behind regression is that even though a price fluctuates, it tends to return to an average. You can plot the price over time on a graph and fit a line through the points that captures the average trend. If that trend line keeps rising, you can extend it to predict future prices.
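Fitting a trend line and extending it can be sketched with ordinary least squares, the standard way to fit a straight line to points. The monthly prices below are made up for illustration, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical home prices (in thousands of dollars) for months 1 to 6.
months = [1, 2, 3, 4, 5, 6]
prices = [200, 204, 203, 209, 212, 214]

slope, intercept = fit_line(months, prices)
forecast = slope * 7 + intercept  # extend the trend line to month 7
print(round(slope, 2), round(intercept, 2), round(forecast, 2))
```

The individual prices bounce around, but the fitted line captures the average trend, and the forecast is simply that line read off at a future month.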


Classification

Classification predicts which group something belongs to. For example, you can predict which customers will be good customers (who always come back and spend a lot of money) and which customers are about to start shopping elsewhere. If you can look back over time and find predictors for each customer group, you can apply them to your current customers and predict which group they will fall into. This allows you to market more effectively and perhaps turn customers who might leave into excellent returning customers.


Clustering

Unlike classification techniques, clustering is unsupervised ML. In clustering, the system finds a way to group data that nobody has told it how to group.

Google uses clustering for generalization, data compression, and privacy preservation in products such as YouTube videos, Play apps, and music tracks.
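A small sketch can show how a system groups data without being told the groups. The code below uses k-means, one common clustering algorithm chosen here for illustration, on one-dimensional data. The watch times and the starting centroids are hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group numbers around centroids, then move each centroid to its group's mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assign each value to the nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical minutes watched per user; no labels are provided.
watch_minutes = [2, 3, 4, 40, 42, 45, 120, 125, 130]
centroids, clusters = kmeans_1d(watch_minutes, centroids=[0.0, 50.0, 100.0])
print(clusters)
```

Nothing in the input says "light", "regular", or "heavy" viewer. The groups emerge from the data alone, which is what makes the technique unsupervised.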

Anomaly detection

Anomaly detection is used to find outliers, like finding the black sheep in a herd. Given the vast amount of data, these anomalies are impossible for humans to detect. But if, for example, a data scientist provides the system with medical billing data from many hospitals, anomaly detection finds a way to group the claims. You may then discover a set of outliers that turn out to be where fraud is taking place.
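One simple way to flag outliers is to measure how many standard deviations each value sits from the mean (a z-score). This is only one of many anomaly detection methods and is used here as a sketch. The claim amounts and the threshold of 2.0 are hypothetical.

```python
from statistics import mean, stdev

def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical billing claims in dollars; one looks very different.
claims = [120, 135, 110, 128, 140, 115, 132, 125, 118, 950]
outliers = find_outliers(claims)
print(outliers)
```

An outlier is not proof of fraud. It is a pointer to where a human investigator should look first.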

Shopping Cart Analysis

Shopping cart analysis, also known as market basket analysis, makes it possible to predict what a customer will buy next. A simple example: if a customer puts ground beef, tomatoes, and tacos in a basket, you can predict that they will add cheese and sour cream. These predictions can help online retailers make useful suggestions to shoppers about items they might have forgotten, or help stores group products together, generating additional sales.

Two MIT professors used this approach to discover "harbingers of failure." It turns out that some customers consistently like products that go on to fail. Once a company identifies these customers, it can decide whether to keep selling a product and what marketing to apply to increase sales to the right customers.
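The taco example can be made concrete by measuring the confidence of a rule: of the baskets that contain the first set of items, what fraction also contain the second? The sketch below assumes that measure. The baskets are invented for illustration, and `confidence` is a hypothetical helper name.

```python
def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    antecedent = set(antecedent)
    consequent = set(consequent)
    with_antecedent = [b for b in baskets if antecedent <= b]
    if not with_antecedent:
        return 0.0
    with_both = [b for b in with_antecedent if consequent <= b]
    return len(with_both) / len(with_antecedent)

# Hypothetical past shopping baskets.
BASKETS = [
    {"ground beef", "tomatoes", "tacos", "cheese", "sour cream"},
    {"ground beef", "tomatoes", "tacos", "cheese"},
    {"ground beef", "tacos", "cheese"},
    {"tomatoes", "bread"},
    {"ground beef", "tomatoes", "tacos", "sour cream"},
]

taco_to_cheese = confidence(BASKETS, {"ground beef", "tomatoes", "tacos"}, {"cheese"})
print(round(taco_to_cheese, 2))
```

A retailer might suggest cheese whenever the confidence of the rule is above some chosen cutoff. Where that cutoff sits is a business decision, not something the data dictates.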

time series data

A typical example of time series data is what a fitness monitor on the wrist collects. It can record heartbeats per minute and how many steps we take per minute or hour, and some even measure oxygen saturation over time. With this data, you can predict when someone will run in the future. In the same way, you can collect time-based data on vibration levels, dB noise levels, and pressure from machines and use it to predict failures.
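As a rough sketch of using time-based readings to anticipate machine failure, the code below smooths a vibration series with a moving average and reports when the smoothed value first crosses a limit. Production systems use far richer models. The readings, the window size, and the limit are all hypothetical.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive readings."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def first_alert(series, window, limit):
    """Index in `series` where the moving average first exceeds `limit`, else None."""
    for offset, avg in enumerate(moving_average(series, window)):
        if avg > limit:
            return offset + window - 1
    return None

# Hypothetical vibration level per hour for one machine.
vibration = [1.0, 1.1, 0.9, 1.0, 1.2, 1.6, 2.1, 2.8, 3.5]
alert_at = first_alert(vibration, window=3, limit=2.0)
print(alert_at)
```

Smoothing matters because a single noisy reading should not trigger an alert, while a sustained upward drift should.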

machine learning algorithms

If ML needs to learn from data, how do you design algorithms to learn and find statistically significant data? ML algorithms support the process of supervised, unsupervised, or reinforcement ML.

Let's look at some of the most common specific algorithms. Here are the top 5 currently in use.

·         A linear regression algorithm establishes the relationship between independent and dependent variables by plotting them on a graph and fitting a straight line for the mean or trend. Merriam-Webster defines regression as a function that produces the mean value of a random variable, given that one or more independent variables have specified values. This definition also applies to logistic regression.

·         Logistic (aka logit) regression, like linear regression, fits variables on a graph, but the fitted line is not straight. The line is a sigmoid function.

·         Decision trees are very commonly used algorithms within supervised ML. They are used to classify data based on categorical and continuous variables.

·         A support vector machine draws a hyperplane based on the data points of each class that lie closest to the boundary. The hyperplane leaves a margin around the classes to separate the data. It classifies data in N-dimensional space, where N is the number of different features the data has.

·         Naive Bayes calculates the probability of a particular outcome. It is often very effective and can outperform more sophisticated classification models. A naive Bayes classifier assumes that the presence of a given feature is unrelated to the presence of any other feature.
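To make the last algorithm in the list concrete, here is a minimal naive Bayes sketch. It treats each word's presence as independent of every other word's, which is the "naive" assumption described above, and multiplies the per-word probabilities together. The example messages, the "spam"/"ham" labels, and the add-one (Laplace) smoothing are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (set_of_words, label). Returns the counts needed to predict."""
    label_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in examples:
        word_counts[label].update(words)  # each word counted once per message
        vocabulary |= words
    return label_counts, word_counts, vocabulary

def predict(words, model):
    """Pick the label with the highest probability under the independence assumption."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = count / total  # prior probability of the label
        for word in vocabulary:
            # Smoothed estimate of P(word present | label).
            p = (word_counts[label][word] + 1) / (count + 2)
            score *= p if word in words else (1 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labeled messages.
EXAMPLES = [
    ({"win", "prize", "now"}, "spam"),
    ({"free", "prize"}, "spam"),
    ({"meeting", "tomorrow"}, "ham"),
    ({"lunch", "tomorrow", "now"}, "ham"),
]

model = train(EXAMPLES)
print(predict({"free", "prize", "now"}, model))
print(predict({"meeting", "tomorrow"}, model))
```

Multiplying many small probabilities can underflow on large vocabularies, so practical implementations usually add logarithms instead. The sketch keeps plain multiplication for readability.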

machine learning model  

Combining an ML type (supervised, unsupervised, and so on) with techniques and algorithms produces a trained file: the model. This file can then be fed new data to recognize patterns and make predictions or decisions as needed for the business, a manager, or a customer.
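The idea of a model as a file can be sketched as follows: train something, write what was learned to disk, then load it later and feed it new data. The "training" here is deliberately trivial (a single threshold), the JSON format is an arbitrary choice, and the transaction amounts and function names are hypothetical.

```python
import json
import os
import tempfile

def train_threshold(amounts, labels):
    """Toy training: put a threshold halfway between the two class averages."""
    normal_amounts = [a for a, is_fraud in zip(amounts, labels) if not is_fraud]
    fraud_amounts = [a for a, is_fraud in zip(amounts, labels) if is_fraud]
    normal_mean = sum(normal_amounts) / len(normal_amounts)
    fraud_mean = sum(fraud_amounts) / len(fraud_amounts)
    return {"threshold": (normal_mean + fraud_mean) / 2}

def predict(model, amount):
    """Flag a transaction as suspicious if it exceeds the learned threshold."""
    return amount > model["threshold"]

# Hypothetical labeled transactions: amount and whether it was fraud.
model = train_threshold([20, 35, 50, 900, 1200], [False, False, False, True, True])

# Save the trained model to a file ...
path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(path, "w") as f:
    json.dump(model, f)

# ... and later load it and feed it new data.
with open(path) as f:
    loaded = json.load(f)

print(predict(loaded, 700), predict(loaded, 40))
```

Real model files hold far more than one number, but the pattern is the same: training produces parameters, the parameters are stored, and production code loads them to make predictions.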

Best language for machine learning

A machine learning language is a way to write instructions for a system to learn. Each language has a community of users who can offer support, whether you want to learn from others or guide them. Each language also comes with libraries for machine learning.

Here are the top 10 according to the 2019 GitHub Top 10 survey.

·         Python

·         C++

·         JavaScript

·         Java

·         C#

·         Julia

·         Shell

·         R

·         TypeScript

·         Scala - the language used to interact with big data

Python machine learning

Python is the most common ML language, so I'll go into more detail here.

Python is an open source, object-oriented language named after Monty Python. It is interpreted: source code is converted to bytecode, which is then executed in the Python virtual machine.

There are a number of features that make Python a favorite for ML.

·         A powerful set of packages is available, including packages useful for ML such as NumPy, SciPy, and pandas.

·         Prototypes can be created quickly and easily.

·         There are a variety of tools that enable collaboration.

·         As data scientists move from extracting data to modeling to updating ML solutions, Python can remain the language of choice. Data scientists don't have to change languages as they move through the life cycle.