What Is a Classification Algorithm?
Classification algorithms can be used for speech recognition, handwriting recognition, or credit approval.
Machines learning to classify new data
Classification is a subcategory of supervised learning in which, put simply, the computer learns from labelled data inputs so that it can classify new data. Based on past observations, the algorithm predicts which class new information should fall into. Classification can be binary (a common example is spam detection, where an email is either spam or non-spam); multi-class (an animal can be a dog, a cat, or a cow, but only one of those things at a time); or multi-label ("a news article can be about sports, a person, and location at the same time," as an Edureka article explains).
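To make the three cases concrete, here is a minimal sketch of what the target labels might look like in each one. The toy labels and the helper `kind_of_task` are hypothetical, invented for illustration only; real libraries encode labels in their own ways.

```python
# Hypothetical toy labels illustrating the three kinds of classification target.

# Binary: each example gets exactly one of two classes.
binary_labels = ["spam", "non-spam", "spam"]

# Multi-class: each example gets exactly one of several classes.
multiclass_labels = ["dog", "cat", "cow"]

# Multi-label: each example may carry several classes at once (or none).
multilabel_labels = [
    {"sports", "person", "location"},
    {"sports"},
    set(),
]


def kind_of_task(labels):
    """Guess the task type from the shape of the labels (illustrative only)."""
    # A set of classes per example signals a multi-label problem.
    if all(isinstance(y, (set, frozenset)) for y in labels):
        return "multi-label"
    # Otherwise count the distinct classes that appear.
    n_classes = len(set(labels))
    return "binary" if n_classes <= 2 else "multi-class"
```

The distinction matters in practice because it determines what a prediction looks like: a single class out of two, a single class out of many, or a set of classes.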
Classification algorithms can be used for speech recognition, handwriting recognition, credit approval, document classification, as well as detecting pedestrians in a self-driving car, identifying cancer cells, or assessing the likelihood of an online customer buying a product.
The 7 main types of classification algorithms
There are many kinds of classification algorithms, including:
- Naive Bayes, a classification technique based on Bayes' Theorem, which assumes that the features are independent of one another within each class. This algorithm is simple and can yield good results even with large datasets;
- Logistic regression, a statistical method based on a logistic model, which estimates the probability that an input belongs to a given class;
- Nearest neighbour, a supervised algorithm that labels a new point by majority vote among the labels of its closest neighbouring points;
- Support vector machines, in which the training data is represented as "points in space separated into categories by a clear gap (...). New examples are mapped into that same space and predicted to belong to a category based on which side of the gap they fall," as this detailed article explains;
- Decision Trees, where the data is broken down into "smaller and smaller subsets" as it passes through decision nodes;
- Random Forests, an ensemble of trees that classifies objects according to the vote of these trees;
- Neural networks, which we have already covered in this article.
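As an illustration of one entry in the list, the nearest-neighbour idea can be sketched in a few lines of plain Python. This is a minimal sketch, not a production implementation: the function name `knn_predict`, the Euclidean distance, the choice of `k=3`, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and let them vote.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D toy data: two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, query=(1.1, 1.0), k=3)
print(prediction)
```

A query near the first cluster is labelled with that cluster's class because its three closest neighbours all carry the same label. Random forests rely on a similar voting step, except that the voters are decision trees rather than neighbouring points.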
How to choose between different classification algorithms
All of these algorithms have their own advantages and drawbacks. Deciding which one to use for a specific classification purpose will often depend on the nature of the data set and the desired outcome. To facilitate the choice, this article by Analytics in India even provides a decision tree.