Ada Boost

Description

AdaBoost (Adaptive Boosting) is a powerful classifier that works well on both basic and more complex recognition problems. AdaBoost works by creating a highly accurate classifier by combining many relatively weak and inaccurate classifiers. AdaBoost therefore acts as a meta algorithm, which allows you to use it as a wrapper for other classifiers. In the GRT, these classifiers are called Weak Classifiers such as a DecisionStump (which is just one node of a DecisionTree). AdaBoost is adaptive in the sense that subsequent classifiers added at each round of boosting are tweaked in favor of those instances misclassified by previous classifiers. The default number of boosting rounds for AdaBoost is 20, however this can easily be set using the setNumBoostingIterations(UINT numBoostingIterations) function or via the AdaBoost constructor.

AdaBoost is part of the GRT classification modules.

Advantages

AdaBoost is a powerful classification algorithm that has enjoyed practical success with applications in a wide variety of fields, such as biology, computer vision, and speech processing. Unlike other powerful classifiers, such as SVM, AdaBoost can achieve similar classification results with much less tweaking of parameters or settings (unless of course you choose to use SVM with AdaBoost). The user only needs to choose: (1) which weak classifier might work best to solve their given classification problem; (2) the number of boosting rounds that should be used during the training phase. The GRT enables a user to add several weak classifiers to the family of weak classifiers that should be used at each round of boosting. The AdaBoost algorithm will select the weak classifier that works best at that round of boosting.

Disadvantages

AdaBoost can be sensitive to noisy data and outliers. In some problems, however, it can be less susceptible to the overfitting problem than most learning algorithms. The GRT AdaBoost algorithm does not currently support null rejection, although this will be added at some point in the near future.

Training Data Format

You should use the LabelledClassificationData data structure to train the AdaBoost classifier.

Example Code

This examples demonstrates how to initialize, train, and use the AdaBoost algorithm for classification. This example uses the DecisionStump as the WeakClassifer, although AdaBoost works with any GRT WeakClassifier (including any you write yourself). The example loads the data shown in the image below and uses this to train a classification model. The data is a recording of a Wii-mote being held in 5 different orientations, the top graph shows the raw accelerometer data from the recording (showing the x, y, and z accelerometer data), while the bottom graph shows the label recorded for each sample (you can see the 5 different classes in the label data). You can download the actual dataset in the Code & Resources section below.

AdaBoost Training Data
The data is a recording of a Wii-mote being held in 5 different orientations, the top graph shows the raw accelerometer data from the recording (showing the x, y, and z accelerometer data), while the bottom graph shows the label recorded for each sample (you can see the 5 different classes in the label data). WiiAccelerometerData.jpg
/*
 GRT AdaBoost Example
 This examples demonstrates how to initialize, train, and use the AdaBoost algorithm for classification.

 AdaBoost (Adaptive Boosting) is a powerful classifier that works well on both basic and more complex recognition problems.
 AdaBoost works by combining several relatively weak classifiers together to form a highly accurate strong classifier.  AdaBoost
 therefore acts as a meta algorithm, which allows you to use it as a wrapper for other classifiers.  In the GRT, these classifiers
 are called WeakClassifiers such as a DecisionStump (which is just one node of a DecisionTree).

 In this example we create an instance of an AdaBoost algorithm and then train the algorithm using some pre-recorded training data.
 The trained AdaBoost algorithm is then used to predict the class label of some test data.
 This example uses the DecisionStump as the WeakClassifer, although AdaBoost works with any GRT WeakClassifier (including any you write yourself).

 This example shows you how to:
 - Create an initialize the AdaBoost algorithm
 - Set a DecisionStump as the WeakClassifer
 - Load some LabelledClassificationData from a file and partition the training data into a training dataset and a test dataset
 - Train the AdaBoost algorithm using the training dataset
 - Test the AdaBoost algorithm using the test dataset
 - Manually compute the accuracy of the classifier
*/


//You might need to set the specific path of the GRT header relative to your project
#include "GRT.h"
using namespace GRT;

int main (int argc, const char * argv[])
{
    //Create a new AdaBoost instance
    AdaBoost adaBoost;

    //Set the weak classifier you want to use
    adaBoost.setWeakClassifier( DecisionStump() );

    //Load some training data to train the classifier
    LabelledClassificationData trainingData;

    if( !trainingData.loadDatasetFromFile("AdaBoostTrainingData.txt") ){
        cout << "Failed to load training data!\n";
        return EXIT_FAILURE;
    }

    //Use 20% of the training dataset to create a test dataset
    LabelledClassificationData testData = trainingData.partition( 80 );

    //Train the classifier
    if( !adaBoost.train( trainingData ) ){
        cout << "Failed to train classifier!\n";
        return EXIT_FAILURE;
    }

    //Save the model to a file
    if( !adaBoost.saveModelToFile("AdaBoostModel.txt") ){
        cout << "Failed to save the classifier model!\n";
        return EXIT_FAILURE;
    }

    //Load the model from a file
    if( !adaBoost.loadModelFromFile("AdaBoostModel.txt") ){
        cout << "Failed to load the classifier model!\n";
        return EXIT_FAILURE;
    }

    //Use the test dataset to test the AdaBoost model
    double accuracy = 0;
    for(UINT i=0; i<testData.getNumSamples(); i++){
        //Get the i'th test sample
        UINT classLabel = testData[i].getClassLabel();
        vector< double > inputVector = testData[i].getSample();

        //Perform a prediction using the classifier
        if( !adaBoost.predict( inputVector ) ){
            cout << "Failed to perform prediction for test sampel: " << i <<"\n";
            return EXIT_FAILURE;
        }

        //Get the predicted class label
        UINT predictedClassLabel = adaBoost.getPredictedClassLabel();
        double maximumLikelhood = adaBoost.getMaximumLikelihood();
        vector< double > classLikelihoods = adaBoost.getClassLikelihoods();
        vector< double > classDistances = adaBoost.getClassDistances();

        //Update the accuracy
        if( classLabel == predictedClassLabel ) accuracy++;

        cout << "TestSample: " << i <<  " ClassLabel: " << classLabel;
        cout << " PredictedClassLabel: " << predictedClassLabel << " Likelihood: " << maximumLikelhood;
        cout << endl;
    }

    cout << "Test Accuracy: " << accuracy/double(testData.getNumSamples())*100.0 << "%" << endl;

    return EXIT_SUCCESS;
}

Code & Resources

AdaBoostExample.cpp AdaBoostTrainingData.txt

Documentation

You can find the documentation for this class at AdaBoost documentation.