Min Dist

Description

The MinDist (Minimum Distance) algorithm is a simple classifier that works well on both basic and more complex recognition problems. It is also very fast, which makes it a good choice if you need a classifier with low computational overhead.

The MinDist algorithm is a supervised learning algorithm that can be used to classify any type of N-dimensional signal. During the training phase, it fits M clusters to the data from each class. A new sample is then classified by finding the class whose cluster has the minimum (Euclidean) distance to the sample. This makes the MinDist algorithm particularly fast at classifying new samples compared with other classification algorithms such as KNN, because a prediction only compares the sample with M clusters per class rather than with every training example.
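The prediction step described above can be sketched without the GRT as follows. This is a minimal illustration, not the GRT implementation: the `ClassClusters` type and the `euclideanDistance` and `classifyByMinDistance` functions are hypothetical names introduced here, and the cluster centres are assumed to have been fitted already.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical type for this sketch (not part of the GRT):
// one class label together with its M fitted cluster centres.
struct ClassClusters {
    unsigned int classLabel;
    std::vector<std::vector<double>> centres;
};

// Euclidean distance between two vectors of equal length.
double euclideanDistance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Returns the label of the class that owns the closest cluster centre.
// Returns 0 if models is empty (using 0 for "no class" is an assumption of this sketch).
// If bestDistanceOut is not null, the winning distance is written to it.
unsigned int classifyByMinDistance(const std::vector<ClassClusters>& models,
                                   const std::vector<double>& sample,
                                   double* bestDistanceOut = nullptr) {
    unsigned int bestLabel = 0;
    double bestDistance = std::numeric_limits<double>::max();
    for (const ClassClusters& model : models) {
        for (const std::vector<double>& centre : model.centres) {
            const double d = euclideanDistance(centre, sample);
            if (d < bestDistance) {
                bestDistance = d;
                bestLabel = model.classLabel;
            }
        }
    }
    if (bestDistanceOut != nullptr) *bestDistanceOut = bestDistance;
    return bestLabel;
}
```

The cost of one prediction in this sketch grows with the number of classes times M, not with the size of the training set, which is where the speed advantage over KNN comes from.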

The MinDist algorithm also computes rejection thresholds that enable the algorithm to automatically reject sensor values that are not one of the K gestures the algorithm has been trained to recognize (without being explicitly told during the prediction phase whether a gesture is, or is not, being performed).
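One plausible way to derive such a threshold is sketched below. It assumes that the threshold for a class is the mean of the training distances to that class's clusters plus a coefficient times their standard deviation; this formula, the function names, and the coefficient are assumptions made for illustration, so check the MinDist documentation for how the GRT actually computes its thresholds.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not a GRT function): computes mean + coeff * standard deviation
// of the distances observed for one class during training.
// The population standard deviation (divide by N) is used here.
double rejectionThreshold(const std::vector<double>& trainingDistances, double coeff) {
    assert(!trainingDistances.empty());
    const double n = static_cast<double>(trainingDistances.size());

    double mean = 0.0;
    for (double d : trainingDistances) mean += d;
    mean /= n;

    double variance = 0.0;
    for (double d : trainingDistances) variance += (d - mean) * (d - mean);
    variance /= n;

    return mean + coeff * std::sqrt(variance);
}

// A sample is rejected when its distance to the winning class exceeds that class's threshold.
bool isRejected(double distance, double threshold) {
    return distance > threshold;
}
```

A larger coefficient makes the classifier more tolerant of samples that lie far from the clusters, while a smaller one rejects more aggressively.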

The MinDist algorithm is part of the GRT classification modules.

Advantages

The MinDist algorithm is well suited to classifying static postures and other non-temporal patterns. It is also a particularly fast classifier.

Disadvantages

The main limitation of the MinDist algorithm is that choosing the "wrong" number of clusters may result in poor classification results. The user may therefore want to train the algorithm with several different cluster values (e.g. 2, 5, 10, 100) and run cross validation to determine a "good" cluster value. A function to estimate a suitable cluster value will be added soon.
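The search over cluster values can be sketched as a small helper that scores each candidate and keeps the best one. The `selectNumClusters` function and its callback are hypothetical names for this sketch, not part of the GRT; the callback is assumed to return a cross-validation accuracy for a given number of clusters.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not a GRT function): evaluates every candidate cluster count
// with the supplied scoring callback and returns the candidate with the highest score.
// On a tie, the candidate that appears first in the list wins.
unsigned int selectNumClusters(const std::vector<unsigned int>& candidates,
                               const std::function<double(unsigned int)>& crossValidationAccuracy) {
    assert(!candidates.empty());
    unsigned int best = candidates.front();
    double bestScore = crossValidationAccuracy(best);
    for (std::size_t i = 1; i < candidates.size(); ++i) {
        const double score = crossValidationAccuracy(candidates[i]);
        if (score > bestScore) {
            bestScore = score;
            best = candidates[i];
        }
    }
    return best;
}
```

With the GRT, the callback would typically call setNumClusters with the candidate value, train on one part of the data, and return the accuracy measured on the held-out part. Listing the candidates in ascending order means a tie is resolved in favour of the smaller, cheaper model.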

Training Data Format

You should use the ClassificationData data structure to train the MinDist classifier.

Example Code

This example demonstrates how to initialize, train, and use the MinDist algorithm for classification. The example loads the data shown in the image below and uses it to train the MinDist algorithm. The data is a recording of a Wii-mote being held in 5 different orientations. The top graph shows the raw accelerometer data from the recording (the x, y, and z axes), while the bottom graph shows the label recorded for each sample (you can see the 5 different classes in the label data). You can download the actual dataset in the Code & Resources section below.

MinDist Training Data (image: WiiAccelerometerData.jpg)
/*
 GRT MinDist Example
 This example demonstrates how to initialize, train, and use the MinDist algorithm for classification.

 The MinDist (Minimum Distance) algorithm is a simple classifier that works very well on both basic and more complex recognition problems.
 The MinDist algorithm is a very fast classifier and so is a good choice if you need a classifier with low computational overhead.

 In this example we create an instance of a MinDist classifier and then train the algorithm using some pre-recorded training data.
 The trained MinDist algorithm is then used to predict the class label of some test data.

 This example shows you how to:
 - Create and initialize the MinDist algorithm
 - Load some ClassificationData from a file and partition the training data into a training dataset and a test dataset
 - Train the MinDist algorithm using the training dataset
 - Test the MinDist algorithm using the test dataset
 - Manually compute the accuracy of the classifier
*/


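//You might need to set the specific path of the GRT header relative to your project
#include "GRT.h"
#include <iostream>
#include <vector>
using namespace GRT;
//cout, endl and vector are used unqualified below, in case your GRT version does not already expose them
using namespace std;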

int main (int argc, const char * argv[])
{
    //Create a new MinDist instance, using the default parameters
    MinDist minDist;

    //Set how many clusters the minDist algorithm will use when training the minDist classification model
    //If your classification problem is more complex, then you might want to use a larger number of clusters
    minDist.setNumClusters( 2 );

    //Load some training data to train the classifier
    ClassificationData trainingData;

    if( !trainingData.loadDatasetFromFile("MinDistTrainingData.txt") ){
        cout << "Failed to load training data!\n";
        return EXIT_FAILURE;
    }

    //Use 20% of the training dataset to create a test dataset
    ClassificationData testData = trainingData.partition( 80 );

    //Train the classifier
    if( !minDist.train( trainingData ) ){
        cout << "Failed to train classifier!\n";
        return EXIT_FAILURE;
    }

    //Save the MinDist model to a file
    if( !minDist.saveModelToFile("MinDistModel.txt") ){
        cout << "Failed to save the classifier model!\n";
        return EXIT_FAILURE;
    }

    //Load the MinDist model from a file
    if( !minDist.loadModelFromFile("MinDistModel.txt") ){
        cout << "Failed to load the classifier model!\n";
        return EXIT_FAILURE;
    }

    //Use the test dataset to test the MinDist model
    double accuracy = 0;
    for(UINT i=0; i<testData.getNumSamples(); i++){
        //Get the i'th test sample
        UINT classLabel = testData[i].getClassLabel();
        vector< double > inputVector = testData[i].getSample();

        //Perform a prediction using the classifier
        bool predictSuccess = minDist.predict( inputVector );

        if( !predictSuccess ){
            cout << "Failed to perform prediction for test sample: " << i <<"\n";
            return EXIT_FAILURE;
        }

        //Get the predicted class label
        UINT predictedClassLabel = minDist.getPredictedClassLabel();
        double maxLikelihood = minDist.getMaximumLikelihood();
        vector< double > classLikelihoods = minDist.getClassLikelihoods();
        vector< double > classDistances = minDist.getClassDistances();

        //Update the accuracy
        if( classLabel == predictedClassLabel ) accuracy++;

        cout << "TestSample: " << i <<  "\tClassLabel: " << classLabel << "\tPredictedClassLabel: " << predictedClassLabel << "\tLikelihood: " << maxLikelihood << endl;
    }

    cout << "Test Accuracy: " << accuracy/double(testData.getNumSamples())*100.0 << "%" << endl;

    return EXIT_SUCCESS;
}

Code & Resources

MinDistExample.cpp MinDistTrainingData.txt

Documentation

You can find the documentation for this class at MinDist documentation.