MLP

Description

The Multi Layer Perceptron (MLP) algorithm is a powerful form of an Artificial Neural Network that is commonly used for regression (and can also be used for classification).

The MLP algorithm is a supervised learning algorithm that can be used for both classification and regression for any type of N-dimensional signal.

The MLP algorithm is part of the GRT regression modules.

Advantages

The MLP algorithm is well suited to regression and mapping tasks. It can map an N-dimensional input signal to an M-dimensional output signal, and this mapping can be non-linear.
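Conceptually, this N-to-M non-linear mapping is computed by passing the input through a hidden layer of neurons with a non-linear activation function. The sketch below illustrates a single-hidden-layer forward pass, independent of GRT; the weights here are illustrative placeholders, not a trained model.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

//Sigmoid activation: the non-linearity applied at each hidden neuron
double sigmoid( double x ){ return 1.0 / ( 1.0 + std::exp( -x ) ); }

//Maps an N-dimensional input to an M-dimensional output via one hidden layer
std::vector<double> mlpForward( const std::vector<double> &input,
                                const std::vector< std::vector<double> > &hiddenWeights,
                                const std::vector< std::vector<double> > &outputWeights ){
    //Hidden layer: weighted sum of the inputs passed through the sigmoid
    std::vector<double> hidden( hiddenWeights.size() );
    for(size_t j=0; j<hiddenWeights.size(); j++){
        double sum = 0;
        for(size_t i=0; i<input.size(); i++) sum += hiddenWeights[j][i] * input[i];
        hidden[j] = sigmoid( sum );
    }
    //Output layer: linear combination of the hidden activations
    std::vector<double> output( outputWeights.size() );
    for(size_t k=0; k<outputWeights.size(); k++){
        double sum = 0;
        for(size_t j=0; j<hidden.size(); j++) sum += outputWeights[k][j] * hidden[j];
        output[k] = sum;
    }
    return output;
}
```

Because the sigmoid is non-linear, stacking it between two linear layers lets the network represent mappings that a purely linear model cannot.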

Disadvantages

The main limitation of the MLP algorithm is that, because of the way it is trained, it cannot guarantee that the minimum it converges to during training is the global minimum; the MLP algorithm can therefore get stuck in a local minimum. One option for (somewhat) mitigating this is to train the MLP several times, using a different random starting position each time, and then pick the model that gives the lowest RMS error. The number of random training iterations can be set using the setNumRandomTrainingIterations(UINT numRandomTrainingIterations) method; setting this to a higher value (e.g. 20) may result in a better classification or regression model, but it will also increase the total training time. Another limitation of the MLP algorithm is that the number of hidden neurons must be set by the user: setting this value too low may cause the MLP model to underfit, while setting it too high may cause it to overfit.
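The random-restart strategy can be sketched as follows. This is a conceptual illustration rather than GRT's actual implementation: trainOnce() is a hypothetical stand-in for one complete training run from a random starting position, and the error it returns is simulated.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>

//Stand-in for one full MLP training run from a random initialization;
//the RMS error reached depends on the random starting position
double trainOnce( unsigned int seed ){
    srand( seed );
    return 0.05 + (rand() % 100) / 1000.0;
}

//Train several times from different random starts and keep the best result,
//which is what setNumRandomTrainingIterations enables internally
double trainWithRandomRestarts( unsigned int numRestarts ){
    double bestRMSError = std::numeric_limits<double>::max();
    for(unsigned int i=0; i<numRestarts; i++){
        double rmsError = trainOnce( i );
        if( rmsError < bestRMSError ) bestRMSError = rmsError; //keep the best model
    }
    return bestRMSError;
}
```

More restarts can only improve (or match) the best error found, at the cost of proportionally longer training time.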

Things To Know

You should always enable scaling with the MLP, as this will give you much better results.
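Conceptually, enabling scaling maps each input dimension to a common range (such as [0, 1]) using the minimum and maximum values observed in the training data. Below is a minimal sketch of this kind of min-max scaling, independent of GRT, which handles this internally when scaling is enabled.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

//Min-max scaling: maps each value to [0,1] relative to the range
//observed in the training data for that dimension
std::vector<double> scaleToUnitRange( const std::vector<double> &x,
                                      double minValue, double maxValue ){
    std::vector<double> scaled( x.size() );
    for(size_t i=0; i<x.size(); i++){
        scaled[i] = ( x[i] - minValue ) / ( maxValue - minValue );
    }
    return scaled;
}
```

Without scaling, input dimensions with large numeric ranges can dominate the weighted sums inside the network and slow down or destabilize training.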

Training Data Format

You should use the LabelledClassificationData data structure to train the MLP for classification and the LabelledRegressionData data structure to train the MLP for regression.

Example Code

This example demonstrates how to initialize, train, and use the MLP algorithm for regression.

The example loads the data shown in the image below and uses this to train the MLP algorithm. The data consists of the 3-axis gyro data from a Wii-mote, which has been labelled with a 1-dimensional target value (i.e. the value the MLP model should output given the 3-dimensional gyro input). The purpose of this exercise is to see if the MLP algorithm can learn to map the values of the Pitch axis of the gyro between 0 and 1, without the Roll and Yaw values corrupting the mapped output value. The Wii-mote was rotated left (around the Pitch axis) and several seconds of training data were recorded with the target label set to 0. During this time the Wii-mote was moved around the Roll and Yaw axes, but not around the Pitch axis (as much as possible). The label was then changed to 1, the Wii-mote was rotated right (around the Pitch axis), and several seconds of training data were recorded; again the Wii-mote was moved around the Roll and Yaw axes, but not around the Pitch axis. This process was then repeated to record the test dataset.

The images below show the recorded gyro and target values for both the training and test datasets. In the training and test images you can see the raw gyro data in the top row (with red = Roll, green = Pitch, blue = Yaw) and the target values in the bottom row.

You can download the training and test datasets in the Code & Resources section below.

MLP Training Data
Gyro Training Data: The data consists of the 3-axis gyro data from a Wii-mote, which has been labelled with a 1-dimensional target value (i.e. the value the MLP model should output given the 3-dimensional gyro input). The raw gyro data is in the top row (with red = Roll, green = Pitch, blue = Yaw) and the target value is in the bottom row. MLPRegressionTrainingDataImage1.jpg
MLP Test Data
Gyro Test Data: The data consists of the 3-axis gyro data from a Wii-mote, which has been labelled with a 1-dimensional target value (i.e. the value the MLP model should output given the 3-dimensional gyro input). The raw gyro data is in the top row (with red = Roll, green = Pitch, blue = Yaw) and the target value is in the bottom row. MLPRegressionTestDataImage1.jpg
MLP Results Data
MLP Results Data: This image shows the output of the trained MLP (in blue) along with the target value (in green) from the test dataset. You can see that the MLP effectively learns to map the 3-axis gyro input to the 1-dimensional target output, although the 'noise' from the Roll and Yaw axes still has some influence on the mapping of the Pitch axis. The RMS error on this test data was 0.06. MLPRegressionOutputResultsImage1.jpg
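The RMS error reported here is the square root of the mean squared difference between the mapped output and the target values over the test samples. A minimal sketch of this computation, independent of GRT:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

//Root-mean-square error between the model's output and the target values
double computeRMSError( const std::vector<double> &output,
                        const std::vector<double> &target ){
    double sumSquaredError = 0;
    for(size_t i=0; i<output.size(); i++){
        double error = output[i] - target[i];
        sumSquaredError += error * error;
    }
    return std::sqrt( sumSquaredError / output.size() );
}
```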
/*
 GRT MLP Regression Example
 This examples demonstrates how to initialize, train, and use the MLP algorithm for regression.

 The Multi Layer Perceptron (MLP) algorithm is a powerful form of an Artificial Neural Network that is commonly used for regression.

 In this example we create an instance of an MLP algorithm and then train the algorithm using some pre-recorded training data.
 The trained MLP algorithm is then used to perform regression on the test data.

 This example shows you how to:
 - Create and initialize the MLP algorithm for regression
 - Create a new instance of a GestureRecognitionPipeline and add the regression instance to the pipeline
 - Load some LabelledRegressionData from a file
 - Train the MLP algorithm using the training dataset
 - Test the MLP algorithm using the test dataset
 - Save the output of the MLP algorithm to a file
*/


#include "GRT.h"
using namespace GRT;

int main (int argc, const char * argv[])
{
    //Turn on the training log so we can print the training status of the MLP to the screen
    TrainingLog::enableLogging( true );

    //Load the training data
    LabelledRegressionData trainingData;
    LabelledRegressionData testData;

    if( !trainingData.loadDatasetFromFile("MLPRegressionTrainingData.txt") ){
        cout << "ERROR: Failed to load training data!\n";
        return EXIT_FAILURE;
    }

    if( !testData.loadDatasetFromFile("MLPRegressionTestData.txt") ){
        cout << "ERROR: Failed to load test data!\n";
        return EXIT_FAILURE;
    }

    //Make sure the dimensionality of the training and test data matches
    if( trainingData.getNumInputDimensions() != testData.getNumInputDimensions() ){
        cout << "ERROR: The number of input dimensions in the training data (" << trainingData.getNumInputDimensions() << ")";
        cout << " does not match the number of input dimensions in the test data (" << testData.getNumInputDimensions() << ")\n";
        return EXIT_FAILURE;
    }

    if( trainingData.getNumTargetDimensions() != testData.getNumTargetDimensions() ){
        cout << "ERROR: The number of target dimensions in the training data (" << trainingData.getNumTargetDimensions() << ")";
        cout << " does not match the number of target dimensions in the test data (" << testData.getNumTargetDimensions() << ")\n";
        return EXIT_FAILURE;
    }

    cout << "Training and Test datasets loaded\n";

    //Print the stats of the datasets
    cout << "Training data stats:\n";
    trainingData.printStats();

    cout << "Test data stats:\n";
    testData.printStats();

    //Create a new gesture recognition pipeline
    GestureRecognitionPipeline pipeline;

    //Setup the MLP, the number of input and output neurons must match the dimensionality of the training/test datasets
    MLP mlp;
    unsigned int numInputNeurons = trainingData.getNumInputDimensions();
    unsigned int numHiddenNeurons = 5;
    unsigned int numOutputNeurons = trainingData.getNumTargetDimensions();

    //Initialize the MLP
    mlp.init(numInputNeurons, numHiddenNeurons, numOutputNeurons);

    //Set the training settings
    mlp.setMaxNumEpochs( 500 ); //This sets the maximum number of epochs (1 epoch is 1 complete iteration of the training data) that are allowed
    mlp.setMinChange( 1.0e-5 ); //This sets the minimum change allowed in training error between any two epochs
    mlp.setNumRandomTrainingIterations( 20 ); //This sets the number of times the MLP will be trained, each training iteration starts with new random values
    mlp.setUseValidationSet( true ); //This sets aside a small portion of the training data to be used as a validation set to mitigate overfitting
    mlp.setValidationSetSize( 20 ); //Use 20% of the training data for validation during the training phase
    mlp.setRandomiseTrainingOrder( true ); //Randomize the order of the training data so that the order of the samples does not bias the training

    //The MLP generally works much better if the training and prediction data is first scaled to a common range (i.e. [0.0 1.0])
    mlp.enableScaling( true );

    //Add the MLP to the pipeline
    pipeline.setRegressifier( mlp );

    //Train the MLP model
    cout << "Training MLP model...\n";
    if( !pipeline.train( trainingData ) ){
        cout << "ERROR: Failed to train MLP model!\n";
        return EXIT_FAILURE;
    }

    cout << "Model trained.\n";

    //Test the model
    cout << "Testing MLP model...\n";
    if( !pipeline.test( testData ) ){
        cout << "ERROR: Failed to test MLP model!\n";
        return EXIT_FAILURE;
    }

    cout << "Test complete. Test RMS error: " << pipeline.getTestRMSError() << endl;

    //Run back over the test data again and output the results to a file
    fstream file;
    file.open("MLPResultsData.txt", fstream::out);

    for(UINT i=0; i<testData.getNumSamples(); i++){
        vector< double > inputVector = testData[i].getInputVector();
        vector< double > targetVector = testData[i].getTargetVector();

        //Map the input vector using the trained regression model
        if( !pipeline.predict( inputVector ) ){
            cout << "ERROR: Failed to map test sample " << i << endl;
            return EXIT_FAILURE;
        }

        //Get the mapped regression data
        vector< double > outputVector = pipeline.getRegressionData();

        //Write the mapped value and also the target value to the file
        for(UINT j=0; j<outputVector.size(); j++){
            file << outputVector[j] << "\t";
        }
        for(UINT j=0; j<targetVector.size(); j++){
            file << targetVector[j] << "\t";
        }
        file << endl;
    }

    //Close the file
    file.close();

    return EXIT_SUCCESS;
}

Code & Resources

MLPRegressionExample.cpp MLPRegressionTrainingData.txt MLPRegressionTestData.txt

Documentation

You can find the documentation for this class at MLP documentation.