Linear Regression

Description

Linear Regression is a simple supervised learning algorithm for regression. It maps an N-dimensional input signal to a 1-dimensional output signal and can be used with any type of N-dimensional signal.

The Linear Regression algorithm is part of the GRT regression modules.
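
To make the N-dimensional to 1-dimensional mapping concrete, here is a minimal sketch of what a trained linear model computes: a weighted sum of the inputs plus a bias. The function name `linearMap` and its parameters are hypothetical and are not part of the GRT API; GRT learns the weights for you during training.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

//Hypothetical helper (not a GRT function): evaluates y = bias + sum_i( weights[i] * input[i] )
double linearMap( const std::vector<double> &weights, double bias, const std::vector<double> &input ){
    //The model needs exactly one weight per input dimension
    assert( weights.size() == input.size() );

    double output = bias;
    for( std::size_t i = 0; i < input.size(); i++ ){
        output += weights[i] * input[i];
    }
    return output;
}
```

For example, with weights {0, 1, 0} and a bias of 0.5, the 3-dimensional input {3, 0.25, -7} maps to 0.75: only the second dimension influences the output, which is the kind of behaviour the gyro example below tries to learn for the Pitch axis.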

Advantages

Linear Regression is a simple algorithm, and it works well if your data has a clear linear trend.

Disadvantages

The main limitation of the Linear Regression algorithm is that the mapping needs to be linear, and that it can only map an N-dimensional signal to a 1-dimensional signal. If you need a regression algorithm that can map an N-dimensional signal to an M-dimensional signal, or one that can perform a non-linear mapping, then you should try the MLP regression algorithm or the Multidimensional Regression algorithm instead.
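
The linearity limitation can be illustrated with a small sketch that fits a straight line to 1-dimensional data using the closed-form least-squares solution. This is an illustration only: `LineFit` and `fitLine` are hypothetical names, and the closed-form fit shown here is not necessarily how GRT trains its model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

//Hypothetical result type for this sketch (not a GRT type)
struct LineFit{
    double slope;
    double intercept;
};

//Fits y = slope * x + intercept to paired samples using closed-form least squares
LineFit fitLine( const std::vector<double> &x, const std::vector<double> &y ){
    assert( x.size() == y.size() && x.size() >= 2 );

    //Compute the mean of the inputs and the targets
    const double n = static_cast<double>( x.size() );
    double meanX = 0.0;
    double meanY = 0.0;
    for( std::size_t i = 0; i < x.size(); i++ ){
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= n;
    meanY /= n;

    //Accumulate the (unnormalized) covariance of x and y and the variance of x
    double covariance = 0.0;
    double variance = 0.0;
    for( std::size_t i = 0; i < x.size(); i++ ){
        covariance += ( x[i] - meanX ) * ( y[i] - meanY );
        variance += ( x[i] - meanX ) * ( x[i] - meanX );
    }
    assert( variance > 0.0 ); //The inputs must not all be identical

    LineFit fit;
    fit.slope = covariance / variance;
    fit.intercept = meanY - fit.slope * meanX;
    return fit;
}
```

With the linear data x = {0, 1, 2}, y = {1, 3, 5}, the fit recovers a slope of 2 and an intercept of 1. With the non-linear data x = {-1, 0, 1}, y = {1, 0, 1} (y = x squared), the best line has a slope of 0, so it outputs the same value (about 0.667) for every input and misses the curve entirely.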

Things To Know

You should always enable scaling with Linear Regression, as this will give you much better results.
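
As a rough illustration of what scaling does, the sketch below rescales each input dimension to the range [0, 1] using per-dimension minimum and maximum values, which you would normally take from the training data. It assumes that scaling means min-max normalization; the helper names `scaleValue` and `scaleSample` are hypothetical, and the details of GRT's internal scaling may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

//Hypothetical helper: linearly rescales a value from [minSource, maxSource] to [minTarget, maxTarget]
double scaleValue( double value, double minSource, double maxSource, double minTarget, double maxTarget ){
    //A dimension that never changes carries no information, so map it to the bottom of the target range
    if( maxSource == minSource ) return minTarget;
    return ( ( value - minSource ) / ( maxSource - minSource ) ) * ( maxTarget - minTarget ) + minTarget;
}

//Hypothetical helper: scales every dimension of a sample to [0, 1] using per-dimension ranges
std::vector<double> scaleSample( const std::vector<double> &sample,
                                 const std::vector<double> &minValues,
                                 const std::vector<double> &maxValues ){
    assert( sample.size() == minValues.size() && sample.size() == maxValues.size() );

    std::vector<double> scaled( sample.size() );
    for( std::size_t i = 0; i < sample.size(); i++ ){
        scaled[i] = scaleValue( sample[i], minValues[i], maxValues[i], 0.0, 1.0 );
    }
    return scaled;
}
```

After scaling, dimensions with very different ranges (for example, one axis spanning -10 to 10 and another spanning 100 to 500) are brought onto a comparable footing, so no single dimension dominates simply because its raw numbers are larger.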

Training Data Format

You should use the RegressionData data structure to train the Linear Regression algorithm.

Example Code

This example demonstrates how to initialize, train, and use the Linear Regression algorithm for regression.

The example loads the data shown in the images below and uses it to train the Linear Regression algorithm. The data consists of the 3-axis gyro data from a Wii-mote, which has been labelled with a 1-dimensional target value (i.e. the value the Linear Regression model should output given the 3-dimensional gyro input). The purpose of this exercise is to see if the Linear Regression algorithm can learn to map the values of the Pitch axis of the gyro between 0 and 1, without the Roll and Yaw values corrupting the mapped output value.

To record the training data, the Wii-mote was rotated left (around the Pitch axis) and several seconds of data were recorded with the target label set to 0. During this time the Wii-mote was moved around the Roll and Yaw axes, but (as much as possible) not around the Pitch axis. The label was then changed to 1, the Wii-mote was rotated right (around the Pitch axis), and several more seconds of data were recorded, again while moving the Wii-mote around the Roll and Yaw axes but not around the Pitch axis. This process was then repeated to record the test dataset.

The images below show the recorded gyro and target values for both the training and test datasets. In the training and test images you can see the raw gyro data in the top row (with red = Roll, green = Pitch, blue = Yaw) and the target values in the bottom row.

You can download the training and test datasets in the Code & Resources section below.

Gyro Training Data (LinearRegressionTrainingDataImage1.jpg): The 3-axis gyro data from a Wii-mote, labelled with a 1-dimensional target value. The raw gyro data is in the top row (with red = Roll, green = Pitch, blue = Yaw) and the target value is in the bottom row.

Gyro Test Data (LinearRegressionTestDataImage1.jpg): The 3-axis gyro data from a Wii-mote, labelled with a 1-dimensional target value. The raw gyro data is in the top row (with red = Roll, green = Pitch, blue = Yaw) and the target value is in the bottom row.

Linear Regression Results Data (LinearRegressionOutputResultsImage1.jpg): This image shows the output of the trained Linear Regression model (in blue) along with the target value (in green) from the test dataset. You can see that the model effectively learns to map the 3-axis gyro input to the 1-dimensional target output, although the 'noise' from the Roll and Yaw axes still has some influence on the mapping of the Pitch axis. The RMS error on this test data was 0.006.
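
For reference, the RMS (root mean squared) error quoted above is the square root of the average squared difference between the predicted values and the target values. The sketch below shows that calculation for a 1-dimensional target; `rmsError` is a hypothetical helper, and the exact way the pipeline accumulates its test error is not shown in this page, so treat it as an illustration of the metric rather than GRT's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

//Hypothetical helper: root mean squared error between predicted and target values
double rmsError( const std::vector<double> &predicted, const std::vector<double> &target ){
    assert( predicted.size() == target.size() && !predicted.empty() );

    //Sum the squared difference for each sample
    double sumSquaredError = 0.0;
    for( std::size_t i = 0; i < predicted.size(); i++ ){
        const double error = predicted[i] - target[i];
        sumSquaredError += error * error;
    }

    //Average over the number of samples, then take the square root
    return std::sqrt( sumSquaredError / static_cast<double>( predicted.size() ) );
}
```

A value close to 0 means the mapped output closely follows the target; because the targets in this example are 0 and 1, the error can be read directly as a fraction of the output range.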
/*
 Linear Regression Example
 This example demonstrates how to initialize, train, and use the LinearRegression class for regression.

 Linear Regression is a simple regression algorithm that can map an N-dimensional signal to a 1-dimensional signal.

 In this example we create an instance of the LinearRegression algorithm and then use the algorithm to train a model using some pre-recorded training data.
 The trained model is then used to perform regression on the test data.

 This example shows you how to:
 - Create and initialize the LinearRegression algorithm for regression
 - Create a new instance of a GestureRecognitionPipeline and add the regression instance to the pipeline
 - Load some RegressionData from a file
 - Train a LinearRegression model using the training dataset
 - Test the LinearRegression model using the test dataset
 - Save the output of the LinearRegression algorithm to a file
*/


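#include <cstdlib>  //EXIT_SUCCESS and EXIT_FAILURE
#include <fstream>
#include <iostream>
#include <vector>
#include "GRT.h"
using namespace GRT;
using namespace std; //The example uses cout, endl, fstream and vector unqualified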

int main (int argc, const char * argv[])
{
    //Turn on the training log so we can print the training status of the LinearRegression to the screen
    TrainingLog::enableLogging( true );

    //Load the training data and test data
    RegressionData trainingData;
    RegressionData testData;

    if( !trainingData.loadDatasetFromFile("LinearRegressionTrainingData.txt") ){
        cout << "ERROR: Failed to load training data!\n";
        return EXIT_FAILURE;
    }

    if( !testData.loadDatasetFromFile("LinearRegressionTestData.txt") ){
        cout << "ERROR: Failed to load test data!\n";
        return EXIT_FAILURE;
    }

    //Make sure the dimensionality of the training and test data matches
    if( trainingData.getNumInputDimensions() != testData.getNumInputDimensions() ){
        cout << "ERROR: The number of input dimensions in the training data (" << trainingData.getNumInputDimensions() << ")";
        cout << " does not match the number of input dimensions in the test data (" << testData.getNumInputDimensions() << ")\n";
        return EXIT_FAILURE;
    }

    if( trainingData.getNumTargetDimensions() != testData.getNumTargetDimensions() ){
        cout << "ERROR: The number of target dimensions in the training data (" << trainingData.getNumTargetDimensions() << ")";
        cout << " does not match the number of target dimensions in the test data (" << testData.getNumTargetDimensions() << ")\n";
        return EXIT_FAILURE;
    }

    cout << "Training and Test datasets loaded\n";

    //Print the stats of the datasets
    cout << "Training data stats:\n";
    trainingData.printStats();

    cout << "Test data stats:\n";
    testData.printStats();

    //Create a new gesture recognition pipeline
    GestureRecognitionPipeline pipeline;

    //Add a LinearRegression instance to the pipeline
    pipeline.setRegressifier( LinearRegression() );

    //Train the LinearRegression model
    cout << "Training LinearRegression model...\n";
    if( !pipeline.train( trainingData ) ){
        cout << "ERROR: Failed to train LinearRegression model!\n";
        return EXIT_FAILURE;
    }

    cout << "Model trained.\n";

    //Test the model
    cout << "Testing LinearRegression model...\n";
    if( !pipeline.test( testData ) ){
        cout << "ERROR: Failed to test LinearRegression model!\n";
        return EXIT_FAILURE;
    }

    cout << "Test complete. Test RMS error: " << pipeline.getTestRMSError() << endl;

    //Run back over the test data again and output the results to a file
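    fstream file;
    file.open("LinearRegressionResultsData.txt", fstream::out);

    //Make sure the results file could be opened before writing to it
    if( !file.is_open() ){
        cout << "ERROR: Failed to open the results file!\n";
        return EXIT_FAILURE;
    }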

    for(UINT i=0; i<testData.getNumSamples(); i++){
        vector< double > inputVector = testData[i].getInputVector();
        vector< double > targetVector = testData[i].getTargetVector();

        //Map the input vector using the trained regression model
        if( !pipeline.predict( inputVector ) ){
            cout << "ERROR: Failed to map test sample " << i << endl;
            return EXIT_FAILURE;
        }

        //Get the mapped regression data
        vector< double > outputVector = pipeline.getRegressionData();

        //Write the mapped value and also the target value to the file
        for(UINT j=0; j<outputVector.size(); j++){
            file << outputVector[j] << "\t";
        }
        for(UINT j=0; j<targetVector.size(); j++){
            file << targetVector[j] << "\t";
        }
        file << endl;
    }

    //Close the file
    file.close();

    return EXIT_SUCCESS;
}

Code & Resources

LinearRegressionExample.cpp LinearRegressionTrainingData.txt LinearRegressionTestData.txt

Documentation

You can find the documentation for this class at Linear Regression documentation.