Openframeworks Kinect Example

Description

This example shows you how to:

  • stream Kinect skeleton data into Openframeworks, using Synapse
  • set up a simple gesture recognition pipeline for recognizing basic gestures
  • record your own dataset and save it to a file
  • load the dataset back from a file
  • use the training dataset to train a classification model (using the ANBC algorithm)
  • use the trained model to predict the class of real-time data
Kinect Example
This shows an example of the application built using this tutorial. (Image: GRTKinectExampleImage.jpg)
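
Before diving into the full application, it helps to see the handful of GRT calls the example is built around. The following is a minimal standalone sketch (not part of the example itself) that uses made-up training samples instead of Kinect data; it assumes the plain GRT header (GRT.h) is on your include path, whereas the Openframeworks code further down this page includes ofGRT.h instead.

#include <iostream>
#include <cstdlib>
#include "GRT.h" //Main GRT header (assumed to be on the include path)
using namespace GRT;
using namespace std;

int main(){
    //Training data with 6 dimensions (the x, y, z coordinates of the left and right hands)
    ClassificationData trainingData;
    trainingData.setNumDimensions( 6 );

    //Add some made-up samples for two classes (the real example records these from Synapse)
    for(UINT i=0; i<50; i++){
        vector< double > sample( 6 );
        for(UINT j=0; j<6; j++) sample[j] = 0.1 * rand() / double(RAND_MAX);        //class 1: values near 0
        trainingData.addSample( 1, sample );
        for(UINT j=0; j<6; j++) sample[j] = 0.9 + 0.1 * rand() / double(RAND_MAX);  //class 2: values near 1
        trainingData.addSample( 2, sample );
    }

    //Train a pipeline that uses the ANBC classifier
    ANBC anbc;
    GestureRecognitionPipeline pipeline;
    pipeline.setClassifier( anbc );
    if( !pipeline.train( trainingData ) ){
        cout << "Failed to train pipeline" << endl;
        return EXIT_FAILURE;
    }

    //Predict the class of a new sample (values near 1, so class 2 is expected)
    vector< double > inputVector( 6, 0.95 );
    if( pipeline.predict( inputVector ) ){
        cout << "Predicted class: " << pipeline.getPredictedClassLabel() << endl;
    }

    return EXIT_SUCCESS;
}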

Setup

You will need to install both Openframeworks and Synapse to run this example. Synapse runs on both Mac and Windows machines (sorry Linux users, you can run this example but you will need to write your own interface code to the OpenNI libraries). On OS X, you should use a Kinect XBOX 360 and not the Kinect for Windows version of the sensor. Windows users should be able to use either version of the Kinect, but will need to install an additional driver if they want to use the Kinect for Windows (see the instructions below).

OS X Install Instructions

  • Follow these instructions to install Openframeworks with XCode.
  • To install Synapse, simply download it here, plug in your Kinect XBOX 360 (don't forget to plug it in at the wall) and that should be it.

Windows Install Instructions

  • Follow these instructions to install Openframeworks with Visual Studio (there are also instructions here to use it with CodeBlocks).
  • If you are using the Kinect XBOX 360 then you need to install the OpenNI drivers; follow these instructions.
  • If you want to use a Kinect for Windows, then you need to install the OpenNI drivers plus additional drivers for the Kinect for Windows. You can find these drivers here. Follow the instructions to install the other OpenNI drivers.
  • To install Synapse, download it here, plug in your Kinect (don't forget to plug it in at the wall) and that should be it.

To compile this example:

  • use the Openframeworks project builder to create a new project.
  • when you have created the new project, overwrite the default testApp.h, testApp.cpp, and main.cpp files with the files from this example. Also copy the SynapseStreamer files into the src folder.
  • open the project in your favorite IDE (XCode, Visual Studio, Code Blocks, etc.) and add the main GRT source folder to the project. You can find the main GRT source folder by looking for the folder called GRT in the directory you downloaded from Google Code. Most IDEs let you just drag and drop the entire GRT code folder into your project.
  • note that some IDEs make you specify the location of the GRT source code folder (for example Visual Studio). To do this, open the project's properties or settings pane and add the path to the GRT folder to your project's C++ include section. In XCode you can just drag and drop the GRT folder directly from Finder into your project.
  • add the two additional files (SynapseStreamer.h and SynapseStreamer.cpp) to your project.
  • compile openframeworks
  • compile this project
  • copy the font file (verdana.ttf) into a folder called data, which should be in the same directory as the application that is built when you compile this project

When you have compiled this project, this is how you use it:

  • plug in your Kinect and launch Synapse (you should see a GUI window showing the depth map from the Kinect)
  • run the Openframeworks project you have just compiled
  • when you start the project, you will have no training data and the classifier will not be trained so you need to do three things:
  1. record some training data
  2. train your pipeline
  3. use the pipeline to predict the class of real-time data

Skeleton Tracking

You need to make sure that Synapse is tracking you before you try to record or recognize any gestures. To get Synapse to track you, stand approximately 2-3 meters in front of the Kinect and raise your arms to make the 'PSI' calibration pose (you need to hold this pose for a few seconds before it will start to track you).

Kinect Calibration Pose
This shows an example of the PSI calibration pose you should make before Synapse can track your skeleton. (Image: psi-pose.jpg)

If Synapse is tracking you, then you should see the graphs in the Openframeworks application change as you move your hands around. If the graphs do not change, press the 'q' key to re-open the connection between the Openframeworks application and Synapse, and check the Synapse application to make sure it is still tracking you (if you get too close to the Kinect or leave the field of view then it will stop tracking you).

When Synapse is tracking you, you are ready to record your gestures and train the pipeline to recognize them.

Step 1:

  • to record some training data, first make sure the value beside the TrainingClassLabel is set to the class you want to record the data for (i.e. 1 for gesture one, 2 for gesture two, etc.)
  • to change the training class label you can use the '[' and ']' keys, [ to decrease the label and ] to increase the label
  • make sure you are being tracked by Synapse (the graphs at the top of the Openframeworks application should move as you move your hands)
  • choose the first gesture you want to record, for example, gesture 1 could be 'holding both hands above your head'
  • press the 'r' key to start recording the training data for the current gesture. You will get 5 seconds after pressing the record key to move to the correct location and start performing your gesture. After 5 seconds, the application will start to record your gesture (you will see the countdown timer change from yellow to red). To get good training data, it is important that you start to make the gesture before the recording actually starts (otherwise you will record some data that is not a gesture). You should also move your hands around the extent of movement that you want your gesture to include. For example, if you are holding your hands above your head, then you should move your hands all around the area above your head to enable the classifier to learn that any posture like this corresponds to that gesture
  • after 5 seconds, the recording will automatically stop (it stops when the red countdown timer disappears)
  • change the training class label to a new label and get ready to record your second gesture. For example, your second gesture might be 'hands down by your side'
  • press the 'r' key to start the recording, after 5 seconds the application will start to record your training examples for gesture 2
  • keep repeating these steps until you have recorded all the training data you want
  • if you make a mistake at any point, you can press the 'c' key to clear the current training data to allow you to record the training data again
  • when you have finished, press the 's' key to save the training data to a file
  • if you need to load the training data at a later stage, for instance when you next restart the program, press the 'l' key (the sketch after this list shows how you can also load and inspect the saved file outside of the application)
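
If you want to double-check a saved dataset outside of the Openframeworks application, a minimal standalone sketch like the one below (not part of the example itself) will load the file back and print a short summary; it assumes the plain GRT header (GRT.h) is on your include path and that the dataset was saved as TrainingData.txt, the file name used in the example code below.

#include <iostream>
#include <cstdlib>
#include "GRT.h" //Main GRT header (assumed to be on the include path)
using namespace GRT;
using namespace std;

int main(){
    //Load the dataset recorded by the example (saved with the 's' key)
    ClassificationData trainingData;
    if( !trainingData.loadDatasetFromFile( "TrainingData.txt" ) ){
        cout << "Failed to load TrainingData.txt" << endl;
        return EXIT_FAILURE;
    }

    //Print a quick summary of the dataset as a sanity check
    cout << "Number of samples: " << trainingData.getNumSamples() << endl;
    cout << "Number of classes: " << trainingData.getNumClasses() << endl;
    cout << "Number of dimensions: " << trainingData.getNumDimensions() << endl;

    return EXIT_SUCCESS;
}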

Step 2:

  • after you have recorded your training data, you can now train your pipeline
  • to train your pipeline, press the 't' key
  • if the pipeline trains a classification model successfully then you will see the info message: Pipeline Trained; otherwise you will see the warning message WARNING: Failed to train pipeline. If the training failed, make sure you have successfully recorded the training data. If you want to check how well the trained model generalizes, see the sketch after this list
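
One simple way to estimate how well the trained model generalizes is to hold back some of the recorded data for testing. The helper function below is a sketch of that idea (it is not part of the example, and the function name is made up); it assumes the GRT ClassificationData::partition and GestureRecognitionPipeline::test functions, and uses the same pipeline and trainingData types that are declared in testApp.h below.

#include "ofMain.h"
#include "ofGRT.h"
using namespace GRT;

//Train the pipeline on 80% of the recorded samples and test it on the remaining 20%.
//Note that partition() removes the returned test samples from trainingData.
std::string trainAndTest( GestureRecognitionPipeline &pipeline, ClassificationData &trainingData ){

    //Hold back 20% of the recorded samples as a test set
    ClassificationData testData = trainingData.partition( 80 );

    //Train on the remaining 80%
    if( !pipeline.train( trainingData ) ){
        return "WARNING: Failed to train pipeline";
    }

    //Test on the held-back samples and report the classification accuracy
    if( !pipeline.test( testData ) ){
        return "WARNING: Failed to test pipeline";
    }

    return "Test accuracy: " + ofToString( pipeline.getTestAccuracy() ) + "%";
}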

Step 3:

  • after you have trained the pipeline, you can now use the pipeline to predict the class of real-time data
  • if the pipeline was trained, it will automatically start to predict the class of real-time data
  • if things have worked OK, you should see that the pipeline estimates that you are performing gesture 1 when you lift both arms above your head, or gesture 2 when you hold your arms down by your side. Note that if you make a movement that is not one of your gestures, the pipeline should predict the special null gesture label of 0 (how reliably this happens depends on how 'big' you made your gestures when recording the training data). For example, if you hold your arms out to the side, the pipeline should not think you are performing any gesture. This is the special NULL GESTURE LABEL, which is output by the classifier when the likelihood of a gesture is too low. See this tutorial for more info: AutomaticGestureSpotting
  • if you find that your gestures are not being recognized very well, then there are four things you can do to improve this:
  1. Increase the null rejection coefficient (look for the line of code: anbc.setNullRejectionCoeff(5); and change 5 to a higher number, for instance 10; a sketch of this change is shown after this list)
  2. Record more training data for the gestures that are not being recognized well
  3. The classifier might not be able to recognize your gestures because the input data you are using (which in this example is the x, y, z coordinates of the left and right hands) does not support the types of gestures you are trying to recognize. For example, if you wanted to detect head-nods or head-shaking, then the coordinates of the left and right hands are not very useful for the classifier; instead you might want to use the x, y, z values of the head or neck. You might also need to take the raw data and compute some more meaningful features from it that help the classifier detect your gestures better. For example, if you wanted to recognize whether the user was waving their hand, then you might want to add some feature extraction that computes the amount of movement of each hand in the x, y, and z directions over a small time window (say the last second)
  4. Finally, another reason that your classifier is not working well could be that you are using the wrong classifier for the task you are trying to solve. For instance, the ANBC algorithm used in this tutorial works very well for classifying static postures using the data from the Kinect; however, if you want to recognize temporal gestures (such as the user making a circle gesture) then other algorithms such as DTW might work much better
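
For suggestion 1, the change is a single line in testApp::setup() (shown in full in the example code below). A sketch of the adjusted classifier setup, using 10 as an example value, is shown here; a larger coefficient loosens the rejection threshold, so genuine gestures are less likely to be rejected as the null gesture (class label 0).

    //Setup the classifier (as in testApp::setup() below), with a larger null rejection coefficient
    ANBC anbc;
    anbc.enableNullRejection(true);
    anbc.setNullRejectionCoeff(10); //was 5; a higher value makes the classifier reject fewer samples
    pipeline.setClassifier( anbc );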

Example Code

testApp.h

 
#pragma once

#include "ofMain.h"
#include "SynapseStreamer.h"

//Include the main openframeworks GRT header
#include "ofGRT.h"

//State that we are using the GRT namespace
using namespace GRT;

#define TIME_SERIES_GRAPH_WIDTH 500
#define DEFAULT_PREP_TIME 5000
#define DEFAULT_RECORD_TIME 5000

class testApp : public ofBaseApp{

public:
    void setup();
    void update();
    void draw();

    void keyPressed  (int key);
    void setupGraphs();

    SynapseStreamer synapseStreamer;
    vector< double > leftHand;
    vector< double > rightHand;

    TimeSeriesGraph leftHandGraph;
    TimeSeriesGraph rightHandGraph;

    string infoText;
    ofTrueTypeFont      font;

    GestureRecognitionPipeline pipeline;
    ClassificationData trainingData;
    TrainingDataRecordingTimer trainingTimer;
    bool trainingModeActive;
    bool predictionModeActive;
    UINT trainingClassLabel;

};

testApp.cpp

 
#include "testApp.h"

//--------------------------------------------------------------
void testApp::setup(){
    //Set the application frame rate to 30 FPS
    ofSetFrameRate( 30 );

    //old OF default is 96 - but this results in fonts looking larger than in other programs.
    ofTrueTypeFont::setGlobalDpi(72);

    //Load the font for the info messages
    font.loadFont("verdana.ttf", 18, true, true);
    font.setLineHeight(18.0f);
    font.setLetterSpacing(1.037);

    //Open the connection with Synapse
    synapseStreamer.openSynapseConnection();

    //Set which joints we want to track
    synapseStreamer.trackAllJoints(false);
    synapseStreamer.trackLeftHand(true);
    synapseStreamer.trackRightHand(true);
    synapseStreamer.computeHandDistFeature(true);

    //Setup the graphs for the input data
    setupGraphs();

    infoText = "";

    //Setup the training data
    trainingData.setNumDimensions( 6 );
    trainingModeActive = false;
    predictionModeActive = false;
    trainingClassLabel = 1;

    //Setup the classifier
    ANBC anbc;
    anbc.enableNullRejection(true);
    anbc.setNullRejectionCoeff(5);
    pipeline.setClassifier( anbc );

}

//--------------------------------------------------------------
void testApp::update(){

    //Parse any new messages from the synapse streamer
    synapseStreamer.parseIncomingMessages();

    if( synapseStreamer.getNewMessage() ){

        //Get the left hand and right hand joints
        leftHand = synapseStreamer.getLeftHandJointBody();
        rightHand = synapseStreamer.getRightHandJointBody();

        //Update the graphs
        leftHandGraph.update( leftHand );
        rightHandGraph.update( rightHand );

        vector< double > inputVector(6);
        inputVector[0] = leftHand[0];
        inputVector[1] = leftHand[1];
        inputVector[2] = leftHand[2];
        inputVector[3] = rightHand[0];
        inputVector[4] = rightHand[1];
        inputVector[5] = rightHand[2];

        if( trainingModeActive ){

            if( trainingTimer.getInRecordingMode() ){
                trainingData.addSample(trainingClassLabel, inputVector);
            }

            if( trainingTimer.getRecordingStopped() ){
                trainingModeActive = false;
            }
        }

        if( pipeline.getTrained() ){
            if( !pipeline.predict(inputVector) ){
                infoText = "Failed to make prediction";
            }
        }
    }

}

//--------------------------------------------------------------
void testApp::draw(){
    ofBackground(0,0,0);

    unsigned int x = 20;
    unsigned int y = 20;
    unsigned int graphWidth = TIME_SERIES_GRAPH_WIDTH;
    unsigned int graphHeight = 100;
    ofRectangle fontBox;
    string text;

    //Draw the timeseries graphs
    leftHandGraph.draw(x,y,graphWidth,graphHeight);    
    ofSetColor(255, 255, 255);
    text = "Left Hand";
    fontBox = font.getStringBoundingBox(text, 0, 0);
    font.drawString(text, x+(graphWidth/2)-(fontBox.width/2), y+10);
    y += graphHeight + 20;

    rightHandGraph.draw(x,y,graphWidth,graphHeight);
    ofSetColor(255, 255, 255);
    text = "Right Hand";
    fontBox = font.getStringBoundingBox(text, 0, 0);
    font.drawString(text, x+(graphWidth/2)-(fontBox.width/2), y+10);
    y += graphHeight + 20;


    int textX = 20;
    int textY = y;
    int textSpacer = 20;

    //Draw the training info
    ofSetColor(255, 255, 255);
    text = "------------------- TrainingInfo -------------------";
    font.drawString(text,textX,textY);

    textY += textSpacer;
    if( trainingModeActive ){
        if( trainingTimer.getInPrepMode() ){
            ofSetColor(255, 200, 0);
            text = "PrepTime: " + ofToString(trainingTimer.getSeconds());
        }
        if( trainingTimer.getInRecordingMode() ){
            ofSetColor(255, 0, 0);
            text = "RecordTime: " + ofToString(trainingTimer.getSeconds());
        }
    }else text = "Not Recording";
    font.drawString(text,textX,textY);

    ofSetColor(255, 255, 255);
    textY += textSpacer;
    text = "TrainingClassLabel: " + ofToString(trainingClassLabel);
    font.drawString(text,textX,textY);

    textY += textSpacer;
    text = "NumTrainingSamples: " + ofToString(trainingData.getNumSamples());
    font.drawString(text,textX,textY);


    //Draw the prediction info
    textY += textSpacer*2;
    text = "------------------- Prediction Info -------------------";
    font.drawString(text,textX,textY);

    textY += textSpacer;
    text =  pipeline.getTrained() ? "Model Trained: YES" : "Model Trained: NO";
    font.drawString(text,textX,textY);

    textY += textSpacer;
    text = "PredictedClassLabel: " + ofToString(pipeline.getPredictedClassLabel());
    font.drawString(text,textX,textY);

    textY += textSpacer;
    text = "Likelihood: " + ofToString(pipeline.getMaximumLikelihood());
    font.drawString(text,textX,textY);


    //Draw the info text
    textY += textSpacer*2;
    text = "InfoText: " + infoText;
    font.drawString(text,textX,textY);


    //Draw the prediction boxes
    double boxX = textX;
    double boxY = textY + (textSpacer*2);
    double boxSize = 50;
    vector< UINT > classLabels = pipeline.getClassLabels();
    for(unsigned int i=0; i<pipeline.getNumClasses(); i++){
        if( pipeline.getPredictedClassLabel() == classLabels[i] ){
            ofSetColor(255,255,0);
            ofFill();
            ofRect(boxX, boxY, boxSize, boxSize);
        }

        ofSetColor(255,255,255);
        ofNoFill();
        ofRect(boxX, boxY, boxSize, boxSize);
        boxX += boxSize + 10;
    }

}

//--------------------------------------------------------------
void testApp::keyPressed(int key){

    infoText = "";

    switch( key ){
        case 'q':
            //Re-open the outgoing connection to Synapse
            synapseStreamer.openOutgoingConnection();
            break;
        case 'r':
            trainingModeActive = !trainingModeActive;
            if( trainingModeActive ){
                trainingTimer.startRecording(DEFAULT_PREP_TIME, DEFAULT_RECORD_TIME);
            }else trainingTimer.stopRecording();
            break;
        case '[':
            if( trainingClassLabel > 1 )
                trainingClassLabel--;
            break;
        case ']':
            trainingClassLabel++;
            break;
        case 't':
            if( pipeline.train( trainingData ) ){
                infoText = "Pipeline Trained";
            }else infoText = "WARNING: Failed to train pipeline";
            break;
        case 's':
            if( trainingData.saveDatasetToFile("TrainingData.txt") ){
                infoText = "Training data saved to file";
            }else infoText = "WARNING: Failed to save training data to file";
            break;
        case 'l':
            if( trainingData.loadDatasetFromFile("TrainingData.txt") ){
                infoText = "Training data saved to file";
            }else infoText = "WARNING: Failed to load training data from file";
            break;
        case 'c':
            trainingData.clear();
            infoText = "Training data cleared";
            break;
        default:
            printf("Key Pressed: %i\n",key);
            break;
    }

}

void testApp::setupGraphs(){

    //Use red, green, and blue for the x, y, and z dimensions of each graph
    vector< ofColor > axisColors(3);
    axisColors[0] = ofColor(255,0,0);
    axisColors[1] = ofColor(0,255,0);
    axisColors[2] = ofColor(0,0,255);

    //Each graph shows a 3-dimensional timeseries (the x, y, z coordinates of one hand)
    leftHandGraph.init(TIME_SERIES_GRAPH_WIDTH, 3);
    leftHandGraph.backgroundColor = ofColor(0,0,0);
    leftHandGraph.gridColor = ofColor(200,200,200,100);
    leftHandGraph.drawGrid = true;
    leftHandGraph.drawInfoText = false;
    leftHandGraph.setRanges( vector<double>(3,0), vector<double>(3,1) );
    leftHandGraph.colors = axisColors;

    rightHandGraph.init(TIME_SERIES_GRAPH_WIDTH, 3);
    rightHandGraph.backgroundColor = ofColor(0,0,0);
    rightHandGraph.gridColor = ofColor(200,200,200,100);
    rightHandGraph.drawGrid = true;
    rightHandGraph.drawInfoText = false;
    rightHandGraph.setRanges( vector<double>(3,0), vector<double>(3,1) );
    rightHandGraph.colors = axisColors;
}
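
main.cpp

The example's main.cpp is not listed above. If you need to recreate it, a standard Openframeworks main.cpp along the lines of the sketch below should work (the window size here is an arbitrary choice, not taken from the example).

#include "ofMain.h"
#include "testApp.h"

//--------------------------------------------------------------
int main(){
    //Setup the OpenGL context and window
    ofSetupOpenGL(1024, 768, OF_WINDOW);

    //Start the application; setup(), update(), and draw() in testApp will now be called
    ofRunApp( new testApp() );
}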

Code & Resources

Compiled Examples

You can download a pre-compiled version of the examples here; this download also includes Synapse.

Source Code

You can also download all the source code to compile this project yourself. Read the setup instructions above to learn how to set up and compile this project.