Object detection

From OTBWiki

The object detection framework

The aim of the new object detection framework is to provide a supervised object detection chain adapted to objects of limited extent with strong shape features: planes, boats, roundabouts, trucks, refugee camp tents, swimming pools, etc. It supports learning from multiple images, and fully supports huge images through streaming and multi-threading.

Please note that this framework is still in development. We would be glad to receive feedback on these tools.

UPDATE: All experiments (source code and dataset) presented here and at IGARSS 2011 are available in the OTB-Applications and OTB-Data repositories. The results are implemented as tests in OTB-Applications.

Building a training data set

The chain is supervised: one has to build a training set with positive examples of the object of interest. This can be done with the Monteverdi Vectorization module, by building a VectorData containing points centered on occurrences of the object of interest.

Now, we also need counter-examples for the object of interest, which we will try to generate automatically. If we assume that the user labeled all the positive examples in the image, then we can simply generate counter-examples by randomly drawing points outside an inhibition radius around the positive examples. However, this has two drawbacks:

  • The user has to select all the positive examples, which is tedious,
  • The randomly generated counter-examples may fall on easy cases (forest, fields, clouds) while missing difficult areas.

To overcome these drawbacks, we decided to add one more VectorData input containing the areas of interest within the image, in which all the positive examples must have been labeled. The counter-examples will thus be randomly generated within the polygons denoting these areas of interest, and not closer than the inhibition radius to the positive examples.
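To illustrate the counter-example drawing described above, here is a minimal sketch in plain Python. It is not the OTB implementation: the function name `generate_counter_examples`, the rectangular bounding box standing in for the area-of-interest polygons, and all parameter values are assumptions for illustration only.

```python
import math
import random

def generate_counter_examples(positives, bbox, radius, count, seed=0):
    """Randomly draw counter-example points inside `bbox` (xmin, ymin, xmax, ymax),
    rejecting any candidate closer than `radius` to a positive example."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bbox
    samples = []
    while len(samples) < count:
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # Inhibition test: keep the point only if it is far from every positive.
        if all(math.hypot(x - px, y - py) >= radius for px, py in positives):
            samples.append((x, y))
    return samples

positives = [(50.0, 50.0), (120.0, 80.0)]
negatives = generate_counter_examples(positives, (0, 0, 200, 200), radius=20.0, count=10)
print(len(negatives))  # 10
```

A real implementation would test containment in the area-of-interest polygons instead of a single bounding box, but the rejection logic around the inhibition radius is the same.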

To conclude, for each image which will be part of the training process, one has to produce two VectorData (shapefiles):

  • The areas of interest,
  • The positive examples, with the constraint of having all positive occurrences labeled within the areas of interest.

Please note that the positive examples in the vector data should have a Class field with a label of 1 or higher (0 is reserved for the counter-examples class, for which you can also give locations). This can be done within the Monteverdi Vectorization module, or alternatively using the otbVectorDataSetField tool:

otbVectorDataSetField -in vectordata_file -out vectordata_file -fn "Class" -fv 1

Training the object detector

The object detector will perform SVM classification based on the following features, computed over a sliding window with a radius of 10:

  • Mean, standard deviation, skewness and kurtosis of each band,
  • Local Flusser moments on the intensity,
  • Coefficients of the local Fourier-Mellin transform,
  • Local Haralick textures.
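The first feature group above (mean, standard deviation, skewness, kurtosis) can be sketched for one band's window as follows. This is a plain-Python illustration of the standard population moment formulas, not the OTB feature code; the function name `band_moments` is an assumption.

```python
import math

def band_moments(values):
    """Mean, standard deviation, skewness and kurtosis of one band's pixel
    values inside the sliding window (population formulas)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n
    return mean, std, skew, kurt

mean, std, skew, kurt = band_moments([1.0, 2.0, 3.0, 4.0, 5.0])
print(round(mean, 3), round(std, 3))  # 3.0 1.414
```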

Feature statistics

In order to make these various features comparable, the first step is to estimate the feature statistics. These statistics will be used to center and reduce the features (mean of 0, standard deviation of 1). To do so, the otbEstimateFeaturesStatistics tool can be used:

otbEstimateFeaturesStatistics -in list_of_input_images -out statistics.xml

This tool will build a random grid of locations on each image, compute the features and estimate the statistics from the whole set, exporting it to an XML file.

The features statistics XML file will be an input of the following tools.
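The centering and reduction performed from these statistics amounts to standard feature normalization. Below is a minimal Python sketch of that step; the function names and the plain-list representation of samples are assumptions, since OTB applies this internally from the XML statistics file.

```python
import math

def estimate_statistics(samples):
    """Per-feature mean and standard deviation over a list of feature vectors."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(dim)]
    stds = [math.sqrt(sum((s[j] - means[j]) ** 2 for s in samples) / n)
            for j in range(dim)]
    return means, stds

def center_reduce(sample, means, stds):
    """Shift each feature to zero mean and scale it to unit standard deviation."""
    return [(v - m) / s for v, m, s in zip(sample, means, stds)]

samples = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
means, stds = estimate_statistics(samples)
print([round(v, 3) for v in center_reduce(samples[0], means, stds)])
```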


Once the feature statistics have been estimated, the learning scheme is the following:

  1. For each of the input images
    1. Read the areas of interest shapefile,
    2. Read the positive examples shapefile,
    3. Generate random counter-examples within the areas of interest,
    4. Compute features values (samples) on each example and counter example,
    5. Add the vectors to the training samples set,
  2. Center and reduce each sample using statistics from the XML statistics file,
  3. Increase the size of the training samples set and balance it by generating new noisy samples from the previous ones,
  4. Train an SVM with this training set (linear kernel),
  5. Write the SVM model,
  6. Estimate performances of the SVM classifier on the training samples set (confusion matrix, precision, recall).
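Step 3 above (enlarging and balancing the training set with noisy copies) can be sketched as follows. This is an assumption about the general technique, not the OTB code: the function name `augment_with_noise`, the Gaussian jitter, and the parameter values are illustrative only.

```python
import random

def augment_with_noise(samples, factor, sigma, seed=0):
    """Enlarge a training set by adding `factor` noisy copies of each
    (already centered/reduced) sample; `sigma` controls the Gaussian jitter."""
    rng = random.Random(seed)
    augmented = list(samples)
    for s in samples:
        for _ in range(factor):
            augmented.append([v + rng.gauss(0.0, sigma) for v in s])
    return augmented

base = [[0.0, 1.0], [1.0, 0.0]]
bigger = augment_with_noise(base, factor=3, sigma=0.1)
print(len(bigger))  # 2 originals + 2 * 3 noisy copies = 8
```

To balance classes rather than just enlarge the set, the under-represented class would simply receive a larger `factor`.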

These steps can be performed by the otbTrainObjectDetector tool using the following command:

otbTrainObjectDetector -fs statistics.xml -in list_of_input_images -vd list_of_positive_examples_shapefiles -wa list_of_areas_of_interest_shapefiles -out model.svm

Using the object detector

Once the detector has been trained, one can apply the model to detect objects on a new image using the otbObjectDetector tool:

 otbObjectDetector -fs statistics.xml -svm model.svm -in input_image -out vectordata_of_detected_objects

The detector is applied on a regular grid with a step of 10 pixels.
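The set of candidate locations the detector evaluates can be sketched as a simple grid enumeration (a plain-Python illustration; the function name `detection_grid` is an assumption):

```python
def detection_grid(width, height, step=10):
    """Candidate (x, y) locations on a regular grid with the given step,
    mirroring how the detector scans the model across the image."""
    return [(x, y) for y in range(0, height, step) for x in range(0, width, step)]

grid = detection_grid(100, 50, step=10)
print(len(grid))  # 10 columns * 5 rows = 50
```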

Object detection chain based on Histograms of Oriented Gradients (HOG)

  • Example: otbTrainHOGObjectDetector-cli -in myInputImage -vd myInputVectorData -wa myInputAreaVectorData -out myOutputSVMModel -r 15