Request for Changes-85: Add Unsupervised classification applications


Status

  • Author: Ludovic Hussonnois
  • Submitted on 27.02.2017
  • Proposed target release 6.0
  • Adopted : +3 from Victor, Julien, Guillaume
  • Unsupervised Classification

Summary

Introduces a new Unsupervised module and uses it in the TrainVectorClassifier and TrainImagesClassifier applications.

Rationale

This adds a distinction between supervised and unsupervised algorithms, since the training step of an unsupervised algorithm differs from the supervised one: it does not need training data. Furthermore, adding a separate module for unsupervised classification helps limit the number of classification algorithms held in the Supervised module.

Implementation details

Classes and files

Some common headers have been moved into the LearningBase module, since they are used by both the Supervised and Unsupervised modules.

A       Modules/Learning/LearningBase/include/otbMachineLearningModel.h
A       Modules/Learning/LearningBase/include/otbMachineLearningModel.txx
A       Modules/Learning/LearningBase/include/otbMachineLearningModelFactoryBase.h
A       Modules/Learning/LearningBase/include/otbSharkUtils.h

D       Modules/Learning/Supervised/include/otbMachineLearningModel.h
D       Modules/Learning/Supervised/include/otbMachineLearningModel.txx
D       Modules/Learning/Supervised/include/otbMachineLearningModelFactoryBase.h
D       Modules/Learning/Supervised/include/otbSharkUtils.h

Update otbMachineLearningModelFactory to use SharkKMeansMachineLearningModel.

M       Modules/Learning/Supervised/include/otbMachineLearningModelFactory.txx

Added a new otbSharkKMeansMachineLearningModel and its factory for clustering.

A       Modules/Learning/Unsupervised/include/otbSharkKMeansMachineLearningModel.h
A       Modules/Learning/Unsupervised/include/otbSharkKMeansMachineLearningModel.txx
A       Modules/Learning/Unsupervised/include/otbSharkKMeansMachineLearningModelFactory.h
A       Modules/Learning/Unsupervised/include/otbSharkKMeansMachineLearningModelFactory.txx

Correct the CanRead() function of the KNN model, since it accepted almost any other model file.

M       Modules/Learning/Supervised/include/otbKNearestNeighborsMachineLearningModel.txx

Correct the CanRead() function of the Shark RF model; see Additional notes for more details.

M       Modules/Learning/Supervised/include/otbSharkRandomForestsMachineLearningModel.txx

Finally, update cmake.

M       Modules/Learning/LearningBase/otb-module.cmake
M       Modules/Learning/Supervised/otb-module.cmake
A       Modules/Learning/Unsupervised/otb-module.cmake
A       Modules/Learning/Unsupervised/CMakeLists.txt


Applications

The otbTrainVectorClassifier and otbTrainImagesClassifier applications are updated.

The behavior of these applications stays the same. One vector data per image is needed.

TrainImages Changes

New parameters :

        -classifier.sharkkm.maxiter <int32>          Maximum number of iterations for the k-means algorithm.  (mandatory, default value is 10)
        -classifier.sharkkm.k       <int32>          The number of classes used for the k-means algorithm.  (mandatory, default value is 2)
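
To make the role of the two parameters concrete, here is a minimal sketch of plain Lloyd-style k-means in standalone C++. It is not the Shark implementation used by the application: the function names are hypothetical, and seeding the centroids with the first k samples is an assumption made to keep the example deterministic. The arguments k and maxIter stand in for classifier.sharkkm.k and classifier.sharkkm.maxiter.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Sample = std::vector<double>;

// Squared Euclidean distance between two samples of equal dimension.
double SquaredDistance(const Sample& a, const Sample& b)
{
  double d = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    const double diff = a[i] - b[i];
    d += diff * diff;
  }
  return d;
}

// Index of the centroid closest to s.
std::size_t NearestCentroid(const Sample& s, const std::vector<Sample>& centroids)
{
  std::size_t best = 0;
  double bestDist = std::numeric_limits<double>::max();
  for (std::size_t c = 0; c < centroids.size(); ++c)
  {
    const double d = SquaredDistance(s, centroids[c]);
    if (d < bestDist)
    {
      bestDist = d;
      best = c;
    }
  }
  return best;
}

// Hypothetical helper: at most maxIter assignment/update rounds over k centroids.
// Assumption: centroids are seeded with the first k samples.
std::vector<Sample> KMeans(const std::vector<Sample>& samples, std::size_t k, std::size_t maxIter)
{
  assert(k > 0 && samples.size() >= k);
  std::vector<Sample> centroids(samples.begin(),
                                samples.begin() + static_cast<std::ptrdiff_t>(k));
  std::vector<std::size_t> labels(samples.size(), k); // k means "not assigned yet"
  const std::size_t dim = samples[0].size();

  for (std::size_t iter = 0; iter < maxIter; ++iter)
  {
    // Assignment step: attach every sample to its nearest centroid.
    bool changed = false;
    for (std::size_t i = 0; i < samples.size(); ++i)
    {
      const std::size_t c = NearestCentroid(samples[i], centroids);
      if (c != labels[i])
      {
        labels[i] = c;
        changed = true;
      }
    }
    if (!changed)
      break; // converged before reaching maxIter

    // Update step: move each centroid to the mean of its samples.
    std::vector<Sample> sums(k, Sample(dim, 0.0));
    std::vector<std::size_t> counts(k, 0);
    for (std::size_t i = 0; i < samples.size(); ++i)
    {
      for (std::size_t d = 0; d < dim; ++d)
        sums[labels[i]][d] += samples[i][d];
      ++counts[labels[i]];
    }
    for (std::size_t c = 0; c < k; ++c)
    {
      if (counts[c] == 0)
        continue; // keep an empty cluster's centroid unchanged
      for (std::size_t d = 0; d < dim; ++d)
        centroids[c][d] = sums[c][d] / static_cast<double>(counts[c]);
    }
  }
  return centroids;
}
```

A larger maxIter only matters when the assignments have not yet stabilized; with maxIter set to 0 this sketch simply returns the seeds.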

Diffs

M       Modules/Applications/AppClassification/app/otbTrainImagesClassifier.cxx
M       Modules/Applications/AppClassification/app/otbTrainVectorClassifier.cxx


otbLearningApplicationBase initializes the unsupervised classifier and can provide the classifier category (supervised/unsupervised).

M       Modules/Applications/AppClassification/include/otbLearningApplicationBase.h
M       Modules/Applications/AppClassification/include/otbLearningApplicationBase.txx
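
The category lookup described above could look roughly like the following sketch. The enum, the function name and its signature are hypothetical and are not taken from otbLearningApplicationBase; only the "sharkkm" key appears on this page.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical category type; the real OTB declaration may differ.
enum class ClassifierCategory
{
  Supervised,
  Unsupervised
};

// Hypothetical helper: a classifier key is unsupervised if it belongs to the
// given set of unsupervised keys, and supervised otherwise.
ClassifierCategory GetClassifierCategory(const std::string& key,
                                         const std::set<std::string>& unsupervisedKeys)
{
  assert(!key.empty());
  return unsupervisedKeys.count(key) > 0 ? ClassifierCategory::Unsupervised
                                         : ClassifierCategory::Supervised;
}
```

A training application can then branch on the returned category, for example to skip steps that only make sense when labelled training data is available.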

Base classes have been added to provide the specific initialization, connection and execution of the composite applications.

A       Modules/Applications/AppClassification/include/otbTrainImagesBase.h
A       Modules/Applications/AppClassification/include/otbTrainImagesBase.txx
A       Modules/Applications/AppClassification/include/otbTrainVectorBase.h
A       Modules/Applications/AppClassification/include/otbTrainVectorBase.txx

Add a file to perform classification with Shark k-means; it is needed to provide the Train and Init functions for otbLearningApplicationBase.

A       Modules/Applications/AppClassification/include/otbTrainSharkKMeans.txx

Update cmake.

M       Modules/Applications/AppClassification/otb-module.cmake
M       Modules/Applications/AppClassification/app/CMakeLists.txt

Tests

Added new tests for Shark KMeans clustering, mirroring the supervised tests.

A       Modules/Learning/Unsupervised/test/CMakeLists.txt
A       Modules/Learning/Unsupervised/test/otbMachineLearningClusteringModelCanRead.cxx
A       Modules/Learning/Unsupervised/test/otbTrainMachineLearningClusteringModel.cxx
A       Modules/Learning/Unsupervised/test/otbUnsupervisedTestDriver.cxx
A       Modules/Learning/Unsupervised/test/tests-shark.cmake
M       Modules/Applications/AppClassification/test/CMakeLists.txt

Additional notes

Shark model serialization compatibility

An issue exists with Shark serialization in the RelWithDebInfo build configuration on hulk. A boost::archive::archive_exception is thrown if a model cannot be read (which is what we want), but in this specific configuration the exception cannot be caught and a segmentation fault occurs. As a workaround, the model name is simply checked before deserialization, but this breaks direct compatibility with Shark models (i.e. reading a model generated by OTB in Shark code, or reading in OTB a model written by Shark code).
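
The workaround can be pictured with the following minimal sketch, which uses only the standard library. The tag string and both function names are hypothetical (this page does not give the exact text OTB writes); the point is only that the check reads one line and never enters the deserializer, so no exception has to be caught.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical tag; the actual model name written by OTB is not given here.
const std::string kModelTag = "#HypotheticalSharkKMeansTag";

// Write the tag as the first line, followed by the serialized payload.
void SaveWithTag(std::ostream& os, const std::string& tag, const std::string& payload)
{
  os << tag << '\n' << payload;
}

// Return true only if the first line of the stream equals the expected tag.
// No deserialization is attempted, so nothing can throw from the archive.
bool CanReadTagged(std::istream& is, const std::string& tag)
{
  assert(!tag.empty());
  std::string firstLine;
  if (!std::getline(is, firstLine))
    return false; // empty or unreadable stream
  return firstLine == tag;
}
```

The extra first line is also what breaks the direct exchange of model files with plain Shark code: a file written this way no longer starts with the archive itself, and a file written by Shark alone lacks the tag.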