Request for Comments-20: New sampling module for the classification framework

Status

  • Author: Victor Poughon
  • Submitted on 2016/01/19
  • Open for comments

Content

What changes will be made and why would they make a better Orfeo ToolBox?

This RFC introduces a new OTB module that provides a framework for selecting and extracting the samples used to train classification models. It continues the work done by Paul Gely: https://github.com/PaulGely/AppSampling

The objective is to develop modular, reusable filters and applications that support different sampling strategies:

  • Exhaustive
  • Random
  • Periodic
  • Periodic with randomness
  • Stratified
  • (Possibly more)
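
To make the strategies listed above concrete, here is a minimal Python sketch that selects pixel indices from a flat range of candidates. It is an illustration under stated assumptions, not the proposed OTB implementation: every function name and parameter below is hypothetical, and "stratified" is read here as drawing a fixed number of samples from each class.

```python
import random


def exhaustive(n_pixels):
    """Select every candidate pixel index."""
    return list(range(n_pixels))


def random_sampling(n_pixels, n_samples, seed=0):
    """Select n_samples distinct indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_pixels), n_samples))


def periodic(n_pixels, n_samples):
    """Select the first index of each of n_samples consecutive periods."""
    return [i * n_pixels // n_samples for i in range(n_samples)]


def periodic_with_jitter(n_pixels, n_samples, seed=0):
    """Select one index at a random position inside each period.

    Assumes n_samples <= n_pixels so that no period is empty.
    """
    rng = random.Random(seed)
    selected = []
    for i in range(n_samples):
        start = i * n_pixels // n_samples
        end = (i + 1) * n_pixels // n_samples
        selected.append(rng.randrange(start, end))
    return selected


def stratified(labels, n_per_class, seed=0):
    """Select up to n_per_class random indices from each class (assumed reading)."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    selected = []
    for label in sorted(by_class):
        pool = by_class[label]
        selected.extend(rng.sample(pool, min(n_per_class, len(pool))))
    return sorted(selected)
```

For example, `periodic(10, 5)` returns `[0, 2, 4, 6, 8]`, while `periodic_with_jitter(10, 5)` returns one index from each of the same five periods.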

MaskedIteratorDecorator

Decorates an existing iterator so that it keeps the same behavior but skips masked pixels. It is developed for use in PolygonClassStatisticsFilter, but is reusable elsewhere.
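
The decorator idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not the OTB C++ class: the name `MaskedIterator` is hypothetical, and the sketch assumes the pixel and mask sequences are walked in lockstep and that a mask value of 0 marks a pixel to skip.

```python
class MaskedIterator:
    """Wrap a pixel iterable and yield only pixels whose mask value is non-zero."""

    def __init__(self, pixels, mask):
        # Advance the pixel and mask iterators together.
        self._pairs = zip(pixels, mask)

    def __iter__(self):
        return self

    def __next__(self):
        # Skip forward until an unmasked pixel is found.
        for pixel, mask_value in self._pairs:
            if mask_value != 0:
                return pixel
        raise StopIteration


# Pixels 20 and 40 are masked out (mask value 0) and never reach the caller.
visible = list(MaskedIterator([10, 20, 30, 40], [1, 0, 2, 0]))
```

Code that consumes the wrapped iterator does not need to know about the mask, which is what makes the decorator reusable outside the statistics filter.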

PolygonClassStatisticsFilter

Input: Image metadata, shapefile, Mask (optional)

Output: Class Statistics (xml format)

This filter computes statistics over the labelled classes using a persistent filter. It does not need to load the image content, only its metadata. A mask is an optional input to this filter: statistics are computed only where the mask is valid (value != 0), which makes it possible to account for no-data or other masks.
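
As a rough illustration of the statistics step, the sketch below counts labelled pixels per class and honours an optional mask. It makes several assumptions that are not part of this RFC: the polygons are taken to be already rasterized into a flat list of labels (with `None` outside every polygon), the only statistic computed is a per-class pixel count, and the XML element and attribute names are hypothetical placeholders rather than the actual output schema.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def class_statistics(labels, mask=None):
    """Count labelled pixels per class, ignoring pixels where the mask is 0."""
    counts = Counter()
    for index, label in enumerate(labels):
        if label is None:
            continue  # pixel lies outside every polygon
        if mask is not None and mask[index] == 0:
            continue  # pixel is masked out (invalid)
        counts[label] += 1
    return counts


def to_xml(counts):
    """Serialize the counts using a hypothetical XML layout."""
    root = ET.Element("ClassStatistics")
    for label in sorted(counts):
        ET.SubElement(root, "Class", name=str(label), count=str(counts[label]))
    return ET.tostring(root, encoding="unicode")


counts = class_statistics([1, 1, 2, None, 2, 2], mask=[1, 0, 1, 1, 1, 0])
xml_text = to_xml(counts)
```

In this example, one pixel of class 1 and two pixels of class 2 remain after masking.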

SampleSelectionFilter

Input: Image metadata, class statistics, sampling strategy parameters, shapefile

Output: Sample list (OGR GDAL format)

SampleExtractionFilter

Input: Sample list, Image

Output: Samples (libSVM format, maybe OGR)

The SampleExtractionFilter could use an update mode on the input OGR sample list (adding the pixel value as a field).
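
The two output options can be sketched as follows. This is a hedged illustration, not the proposed filter: the image is modelled as nested Python lists of band tuples, samples as dictionaries with hypothetical `row`, `col` and `label` keys, and the `band_N` field names are invented for the example. The libSVM line layout (`label index:value ...`, with indices starting at 1) follows the usual libSVM text format.

```python
def extract_in_place(image, samples):
    """Update mode: add each band value of the sampled pixel as a new field."""
    for sample in samples:
        pixel = image[sample["row"]][sample["col"]]
        for band, value in enumerate(pixel):
            sample[f"band_{band}"] = value  # hypothetical field name
    return samples


def to_libsvm(label, features):
    """Format one sample as a libSVM line with 1-based feature indices."""
    pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    return f"{label} {pairs}"


# A 2x2 image with two bands per pixel.
image = [[(1, 2), (3, 4)],
         [(5, 6), (7, 8)]]
samples = [{"row": 1, "col": 0, "label": 2}]

extract_in_place(image, samples)
line = to_libsvm(samples[0]["label"], [samples[0]["band_0"], samples[0]["band_1"]])
```

Updating the sample list in place keeps the geometry and the extracted values in a single file, whereas the libSVM line drops the position and keeps only the label and features.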

Applications

New applications to be developed:

  • PolygonClassStatistics: Exposes PolygonClassStatisticsFilter
  • SampleSelection: Exposes SampleSelectionFilter
  • SampleExtraction: Exposes SampleExtractionFilter
  • ImageSampling: Exposes the complete sampling pipeline
  • ConvertSampleFile: Convert a sample file to another format (OGR, libSVM, CSV).

Existing classification applications:

  • Retrofit to use the new sampling module, keep the same user interface.

Perspectives

This architecture should support extensions for object-based sampling and distributed computing.

When will those changes be available (target release or date)?

Target release is 5.4.

Who will be developing the proposed changes?

TBD

Community

Comments

Support

Corresponding Requests for Changes