Request for Comments-34: Faster BandMathX

From OTBWiki
Jump to: navigation, search

[Request for Comments - 34] Faster BandMathX


  • Author: Guillaume Pasero
  • Submitted on 01.07.2016
  • Open for comments


What changes will be made and why they would make a better Orfeo ToolBox?

This is a review of different options to make the BandMathX application faster. At the moment, it is terribly slow compared to BandMath. Even if the author of MuParser and MuParserX has shown that those 2 parsers have different performance (see [1]). This is a problem as a lot of users could use the BandMathX for fast prototyping. Improving the performance of the otb::BandMathXImageFilter would make it usable on images larger than 256x256... In MuParserX API, there is no method to do evalutations on a whole buffer. The HowTo for this API is rather simple :

  • create a parser
  • define variables, functions, and constants,
  • set the expression
  • then call Eval() each time the variables are modified.

Let's see the ThreadedGenerateData of otb::BandMathXImageFilter. With a few simplifications, here is the global structure :

Source Code

<syntaxhighlight lang="cpp">

 ValueType value;
 unsigned int nbInputImages = this->GetNumberOfInputs();
 //----------------- --------- -----------------//
 //----------------- Iterators -----------------//
 //----------------- --------- -----------------//
 typedef itk::ImageRegionConstIterator<TImage> ImageRegionConstIteratorType;
 std::vector< ImageRegionConstIteratorType > Vit;
 for(unsigned int j=0; j < nbInputImages; ++j)
   Vit[j] = ImageRegionConstIteratorType (this->GetNthInput(j), outputRegionForThread);  
 std::vector< ImageRegionConstIteratorType > VoutIt;
 for(unsigned int j=0; j < VoutIt.size(); ++j)
   VoutIt[j] = ImageRegionConstIteratorType (this->GetOutput(j), outputRegionForThread); 
 //Special case : neighborhoods
 std::vector< itk::ConstNeighborhoodIterator<TImage> > VNit;
 for(unsigned int j=0; j<m_VVarName.size(); ++j)
   // [...]
 // Support progress methods/callbacks
 itk::ProgressReporter progress(this, threadId, outputRegionForThread.GetNumberOfPixels());
 for(unsigned int j=0; j < nbInputImages; ++j)       {  Vit[j].GoToBegin();     }
 for(unsigned int j=0; j < m_Expression.size(); ++j) {  VoutIt[j].GoToBegin();  }
 for(unsigned int j=0; j < VNit.size(); ++j)         {  VNit[j].GoToBegin();    }
 //--------------------- --------------- -------------------//
 //--------------------- Start main loop -------------------//
 //--------------------- --------------- -------------------//
 while(!Vit[0].IsAtEnd()) // For each pixel
   //----------------- --------------------- -----------------//
   //----------------- Variable affectations -----------------//
   //----------------- --------------------- -----------------//
   int ngbhNameIndex=0; int index;
   for(unsigned int j=0; j < m_AImage[threadId].size(); ++j) // For each variable, perform a copy
     switch (m_AImage[threadId][j].type)
       case 0 : //idxX
         m_AImage[threadId][j].value = static_cast<double>(Vit[0].GetIndex()[0]);
       case 1 : //idxY
         m_AImage[threadId][j].value = static_cast<double>(Vit[0].GetIndex()[1]);
       case 2 : //Spacing X (imiPhyX)
         //Nothing to do (already set inside BeforeThreadedGenerateData)"
       case 3 : //Spacing Y (imiPhyY)
         //Nothing to do (already set inside BeforeThreadedGenerateData)"
       case 4 : //vector
         // m_AImage[threadId][j].info[0] : Input image #ID
         for(int p=0; p < m_AImage[threadId][j].value.GetCols(); ++p)
           m_AImage[threadId][j].value.At(0,p) = Vit[m_AImage[threadId][j].info[0]].Get()[p];
       case 5 : //pixel
         // m_AImage[threadId][j].info[0] : Input image #ID
         // m_AImage[threadId][j].info[1] : Band #ID
         m_AImage[threadId][j].value = Vit[m_AImage[threadId][j].info[0]].Get()[m_AImage[threadId][j].info[1]];
       case 6 : //neighborhood    
       // m_AImage[threadId][j].info[1] : Band #ID
       // [...]
       // [...]
       default :
         itkExceptionMacro(<< "Type of the variable is unknown");
   //----------------- ----------- -----------------//
   //----------------- Evaluations -----------------//
   //----------------- ----------- -----------------//
   for(unsigned int IDExpression=0; IDExpression<m_Expression.size(); ++IDExpression)
     value = m_VParser[threadId]->Eval();
     // Depending on output type ...
     switch (value.GetType())
       {   //ValueType
       case 'i':
         VoutIt[IDExpression].Get()[0] = value.GetInteger();
       case 'f':
         VoutIt[IDExpression].Get()[0] = value.GetFloat();
       case 'c':
         itkExceptionMacro(<< "Complex numbers are not supported." << std::endl);
       case 'm':
         mup::matrix_type vect = value.GetArray();
         if ( vect.GetRows() == 1 ) //Vector
           for(int p=0; p<vect.GetCols(); ++p)
             VoutIt[IDExpression].Get()[p] = vect.At(0,p).GetFloat();
         else //Matrix
           itkExceptionMacro(<< "Result of the evaluation can't be a matrix." << std::endl);

       //----------------- Pixel affectations -----------------//
       for(unsigned int p=0; p<VoutIt[IDExpression].Get().GetSize(); ++p)
           // Case value is equal to -inf or inferior to the minimum value
           // allowed by the PixelValueType cast
           // [...]
           // Case value is equal to inf or superior to the maximum value
           // allowed by the PixelValueType cast
           // [...]
   for(unsigned int j=0; j < nbInputImages; ++j)        {   ++Vit[j];    }
   for(unsigned int j=0; j < m_Expression.size(); ++j)  {   ++VoutIt[j]; }
   for(unsigned int j=0; j < VNit.size(); ++j)          {   ++VNit[j];   }
 } //End while


I have spotted 4 items that could bring better performances :

1 - Only 1 call to SetExpr() per parser

As you can see ([2]), the SetExpr() is called on each parser for each PIXEL and each output expression. The expression should be set only once on each parser before the GenerateData(). This method triggers a parsing of the expression so it is not a simple member affectation.

2 - Avoid copy into variables

See [3]. For each pixel position, the data has to be copied from ITK iterators to the variables used by MuParser ( m_AImage ). If we could define a new variable class IteratorVariable, deriving mp::Variable, that would wrap an ITK iterator, there would be no need of this copy. The evaluation of the IteratorVariable would directly call the corresponding Get() of the wrapped iterator.

3 - Only define necessary variables and functions already implemented

The parsers are initialized with the full list of known functions and variables. If we could only define those that are used in the expression, we could ease the work for mp::Parser::Eval(). I assume that the more functions and variables are defined, the longer it takes to perform the evaluation (to be checked in MuParserX sources).

4 - Assume a constant returned type per expression

See [4]. For each pixel position, there is switch/case on the value type returned by the parser. If we assume a constant returned type per expression, we can put this switch/case outside the main pixel iteration loop.

When will those changes be available (target release or date)?

No immediate availability, as this filter is quite complex.

Who will be developing the proposed changes?

Anyone willing to do it.


==== Comments ==== .


Corresponding Requests for Changes