Request for changes-53: Connect applications in memory


Status

Summary

These changes enable in-memory connections between the input and output image parameters of applications. Writing the output of one application to disk and reading it back in the next one is therefore no longer necessary.

Connections currently available:

  • 1 OutputImageParameter -> 1 InputImageParameter
  • 1 ComplexOutputImageParameter -> 1 ComplexInputImageParameter
  • N OutputImageParameter -> 1 InputImageListParameter

Rationale

Applications are often used as parts of larger processing chains. Chaining applications currently requires writing images to disk and reading them back between applications, which results in heavy I/O and a significant amount of time spent writing temporary files. This request for changes makes it possible to connect the internal pipelines of applications together, so that the ITK streaming capability is preserved across several chained applications. Only the last application(s) of the chain is (are) responsible for writing the final data.

Implementation details

Classes and files

In ApplicationEngine module
Handling (Complex)(Input/Output)ImageParameter connections

The capability for in-memory connections was already available in InputImageParameter and OutputImageParameter. The only change required for those was to expose the appropriate methods in the Application class:

void SetParameterInputImage(std::string parameter, InputImageParameter::ImageBaseType * inputImage);
OutputImageParameter::ImageBaseType * GetParameterOutputImage(std::string parameter);
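As a rough illustration of how these two methods are meant to be combined, the sketch below chains two toy applications through a pointer instead of a temporary file. ToyImage, ToyApplication, Execute(int) and RunToyChain are hypothetical stand-ins written for this example, not the OTB classes; only the two method names mirror the signatures above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an image held in memory (not an OTB type).
struct ToyImage
{
  int value = 0;
};

// Hypothetical stand-in for an application exposing the two methods above.
class ToyApplication
{
public:
  // Connect an in-memory image to the named input parameter.
  void SetParameterInputImage(std::string parameter, ToyImage* inputImage)
  {
    assert(inputImage != nullptr);
    m_Inputs[parameter] = inputImage;
  }

  // Return the in-memory image held for the named output parameter.
  ToyImage* GetParameterOutputImage(std::string parameter)
  {
    return &m_Outputs[parameter];
  }

  // Toy processing step: read "in" (0 if unconnected), add an offset,
  // and store the result in "out".
  void Execute(int offset)
  {
    auto it = m_Inputs.find("in");
    const int base = (it != m_Inputs.end()) ? it->second->value : 0;
    m_Outputs["out"].value = base + offset;
  }

private:
  std::map<std::string, ToyImage*> m_Inputs;
  std::map<std::string, ToyImage>  m_Outputs;
};

// Chain two toy applications without any intermediate file.
int RunToyChain()
{
  ToyApplication first;
  ToyApplication second;

  first.Execute(1); // "out" of first now holds 1
  second.SetParameterInputImage("in", first.GetParameterOutputImage("out"));
  second.Execute(10); // reads the output of first directly from memory

  return second.GetParameterOutputImage("out")->value;
}
```

Here the second application receives a pointer to the output of the first one, so nothing is written between the two steps. In a real chain, only the last application would write the final data.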

For ComplexInputImageParameter and ComplexOutputImageParameter, the situation was quite similar, but I also had to move the

template <class TComplexInputImage>
void
ComplexInputImageParameter::SetImage(TComplexInputImage* image)
{
  m_UseFilename = false;
  m_Image = image;
}

from compiled sources to header.
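The reason for such a move is presumably the usual C++ rule for templates: a member template defined only in a compiled source file can be instantiated only for the types used in that file, so other translation units calling it would fail to link. A minimal sketch of the header-only layout, using hypothetical toy types (ToyImageBase, ToyComplexImage, ToyComplexInputParameter) rather than the OTB classes:

```cpp
#include <cassert>

// Hypothetical stand-ins for this example (not OTB types).
struct ToyImageBase
{
  virtual ~ToyImageBase() = default;
};

struct ToyComplexImage : ToyImageBase
{
};

class ToyComplexInputParameter
{
public:
  // The member template is declared in the class...
  template <class TComplexInputImage>
  void SetImage(TComplexInputImage* image);

  bool UsesFilename() const { return m_UseFilename; }
  ToyImageBase* GetImage() const { return m_Image; }

private:
  bool          m_UseFilename = true;
  ToyImageBase* m_Image = nullptr;
};

// ...and defined in the header too, so that every translation unit
// including it can instantiate SetImage for its own image type.
template <class TComplexInputImage>
void ToyComplexInputParameter::SetImage(TComplexInputImage* image)
{
  assert(image != nullptr);
  m_UseFilename = false; // the image now comes from memory, not from a file
  m_Image = image;
}

// Small check: after SetImage, the parameter no longer relies on a filename.
bool ToySetImageDemo()
{
  static ToyComplexImage image;
  ToyComplexInputParameter parameter;
  parameter.SetImage(&image);
  return !parameter.UsesFilename() && parameter.GetImage() == &image;
}
```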

The following methods were added to the Application class:

void SetParameterComplexInputImage(std::string parameter, ComplexInputImageParameter::ImageBaseType * inputImage);
ComplexOutputImageParameter::ImageBaseType * GetParameterComplexOutputImage(std::string parameter);

Handling connections with InputImageListParameter

Connections between OutputImageParameter and InputImageListParameter were trickier to write. The problem is that OutputImageParameter returns an 'ImageBase<2> *' (because the actual image type varies depending on the application implementation). InputImageParameter accepts an 'ImageBase<2> *' and casts it to the type required by the application implementation through heavy template and macro magic (which explains why this class is so slow to build).

InputImageListParameter, on the other hand, had a simpler implementation: all images are FloatVectorImageType, and no casting is done to support other input types. This was fine as long as the images were read inside the InputImageListParameter class. Supporting in-memory connections with input image types other than FloatVectorImageType would have required the same kind of template and macro magic, with even more complexity, since each image of the list might be of a different type and cast by a different filter. I therefore did not implement this solution.

Instead, I rewrote the internal implementation of InputImageListParameter (while preserving the public API) so that it acts as a vector of InputImageParameter. This way, we benefit from all the casting magic available in InputImageParameter (and also avoid a lot of code duplication in the image reading part). The diff for this part of the RFC is available here for review:

https://git.orfeo-toolbox.org/otb.git/commitdiff/eb4e2c74aae9b7ff96fa1220c1855e7d25e1377d

As far as I have tested, this works well.
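The idea of a list parameter that delegates to a vector of single-image parameters can be sketched as follows. This is a toy model under stated assumptions, not the code from the commit above: ToyInputImageParameter, ToyInputImageListParameter and their methods are hypothetical names, a single float stands in for an image, and a static_cast stands in for the casting filters.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical single-image parameter: accepts any arithmetic "pixel"
// type and converts it to float (stand-in for the casting magic).
class ToyInputImageParameter
{
public:
  template <class TPixel>
  void SetImage(TPixel pixel)
  {
    m_Value = static_cast<float>(pixel);
  }

  float GetFloatImage() const { return m_Value; }

private:
  float m_Value = 0.0f;
};

// Hypothetical list parameter: it owns one single-image parameter per
// entry and delegates the conversion to it, so each entry may have a
// different input type.
class ToyInputImageListParameter
{
public:
  template <class TPixel>
  void AddImage(TPixel pixel)
  {
    auto parameter = std::make_shared<ToyInputImageParameter>();
    parameter->SetImage(pixel);
    m_Parameters.push_back(parameter);
  }

  std::size_t Size() const { return m_Parameters.size(); }

  float GetNthFloatImage(std::size_t index) const
  {
    assert(index < m_Parameters.size());
    return m_Parameters[index]->GetFloatImage();
  }

private:
  std::vector<std::shared_ptr<ToyInputImageParameter> > m_Parameters;
};

// Fill the list with three different input types and sum the float views.
float ToyListDemo()
{
  ToyInputImageListParameter list;
  list.AddImage(static_cast<unsigned char>(3));
  list.AddImage(2.5); // double
  list.AddImage(4);   // int

  float sum = 0.0f;
  for (std::size_t i = 0; i < list.Size(); ++i)
  {
    sum += list.GetNthFloatImage(i);
  }
  return sum;
}
```

The list class contains no conversion logic of its own, which is the point of the design: whatever the single-image parameter can accept, the list can accept too.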

In the future, this could allow further code factorization, for instance in the widget generation code.

In SWIG module

To enable in-memory connections through the Python API of the applications, I added the new methods to the Application.i file.

Tests

otbApplicationMemoryConnectTest.cxx has been added to the ApplicationEngine module tests. It tests the connections, including the one with InputImageListParameter.

PythonConnectApplications.py has been added to the SWIG module tests. It is the same test as the previous one, but written in Python.

Both tests pass on today's dashboard.

Documentation

New methods have been documented.

Cookbook documentation might need some additions.

Additional notes

  • Streaming between applications only works if all the applications chained together implement plain ITK pipelines. Otherwise, chaining still works, but without streaming.
  • The Python API for in-memory connections is not very pythonic compared to what Rashad did in RFC-12.
  • To benefit from the MPI capabilities introduced in RFC-26 while connecting applications in memory, users have to call the otb::MPIConfig::Init() routine themselves, because this call is currently made in ApplicationLauncherCommandLine. For the same reason, using MPI with the Python API is currently not possible. It might be worth adding an MPIInit method to the Application class.

Comments from review that could be used for a future RFC:

  • In the otb::Wrapper::InputImageListParameter class: I like the use of std::vector<InputImageParameter::Pointer>, but ideally this list could be stored in the children list of the parameter (see otb::Wrapper::Parameter::m_ChildrenList), which is a std::vector<Parameter::Pointer>. One possible benefit would be to access list members with the keys "il.1", "il.2" and so on. Maybe a future RFC could use this parameter list to implement "InputImageList", "StringList", "InputVectorDataList" ...
  • I also think that the Python wrapping could be enhanced. When we do "app.IN = value", we need a different behaviour depending on whether "value" is a string, a numpy array, or an ImageBase pointer. Like the previous comment, this deserves another RFC.