OTB Users meeting 2017 brainstorming report

From OTBWiki
Revision as of 10:01, 26 June 2017 by Julien (Talk | contribs) (Refactoring of the ExtractROI application)


Refactoring of the ExtractROI application

Regarding the ExtractROI application, it has been decided to extend the ways of setting the region to extract. In addition to setting a start index and size, or a reference image:

  • Set coordinates in lat/lon
  • Set coordinates in physical coordinates
  • Use a reference vector file
  • Give a point of interest and a radius
  • Set an extent as xmin, ymin, xmax, ymax
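Most of the new modes boil down to mapping geographic coordinates to the start index and size that ExtractROI already understands. A minimal sketch of that mapping, assuming a north-up affine geotransform as GDAL reports it (the function name is hypothetical, not an OTB API):

```python
# Sketch: convert a lon/lat bounding box to the (start index, size) that
# ExtractROI already accepts. Assumes a north-up affine geotransform
# (origin_x, pixel_w, 0, origin_y, 0, -pixel_h), as reported by GDAL.

def extent_to_roi(geotransform, lon_min, lat_min, lon_max, lat_max):
    ox, pw, _, oy, _, ph = geotransform  # ph is negative for north-up images
    startx = int(round((lon_min - ox) / pw))
    starty = int(round((lat_max - oy) / ph))
    sizex = int(round((lon_max - lon_min) / pw))
    sizey = int(round((lat_min - lat_max) / ph))
    return startx, starty, sizex, sizey

# 0.25 degree pixels, origin at (1.0 E, 44.0 N)
gt = (1.0, 0.25, 0.0, 44.0, 0.0, -0.25)
print(extent_to_roi(gt, 2.0, 42.0, 3.0, 43.0))  # (4, 4, 4, 4)
```

The lat/lon mode would additionally need a reprojection step when the image is not in a geographic coordinate system.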

Moreover, it has been stressed that the PixelValue application should also be extended to support lat/lon coordinates, as well as neighborhood capabilities. Last, the Convert application should be revamped to make its purpose (conversion to 8-bit images) clearer.
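For reference, the core of what Convert does is a linear rescale of pixel values to the 8-bit range. A minimal sketch of that mapping (min/max stretch only; the real application also supports histogram-based clipping):

```python
# Sketch: linear rescaling of arbitrary pixel values to uint8 [0, 255],
# as performed by a Convert-style application (min/max stretch variant).
def to_uint8(values, vmin=None, vmax=None):
    vmin = min(values) if vmin is None else vmin
    vmax = max(values) if vmax is None else vmax
    scale = 255.0 / (vmax - vmin) if vmax > vmin else 0.0
    return [max(0, min(255, int(round((v - vmin) * scale)))) for v in values]

print(to_uint8([0.0, 0.5, 1.0]))  # [0, 128, 255]
```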

How to fix the DownloadSRTM application

Several issues should be addressed regarding this app:

  • The application is currently not working anymore in download mode because the source website moved to the https protocol, for which curl requires a certificate. Possible solutions are to disable certificate checking (dangerous) or to embed the certificate in the library.
  • The application uses a very simple logic to determine which tiles are needed. As such, it is not able to distinguish between tiles that are missing and tiles that do not exist in SRTM. The use of a vector file indexing all tiles would solve this issue.
  • Last, the app should allow the user to get information on SRTM coverage for a given image:
    • Which tiles exist in SRTM but are not available in the local copy (for further download)
    • What fraction of the image is actually covered by SRTM tiles
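The tile logic itself is simple, since SRTM tiles are 1x1 degree and named after their south-west corner (e.g. N43E001). A minimal sketch of enumerating the tiles an extent needs; the vector index of existing tiles is modeled here as a plain set, which is an assumption:

```python
import math

# Sketch: which 1x1 degree SRTM tiles intersect a lon/lat extent, using the
# SW-corner naming convention (e.g. N43E001). Intersecting this list with a
# vector index of existing tiles would separate "missing locally" from
# "does not exist in SRTM" (e.g. ocean tiles).

def tile_name(lat, lon):
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return "%s%02d%s%03d" % (ns, abs(lat), ew, abs(lon))

def tiles_for_extent(lon_min, lat_min, lon_max, lat_max):
    return sorted(
        tile_name(lat, lon)
        for lat in range(math.floor(lat_min), math.ceil(lat_max))
        for lon in range(math.floor(lon_min), math.ceil(lon_max))
    )

print(tiles_for_extent(0.5, 43.2, 1.7, 44.1))
# ['N43E000', 'N43E001', 'N44E000', 'N44E001']
```

The covered fraction of the image would then follow from intersecting the image footprint with the union of the tiles actually present.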

In the sampling framework, how to compute features on the sampled points only?

This request came up during the iota2 presentation on the first day and was further discussed during the brainstorming session. Here are the main conclusions:

  • Recoding features as pixel functions and using those in the sampling application should be avoided, because features would then always be limited to what the application offers (which will never exactly match what users want to do).
  • A better option would be to rely on the pipeline streaming capability and on in-memory connections between applications.
  • A low-hanging fruit would be to avoid computing a whole tile and instead compute the minimal region that encloses all the points (or other geometric features) on the tile. Depending on how samples are distributed, this may yield a good speed-up.
  • Another one would be to skip tiles that are empty.
  • We could apply this strategy to only compute one-pixel regions around samples, but this is probably not that efficient.
  • A more reasonable strategy might be to simply reduce the tile size, resulting in more tiles being empty and thus skipped.
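The minimal-region and empty-tile ideas above can be sketched together; this is illustrative pseudologic, not OTB code (samples are pixel coordinates, regions are expressed as start index plus size):

```python
# Sketch: for one tile, keep only the samples it contains and return the
# smallest pixel region enclosing them; an empty tile is skipped entirely.
def minimal_region(tile_origin, tile_size, samples):
    ox, oy = tile_origin
    w, h = tile_size
    inside = [(x, y) for (x, y) in samples
              if ox <= x < ox + w and oy <= y < oy + h]
    if not inside:
        return None  # empty tile: nothing to compute
    xs = [x for x, _ in inside]
    ys = [y for _, y in inside]
    # region as (start_x, start_y, size_x, size_y)
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)

samples = [(130, 40), (135, 60), (300, 300)]
print(minimal_region((128, 0), (128, 128), samples))  # (130, 40, 6, 21)
print(minimal_region((0, 0), (128, 128), samples))    # None
```

Shrinking the tile size, as suggested above, makes the `None` case (a skipped tile) more frequent without any extra logic.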

How could we chain applications in memory from the command line?

During the brainstorming, an idea came up to allow in-memory connections between applications from the command line: make the otbCommandLineApplicationLauncher (aka otbcli) able to run several applications, with a specific syntax for in-memory connection. Something like:

otbcli BandMath -in test.tif -exp "sin(im1b1)" Convert -in BandMath:out -out output.tif uint8


or, with a relative reference to the previous application:

otbcli BandMath -in test.tif -exp "sin(im1b1)" Convert -in -1:out -out output.tif uint8
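A minimal sketch of how the launcher could split such a composite command line: any token that matches a registered application name starts a new invocation (the registry is modeled here as a plain set, which is an assumption):

```python
# Sketch: split a composite otbcli command line into per-application
# invocations. A token matching a known application name opens a new one.
KNOWN_APPS = {"BandMath", "Convert"}  # would come from the application registry

def split_pipeline(tokens):
    pipeline = []
    for tok in tokens:
        if tok in KNOWN_APPS:
            pipeline.append((tok, []))
        else:
            pipeline[-1][1].append(tok)  # assumes the line starts with an app name
    return pipeline

cmd = ["BandMath", "-in", "test.tif", "-exp", "sin(im1b1)",
       "Convert", "-in", "BandMath:out", "-out", "output.tif", "uint8"]
print(split_pipeline(cmd))
```

A token like "BandMath:out" (or "-1:out") in the argument list would then be resolved to the in-memory output of the corresponding earlier application instead of a file on disk.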

Object-based image analysis in Orfeo ToolBox

This discussion was quite long and is not easy to sum up, but here are a few pointers:

  • We need a structure that can represent a segmentation properly. Labeled rasters and vector files are OK for dedicated applications or for compatibility with other tools (such as QGIS), but we keep converting from one to the other, and neither is really suited to reasoning on segments. This structure should allow one to:
    • Iterate / stream segments efficiently,
    • Iterate pixels in segments efficiently,
    • Walk through segments neighborhood graph,
    • Read/write to disk efficiently,
    • If possible, support multiscale segmentation.

Reading back, I think a multilayer vector file can meet almost all requirements apart from the pixel iteration, which should be easy enough to fix. The structure could be several layers of polygons (one layer per scale), and polygons could have specific fields encoding siblings (neighborhood), parents and children (scale up and down). It would be up to OTB to provide iteration through neighborhood and scale provided that the fields are available, but this could be done using the existing code, and the OGR adapters could be fully reused.
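A minimal sketch of that structure, with the segment table modeled as a plain dict; the field names (siblings, parent, children) are the hypothetical ones suggested above, not an existing OTB schema:

```python
# Sketch: a segment table carrying the proposed fields. "siblings" encodes
# the same-scale neighborhood graph, "parent"/"children" encode scale links.
segments = {
    1: {"siblings": [2], "parent": None, "children": [3, 4]},
    2: {"siblings": [1], "parent": None, "children": []},
    3: {"siblings": [4], "parent": 1, "children": []},
    4: {"siblings": [3], "parent": 1, "children": []},
}

def neighborhood(seg_id):
    """Neighbors of a segment at the same scale."""
    return segments[seg_id]["siblings"]

def scale_down(seg_id):
    """Finer-scale segments contained in this one."""
    return segments[seg_id]["children"]

print(neighborhood(1), scale_down(1))  # [2] [3, 4]
```

In the multilayer vector file, each dict entry would be a polygon feature and each field an OGR attribute, so the existing OGR adapters could provide the iteration.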

  • Our algorithms can use a generic metric but often only implement the Euclidean distance, which is not suited for SAR imagery. It would be good to implement other metrics.
  • In the end we may still want to classify pixels, but using attributes computed across segments, neighborhoods and scales.
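Making the metric generic mostly means accepting it as a parameter instead of hardcoding it. A minimal sketch, with the Euclidean distance and, as an assumed SAR-oriented example (not an OTB API), a log-ratio distance on amplitudes:

```python
import math

# Sketch: pass the metric as a callable instead of hardcoding Euclidean.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def log_ratio(a, b):
    # A ratio-based distance, sometimes preferred for SAR amplitudes
    # (illustrative choice, not an existing OTB metric).
    return sum(abs(math.log(x / y)) for x, y in zip(a, b))

def nearest(sample, centroids, metric=euclidean):
    return min(centroids, key=lambda c: metric(sample, centroids[c]))

cents = {"water": (1.0, 1.0), "urban": (8.0, 9.0)}
print(nearest((7.0, 8.5), cents))                    # urban
print(nearest((1.1, 0.9), cents, metric=log_ratio))  # water
```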

Deep Learning in Orfeo ToolBox

We could easily include deep learning as a new machine learning model in OTB. This would require picking one of the well-known deep learning libraries, such as Torch, TensorFlow, Caffe or Singa. Caffe seems to be an interesting pick, since it is fully C++, and the network architecture can be set through a protobuf file [1] which could easily be passed as a parameter of the machine learning model. We would then be able to plug our sampling framework into deep learning. The machine learning model could perform a given number of iterations and write back a new protobuf file, which could be used again as input. This should be easy enough, but there are at least three issues that require further investigation:

  • Deep learning is about learning contextual features. If we give deep learning isolated samples (i.e. pixels), it will probably not perform better than other machine learning algorithms. We could try to extend the sampling application to allow extracting a neighborhood instead of isolated pixels.
  • Deep learning needs a lot of examples. The sampling framework currently loads all samples into memory, which limits the number of samples the machine learning model can process. If Caffe can work iteratively, we can iterate on sample batches, but this will probably not give the same result as learning from all the samples at once.
  • The hard part is figuring out the network architecture. There are examples and pre-trained models, but starting a new one from scratch looks quite complex.
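The batching point above is mostly plumbing: instead of loading the full sample set, the model would consume fixed-size batches. A minimal sketch (reading from disk is replaced here by a plain list):

```python
# Sketch: iterate over fixed-size sample batches instead of loading all
# samples into memory at once, as an iterative learner would consume them.
def batches(samples, batch_size):
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

all_samples = list(range(10))
for batch in batches(all_samples, 4):
    print(batch)  # [0, 1, 2, 3] then [4, 5, 6, 7] then [8, 9]
```

In a real setting the batches would be streamed from the sampling framework's output files rather than sliced from a list, which is exactly where the memory limit would disappear.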

These issues require further testing before starting to work on the machine learning model itself.

For faster learning, it would be interesting to pick a library that natively supports GPU or HPC infrastructures.

What is missing for SAR imagery

During the discussion, two main features were identified (apart from the segmentation metrics, see the OBIA section):

  • Layover mask generation,
  • Coherence images and interferometry.

The former is currently being investigated by Maxime, a user who attended the user days.

We have some code from a former ESA/SOCIS project regarding the latter. This code has been updated for OTB 6.0 during the hackfest: https://github.com/jmichel-otb/otb-insar

The plan is to make it work with Sentinel-1 data and to turn it into a remote module.
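For reference, the core quantity behind coherence images is the standard interferometric coherence, gamma = |sum(s1 * conj(s2))| / sqrt(sum(|s1|^2) * sum(|s2|^2)), estimated over a window of complex samples. A minimal sketch:

```python
import math

# Sketch: interferometric coherence over a window of complex SAR samples.
# gamma = |sum(s1 * conj(s2))| / sqrt(sum(|s1|^2) * sum(|s2|^2))
def coherence(s1, s2):
    num = abs(sum(a * b.conjugate() for a, b in zip(s1, s2)))
    den = math.sqrt(sum(abs(a) ** 2 for a in s1) * sum(abs(b) ** 2 for b in s2))
    return num / den

master = [1 + 1j, 2 - 1j, 0.5 + 0.5j]
print(round(coherence(master, master), 6))  # 1.0 for identical signals
```

The otb-insar code linked above operates on full images with a sliding estimation window; this scalar version only shows the per-window computation.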

Enhance use of time series as images stacks

We often represent time series as stacks (i.e. VectorImages) that mix dates and spectral bands. For now, the user has to keep track of which band in the stack corresponds to which date and spectral band, and we have no time-series operations (such as extracting by date range, for instance). It would be great to be able to access the information related to each band within OTB, in order to implement date-based algorithms.

In OTB we could search for specific metadata and expose them through proper methods. For instance:

$ gdal_edit.py -mo BAND_1_DATE=10.02.2012 test.tif

$ gdalinfo test.tif
Driver: GTiff/GeoTIFF
Files: test.tif
Size is 2000, 2000
Coordinate System is `'
Origin = (5400.000000000000000,4300.000000000000000)
Pixel Size = (1.000000000000000,1.000000000000000)
Metadata:
  BAND_1_DATE=10.02.2012

Normalizing dates should be easy enough, but normalizing band names requires more thinking ...
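Once such metadata are exposed, selecting bands by date range becomes trivial. A minimal sketch over a plain metadata dict, assuming the DD.MM.YYYY format used in the gdal_edit.py example above (precisely the kind of format that date normalization would have to settle):

```python
from datetime import date

# Sketch: read BAND_i_DATE metadata (as written by gdal_edit.py above) and
# select the band indices whose dates fall inside a range.
metadata = {
    "BAND_1_DATE": "10.02.2012",
    "BAND_2_DATE": "26.02.2012",
    "BAND_3_DATE": "13.03.2012",
}

def parse(d):
    # Assumes day.month.year, as in the example above
    day, month, year = (int(p) for p in d.split("."))
    return date(year, month, day)

def bands_in_range(md, first, last):
    return [int(k.split("_")[1]) for k, v in sorted(md.items())
            if first <= parse(v) <= last]

print(bands_in_range(metadata, date(2012, 2, 1), date(2012, 2, 29)))  # [1, 2]
```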