For more than 3 years, the classical scroll, full resolution and zoom visualisation tools have provided OTB users with a convenient way to display and navigate through their images. This visualisation paradigm is well adapted to a wide range of image processing tasks. Yet the recent convergence between remote sensing image processing and geographical information systems within OTB yields new needs in terms of image navigation and visualisation.
Web-based geographical information browsing has provided us with a new way of navigating: the scroll or minimap navigation, within which it is often hard to spot a region of interest precisely, is replaced by the ability to seamlessly change the scale (or resolution) at which data are displayed. Dragging and dropping also provides a more intuitive way of navigating to nearby regions.
In addition, many data formats now support this multi-scale/multi-resolution representation: JPEG 2000, MrSID and ECW allow decoding at arbitrary resolution levels, and GIS services such as openstreetmap.org, WMS-T, Google Maps, Bing Maps etc. provide access to a resolution-wise quadtree of patches. Formats such as TIFF also allow the storage of an image pyramid presenting several resolutions. It therefore seems worthwhile to give OTB users the ability to exploit these multi-resolution schemes.
The aim of this page is to provide a white paper for the implementation of such an image navigation tool. This tool will not replace the current image viewer tool, but will rather complement it. Please feel free to enrich this page with your ideas and feature requests.
Many imagery products also come with a preview image file. It would be very nice to use this preview when available instead of down-sampling the images ourselves each time.
To provide quick access, data are generally decomposed into a quadtree of tiles with a fixed size (often a power of 2). The image is split into 4 tiles at the lowest resolution, each of which is in turn split into four tiles with twice the resolution, and so on until the finest resolution (that is, the full resolution) is reached.
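As a rough illustration of this decomposition, the sketch below lists the levels of a pyramid for a given image size and enumerates the four children of a tile. The function names, the 256-pixel default tile size and the convention that level 0 is the coarsest level (the first one that fits in a single tile) are assumptions made for this example, not an existing OTB API.

```python
import math

def quadtree_levels(width, height, tile_size=256):
    """List (level_width, level_height, tiles_x, tiles_y), coarsest level first.

    Assumption: each coarser level halves the image size (rounding up),
    and we stop once the whole image fits in a single tile.
    """
    levels = []
    w, h = width, height
    while True:
        tiles_x = math.ceil(w / tile_size)
        tiles_y = math.ceil(h / tile_size)
        levels.append((w, h, tiles_x, tiles_y))
        if tiles_x == 1 and tiles_y == 1:
            break
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    levels.reverse()  # index 0 is now the coarsest level
    return levels

def children(level, x, y):
    """Return the four tiles of the next finer level covering tile (level, x, y)."""
    return [(level + 1, 2 * x + dx, 2 * y + dy)
            for dy in (0, 1) for dx in (0, 1)]

if __name__ == "__main__":
    for index, (w, h, tx, ty) in enumerate(quadtree_levels(1024, 1024)):
        print(f"level {index}: {w}x{h} pixels, {tx}x{ty} tiles")
```

Images whose size is not a multiple of the tile size simply get partially filled tiles on the right and bottom edges in this sketch.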
First, I think we should dissociate the data structure itself from the image navigation part. Some new kind of DataObject (ImagePyramid? TileMap?) should be created. Maybe the requested region concept could be enriched with some sort of scale notion. The viewer should also be able to display arbitrary intermediate scales, by applying a chosen resampling algorithm to the closest available power-of-2 scale (ossim's imagelinker uses this approach).
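The intermediate-scale idea can be sketched as follows: pick the pyramid level whose power-of-2 decimation is closest to the requested scale, then compute the residual factor that the resampler still has to apply. Here `scale` is assumed to mean displayed pixels per full-resolution pixel (1.0 is full resolution, 0.5 is half), level `k` is assumed to be decimated by `2**k`, and `pick_level` is a hypothetical helper name.

```python
import math

def pick_level(scale, num_levels):
    """Return (level, residual) for a requested display scale.

    level    : index of the closest power-of-2 level, 0 being full resolution
    residual : factor still to be applied by the resampler on that level
               (> 1 means upsampling, < 1 means further downsampling)
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    if num_levels < 1:
        raise ValueError("at least one level is required")
    # Closest level in the log2 sense, clamped to the levels that exist.
    level = round(-math.log2(scale))
    level = min(max(level, 0), num_levels - 1)
    residual = scale * (2 ** level)
    return level, residual

if __name__ == "__main__":
    for requested in (1.0, 0.3, 0.25, 0.01):
        level, residual = pick_level(requested, num_levels=5)
        print(f"scale {requested}: read level {level}, resample by {residual:.3f}")
```

Choosing the closest level keeps the residual factor near 1, so the resampler does little work; a viewer that wants to avoid any upsampling could instead round towards the finer level.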
Types of access this structure should support:
- Tile map from openstreetmap.org (vector)
- Tile maps from Nearmap and NASA WorldWind Server (raster)
- J2K, PCIDSK, Erdas Imagine, MrSID and ECW images (GDAL and Ossim already have a concept of overview for these formats)
- Quadtree-tiled images stored in the filesystem (refer to the Worldwind Dstile utility for a filesystem-based quadtree)
- TIFF files containing a pyramid
- On-the-fly computation from a standard image (with or without tile caching). As an example, the GeoWebCache tile caching tool which ships with GeoServer allows full tile seeding or live tile seeding: either all tiles are precomputed at image load, or tiles are computed on demand and saved/recycled as different parts of the image are viewed. A metatiling approach can also be used to precompute the areas around the one the user is viewing, adapting the viewer's behaviour to the user's behaviour.
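The live seeding option in the last item can be sketched with a small least-recently-used tile cache: tiles are computed the first time they are requested and recycled afterwards, and the tiles around the current one can be precomputed. The `TileCache` class, its `compute_tile` callback and the `prefetch_around` method are hypothetical names for this illustration; this is not how GeoWebCache itself is implemented.

```python
from collections import OrderedDict

class TileCache:
    """Least-recently-used cache of tiles computed on demand (live seeding)."""

    def __init__(self, compute_tile, capacity=64):
        self._compute = compute_tile   # callback: (level, x, y) -> tile data
        self._capacity = capacity
        self._tiles = OrderedDict()    # key order tracks recency of use
        self.misses = 0                # number of tiles actually computed

    def get(self, level, x, y):
        key = (level, x, y)
        if key in self._tiles:
            self._tiles.move_to_end(key)     # mark as most recently used
            return self._tiles[key]
        self.misses += 1
        tile = self._compute(level, x, y)
        self._tiles[key] = tile
        if len(self._tiles) > self._capacity:
            self._tiles.popitem(last=False)  # evict the least recently used
        return tile

    def prefetch_around(self, level, x, y, radius=1):
        """Precompute the neighbours of a tile the user is looking at.

        Only negative indices are skipped; the upper bound depends on the
        level size, which this sketch does not know.
        """
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                nx, ny = x + dx, y + dy
                if nx >= 0 and ny >= 0:
                    self.get(level, nx, ny)

if __name__ == "__main__":
    cache = TileCache(lambda level, x, y: f"tile {level}/{x}/{y}", capacity=16)
    cache.get(2, 1, 1)
    cache.prefetch_around(2, 1, 1)
    print("tiles computed so far:", cache.misses)
```

Full seeding corresponds to calling `get` for every tile of every level up front, with a capacity large enough to hold them all.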
It appears that there are two separate issues:
- access and handling of multiresolution data (including processing);
- better navigation in such data sets.
Maybe a simple solution is to add an m_RequestedScale along with the m_RequestedRegion in otb::Image / otb::VectorImage. Filters that do not understand this parameter would pass it on unchanged, and filters able to understand it (such as ImageFileReader for instance) would take it into account. This requires a good definition of the scale: the 20 levels that cover most possible scales in the case of OSM won't be available in a TIFF file, they won't have the same meaning, and the level parameter may vary in the opposite direction.
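One possible common definition of scale is the ground resolution in metres per pixel, which each source then maps to its own level index. The sketch below assumes 256-pixel Web Mercator tiles for OSM with zoom levels 0 to 19, and overviews that each halve the resolution with index 0 being full resolution; the function names are hypothetical. It shows the two indices moving in opposite directions for the same request.

```python
import math

# Metres per pixel at the equator for zoom 0, 256-pixel Web Mercator tiles.
EQUATOR_M_PER_PX_Z0 = 2 * math.pi * 6378137.0 / 256

def osm_zoom_resolution(zoom, lat_deg=0.0):
    """Ground resolution (m/pixel) of an OSM zoom level at a given latitude."""
    return EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)

def overview_resolution(full_res, overview_index):
    """Ground resolution of overview k, assuming each overview halves the size."""
    return full_res * (2 ** overview_index)

def osm_zoom_for(resolution, lat_deg=0.0, max_zoom=19):
    """Closest OSM zoom level for a requested resolution (higher index = finer)."""
    z = math.log2(osm_zoom_resolution(0, lat_deg) / resolution)
    return min(max(round(z), 0), max_zoom)

def overview_for(resolution, full_res, num_overviews):
    """Closest overview index for a requested resolution (higher index = coarser)."""
    k = math.log2(resolution / full_res)
    return min(max(round(k), 0), num_overviews)

if __name__ == "__main__":
    for requested in (10.0, 20.0, 40.0):
        print(requested, "m/pixel ->",
              "OSM zoom", osm_zoom_for(requested),
              "| overview", overview_for(requested, full_res=2.5, num_overviews=4))
```

With such a definition, a reader filter would only need to translate the requested resolution into whatever level numbering its format uses.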
It would be wise to let the DataAccess layer publish the multi-resolution details, and to let the user choose or apply heuristics to build a multi-resolution representation where none exists natively.
Still, this does not solve the filtering problem: for many filters, filtering then subsampling does not give the same result as subsampling then filtering. Users need to be made aware that the subsampled-then-filtered version is used for display purposes only and is not the final product. They should zoom to 1:1 to view and appreciate the true filtered product. Filtering then subsampling requires more processing, since the filter is applied to a larger number of pixels.
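A tiny one-dimensional example makes the difference concrete. It uses a 3-sample mean filter and a decimation by 2, both chosen only for illustration; the helper names and the test signal are made up for this sketch.

```python
def box_filter(signal, radius=1):
    """Mean filter with a window of 2 * radius + 1 samples, truncated at the borders."""
    out = []
    n = len(signal)
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out

def subsample(signal, factor=2):
    """Keep one sample out of `factor`, with no pre-filtering."""
    return signal[::factor]

if __name__ == "__main__":
    signal = [0, 0, 6, 0, 0, 0, 6, 6]
    filter_then_subsample = subsample(box_filter(signal))
    subsample_then_filter = box_filter(subsample(signal))
    print("filter -> subsample:", filter_then_subsample)
    print("subsample -> filter:", subsample_then_filter)
    print("identical:", filter_then_subsample == subsample_then_filter)
```

The first order runs the filter on all 8 samples, the second on only 4, which is the processing cost trade-off mentioned above.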