Open Source Geospatial Tools – Applications in Earth Observation

A review of a book introducing command line processing for Earth observation data.

“Open Source Geospatial Tools – Applications in Earth Observation” was written by Daniel McInerney and Pieter Kempeneers, and published in 2015 by Springer International Publishing Switzerland.

Have you thought of using command line software for geospatial data analysis, only to find the learning curve too steep? Daniel McInerney and Pieter Kempeneers wrote the book “Open Source Geospatial Tools – Applications in Earth Observation” as an introduction to powerful command line processing. In remote sensing, rich sets of Earth observation data are available, and they generally require processing and manipulation before they can be used. The authors’ objective is to promote command line software for processing and manipulating spatial data, an approach that still seems underused.
The book is an invitation to rethink the processing routines in place for Earth observation data. Its intended readers are remote sensing experts who are either new to command line processing or already have some experience with it. This wide readership requires the authors to introduce the basics of command line tools for vector, raster and Light Detection and Ranging (LiDAR) data processing. In addition, the book needs to cover a comprehensive set of processing functions so that it can serve as a reference guide. Because the manuals of open source tools are spread over different websites, the authors provide a consolidated reference to a solid toolbox.
The toolbox the authors present covers recurring tasks in geospatial data analysis. The authors do not claim to cover all packages and functions for data processing. Rather, the focus is on introducing core functionality and on ways to combine tools from different packages to solve frequent tasks in processing Earth observation data.
All tools presented fulfil three guiding principles: they can be accessed via the command line, they are free and open source, and they are available under Linux. These selection criteria make the authors’ objective clear: to promote a technology that is accessible to a broad user base. The authors provide a basis for joining the batch processing community and for learning how to analyze data efficiently without tools that have graphical user interfaces. Keywords such as repeatable processing chains, automation of tasks and free access point to where the authors see the strengths of command line tools.
Given the active community behind free and open source tools, the tools will inevitably evolve over time, which makes publishing an up-to-date book difficult. The authors therefore aim to support readers in learning the mechanics of command line processing, since integrating new or additional tools seems comparatively simple once a basic understanding of how things work has been reached.
The book consists of three parts: (1) spatial data processing with the Geospatial Data Abstraction Library (GDAL) and the OGR Simple Features Library (OGR), (2) third party open source tools and (3) case studies. Part one introduces the fundamentals of vector and raster data analysis together with tools for manipulating these data sets. GDAL is heavily used for data analysis tasks and is the backbone of a series of open source tools. The OGR Simple Features Library is the part of GDAL that focuses on vector data processing. Topics covered include indexing of images, tiling and pyramids, image reprojection, and conversion between raster and vector data.
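To give a flavour of the kind of repeatable processing chain discussed here, the following is a minimal sketch, not taken from the book, that assembles two GDAL command lines for one raster: reprojection with gdalwarp and pyramid (overview) generation with gdaladdo. The function names build_chain and as_shell, the file name scene_a.tif, the "_warped" suffix, the target reference system and the overview levels are all assumptions chosen for illustration. The sketch only builds the commands and does not run them, so it works without GDAL installed.

```python
import shlex


def build_chain(src, target_srs="EPSG:4326", levels=(2, 4, 8, 16)):
    """Return the GDAL commands for one raster as lists of arguments.

    Hypothetical helper: the output naming scheme and the defaults are
    illustrative assumptions, not recommendations from the book.
    """
    stem = src.rsplit(".", 1)[0]
    warped = f"{stem}_warped.tif"
    return [
        # Reproject the input raster to the target spatial reference system.
        ["gdalwarp", "-t_srs", target_srs, src, warped],
        # Add overviews (pyramids) to the reprojected raster.
        ["gdaladdo", "-r", "average", warped, *[str(lv) for lv in levels]],
    ]


def as_shell(commands):
    """Render the argument lists as shell lines, one command per line."""
    return "\n".join(shlex.join(cmd) for cmd in commands)


# Print the command lines for a hypothetical input file.
print(as_shell(build_chain("scene_a.tif")))
```

Calling build_chain in a loop over many files yields the same two steps for each of them, which is the point of batch processing. The argument lists could be passed to subprocess.run, or the printed lines saved as a Bash script, provided the GDAL utilities are installed.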
Part two complements the tools introduced for raster and vector processing with tools for image processing and for LiDAR data processing. The utilities presented are pktools and the Orfeo toolbox. Pktools overlaps to a certain degree with GDAL/OGR functions, but provides more specialized functions for remote sensing applications. The Orfeo toolbox is designed for working with large remote sensing datasets and includes features for image correction, filtering and classification, among others. In addition, part two provides an introduction to the development of an application programming interface for advanced users.
Part three presents case studies that show the practical application of the presented utilities. The case studies focus on remote sensing applications and land cover classification. An appendix with an overview of the installation of the software used throughout the book complements the presented material.
Descriptions of functions from the introduced packages give the book its substance, but the book is more than a manual for using these functions. It enriches the documentation by putting each function into an application context and by providing graphical examples and code snippets that demonstrate parameter settings or the results generated with the function. For the GDAL fill nodata function, for example, the authors even mention a current bug that may cause problems when using it. The authors’ contribution is their experience with the tools and their ability to illustrate their practical usage.
Examples of functions are presented in the Bourne-again shell (Bash), with some in Python. The code snippets are clearly separated from the text by the well-arranged formatting followed throughout the book. The authors use clear language and concise descriptions of the functions, which enhances the informative style of the book.
The strict focus on command line processing could be seen as a limitation. The book does not demonstrate how an analysis could be done using GDAL/OGR in GRASS GIS or the statistics tool R. The authors refer to material discussing the application of the tools on other platforms; however, a short introduction on how to translate between platforms would be an asset. Describing Web services as an emerging development also seems a weakness, because integrating data or tools from Web services may already be a standard use case in many applications.
In summary, the book provides the information needed to accompany the transition from novice to experienced user of command line geoprocessing. It provides a solid foundation for a toolbox supporting geospatial data analysis and points to additional tools and the community behind them. Making full use of this book will require readers to keep it at their fingertips when solving problems and to practice command line processing actively.