Incorporating Python scripting to conduct geospatial analysis of flooding in the Mississippi River Basin

Earthzine: DEVELOP Spring 2017 Article Session, DEVELOP Virtual Poster Session

This article is part of NASA DEVELOP’s Spring 2017 Article Session.

This project developed an automated script that maps flooding in the Mississippi River Basin and generates a preliminary analysis of the populations, infrastructure, and agricultural fields impacted.

Authors:

Nicholas McVey

Mercedes Bartkovich

Helen Baldwin-Zook

Dashiell Cruz

Project partners at the U.S. Federal Emergency Management Agency (FEMA) and the U.S. Geological Survey Hazards Data Distribution System (USGS HDDS) expressed a need for an algorithm that could map flooding in the Mississippi River Basin. In the early stages of the project, our team focused on developing that algorithm using Landsat 8 Operational Land Imager (OLI) images of flooded areas. Our science adviser and project partner Dr. Andrew Molthan, a research meteorologist with the NASA Short-term Prediction Research and Transition Center (SPoRT), suggested a methodology that served as a template for the research. Based on his advice, the algorithm focused on identifying flood water appearing in a Landsat 8 OLI scene. Thus, the team was tasked with distinguishing among flood water, expected water, and land.

The first challenge was distinguishing water from land in a Landsat 8 OLI scene. We concluded the best way to tackle this issue was within the ESRI ArcMap program. Several indices, or mathematical operations on different bands of satellite imagery, take advantage of the distinct reflectance of water, soil, and other surfaces in order to identify them. The team set out to uncover which signatures could be used to train the computer. We tested the Normalized Difference Vegetation Index (NDVI), which is calculated from the Red and Near-Infrared wavelengths; the Normalized Difference Water Index (NDWI), which uses the Green and Near-Infrared wavelengths; and several other combinations of the Near-Infrared, Red, and Short-Wave Infrared bands. While the NDVI is considered useful for identifying green vegetation in an area, the NDWI is designed to identify water features in remotely sensed digital imagery. All band combinations were tested to determine which was the most appropriate for mapping water in the Mississippi River Basin.
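To make the two indices concrete, here is a minimal sketch of the standard normalized-difference formulas in plain Python. It is not the team's ArcMap workflow; the pixel reflectance values are hypothetical and chosen only to show how water and vegetation fall on opposite sides of zero. On Landsat 8 OLI, Green, Red, and Near-Infrared correspond to bands 3, 4, and 5.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b), or None where the denominator is zero."""
    total = band_a + band_b
    if total == 0:
        return None
    return (band_a - band_b) / total


def ndvi(red, nir):
    # NDVI = (NIR - Red) / (NIR + Red); high over green vegetation.
    return normalized_difference(nir, red)


def ndwi(green, nir):
    # NDWI (Green/NIR form) = (Green - NIR) / (Green + NIR); high over open water.
    return normalized_difference(green, nir)


# Hypothetical surface reflectances for two pixels (illustrative values only).
water_pixel = {"green": 0.08, "red": 0.06, "nir": 0.02}
crop_pixel = {"green": 0.10, "red": 0.05, "nir": 0.45}

water_ndwi = ndwi(water_pixel["green"], water_pixel["nir"])
crop_ndwi = ndwi(crop_pixel["green"], crop_pixel["nir"])
crop_ndvi = ndvi(crop_pixel["red"], crop_pixel["nir"])
```

With these sample values the water pixel has a positive NDWI and the crop pixel a negative one, which is the contrast a classifier can exploit.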

With each of these indices and band combinations, we attempted to recognize water from imagery placed in the ESRI ArcMap program. Ultimately, we were only concerned with two possible distinctions: “water” and “not water.” To accomplish this, we first examined a Landsat 8 OLI scene of a flood event along the Mississippi River in the states of Louisiana and Mississippi in December 2015, along with a scene of the same location on a clear day when no flooding was expected. We created a signature file, a collection of reflectance values for the two classes, which would be used to classify an entire image. This signature file contained more than 150 samples of water taken from lakes, ponds, and sections of the Mississippi River throughout the Basin. It also contained 130 additional samples of the various components of what was delineated as “land,” which was anything that was not water. These samples included parking lots, marsh areas, agricultural fields, forested areas, and urban developments, such as buildings and houses. This signature file then served as a baseline. The maximum likelihood classification examined each pixel against this signature file and decided whether that pixel was water or land, resulting in a water extent map.
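The idea behind a maximum likelihood classification can be sketched in a few lines. The example below is a simplified stand-in, not the ArcMap tool the team used: it assumes a single input value per pixel (for instance, an NDWI value), models each class as a normal distribution built from its training samples, and assumes equal prior probabilities. The sample values and the function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def train_signatures(samples):
    """Build a per-class (mean, variance) signature from labelled sample values."""
    return {label: (mean(vals), pvariance(vals)) for label, vals in samples.items()}


def log_likelihood(value, mu, var):
    # Log of the normal density; equal class priors are assumed.
    return -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)


def classify(value, signatures):
    """Assign the class whose Gaussian signature makes the value most likely."""
    return max(signatures, key=lambda label: log_likelihood(value, *signatures[label]))


# Hypothetical NDWI training samples; the article's signature file held far more.
samples = {
    "water": [0.55, 0.60, 0.48, 0.62, 0.51],
    "land": [-0.40, -0.55, -0.10, -0.35, -0.62],
}
signatures = train_signatures(samples)

# Classify three hypothetical pixels against the signatures.
labels = [classify(v, signatures) for v in (0.58, -0.30, 0.45)]
```

Applying `classify` to every pixel of an index image would yield a binary water/land map analogous to the water extent map described above.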

To determine the accuracy of each index, we conducted an error matrix analysis on the four classified maps created by the band combinations. First, we generated 150 random points across our study area and identified the ground truth at each one. Then we overlaid each of the classified band images on this map of ground truths to examine which combinations accurately identified water. In a binary assessment, the classification either identified water correctly or it did not identify water where it should have. This error matrix analysis was replicated three times, for a total of 600 randomly generated points across the study area. Ultimately, we concluded that the NDWI calculation, with an overall accuracy of 95 percent, was the most accurate classification for detecting water in the satellite imagery.
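An error matrix and its overall accuracy are straightforward to compute once ground truth and classified labels are paired at each check point. The sketch below uses ten hypothetical points rather than the 150 per run described above, and the function names are illustrative.

```python
from collections import Counter


def error_matrix(truth, predicted):
    """Count (ground truth, classified) label pairs for a set of check points."""
    return Counter(zip(truth, predicted))


def overall_accuracy(matrix):
    """Share of check points whose classified label matches the ground truth."""
    correct = sum(n for (t, p), n in matrix.items() if t == p)
    return correct / sum(matrix.values())


# Hypothetical labels at ten check points (illustrative only).
truth = ["water", "water", "land", "land", "water",
         "land", "land", "water", "land", "land"]
predicted = ["water", "land", "land", "land", "water",
             "land", "land", "water", "land", "water"]

matrix = error_matrix(truth, predicted)
accuracy = overall_accuracy(matrix)
```

Here eight of the ten points agree, so `accuracy` is 0.8; the off-diagonal entries of `matrix` show one missed water point and one false water point.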

The next challenge was to move from the map of total water to a map of only flood-induced water, meaning we needed to mask the water that was expected to be there, such as a river or lake at normal height. To accomplish this, we used the NASA Global Land Cover Facility and University of Maryland Global Water Mask product, derived from the MODIS instrument aboard NASA’s Terra satellite. This product is a 250 m resolution raster dataset that maps prevailing water features across the United States by taking daily images of the surface. We subtracted the MODIS water mask from our maximum likelihood classification, which created a map depicting water only where it was not expected to be. We treated these areas as flood-induced water and depicted them in what we called a Flood Extent Map.
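The masking step amounts to a cell-by-cell comparison of two binary grids. The sketch below assumes both grids have already been resampled to the same cell size and extent (the Landsat 8 OLI imagery is 30 m and the water mask is 250 m), and uses small nested lists in place of real rasters. The grids and the function name are hypothetical.

```python
def flood_extent(classified_water, permanent_water):
    """Keep cells classified as water that the reference mask does not mark as water."""
    return [
        [1 if cls == 1 and ref == 0 else 0 for cls, ref in zip(cls_row, ref_row)]
        for cls_row, ref_row in zip(classified_water, permanent_water)
    ]


# Hypothetical 3 x 4 grids: 1 = water, 0 = not water.
classified = [
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]
permanent = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

flood = flood_extent(classified, permanent)
```

The river column present in both grids drops out, leaving only the cells where water appears outside its expected extent.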

In addition to mapping flood water, we wanted to assess the impact of the flood event. To do this, we intersected 2011 National Land Cover Database (NLCD) data, Homeland Infrastructure Foundation-Level Data (HIFLD), and 2014 LandScan data from Oak Ridge National Laboratory (ORNL) with areas identified as flood water. The NLCD dataset provided information on agriculture that was likely damaged by flooding. The HIFLD dataset displayed locations of key infrastructure, such as schools and hospitals. The LandScan dataset from ORNL provided population data, which allowed for a quick analysis of the number of people impacted by a flood event. Together, these three datasets produced a basic Impact Map that quickly measured the influence of a flood event on the surrounding area.
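As a rough illustration of the intersection step, the sketch below totals population and cropland area over flooded cells. It assumes the flood, population, and cropland grids are already aligned on a common cell size, which real NLCD and LandScan data would require resampling to achieve, and it leaves out the infrastructure point overlay. All grid values and names are hypothetical.

```python
SQ_M_PER_ACRE = 4046.8564224  # square meters in one international acre


def summarize_impact(flood, population, is_cropland, cell_size_m):
    """Total the population and cropland area falling in flooded cells."""
    people = 0.0
    crop_cells = 0
    for f_row, p_row, c_row in zip(flood, population, is_cropland):
        for flooded, pop, crop in zip(f_row, p_row, c_row):
            if flooded:
                people += pop
                if crop:
                    crop_cells += 1
    crop_acres = crop_cells * cell_size_m ** 2 / SQ_M_PER_ACRE
    return people, crop_acres


# Hypothetical 2 x 2 grids on an assumed 30 m cell size.
flood = [[0, 1], [1, 1]]
population = [[5, 2], [0, 3]]      # people per cell
is_cropland = [[1, 1], [1, 0]]     # 1 = agricultural land cover

people, crop_acres = summarize_impact(flood, population, is_cropland, cell_size_m=30)
```

Only the three flooded cells contribute, giving five people and two cropland cells (about 0.44 acres at this cell size).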

Figure 1. Study Area: Mississippi River Basin. We created an algorithm to identify flooding in the Mississippi River Basin. We tested the script on 11 flood events that occurred between August 2013 and March 2017. Image Credit: NASA DEVELOP; Service Layer Credit: Esri, USGS, NOAA

Having surmounted the challenges of creating an algorithm for mapping flood water and analyzing its impact, we were tasked with automating it. We used Python 2 to develop a script that would conduct the detection and analysis process automatically. The script imports functions from ArcPy, a Python package designed to facilitate geographic data analysis, data conversion, data management, and map automation. After writing a script that followed each step of our flood mapping algorithm, we tested it on the September 2016 flood event that occurred in Cedar Rapids, Iowa, followed by an additional 10 flood events that occurred throughout the Mississippi River Basin (Figure 1).

Figure 2. Flood Extent Map. Zoomed-in northwest corner of the Extent Map produced by our flood mapping script. Image Credit: NASA DEVELOP; Service Layer Credit: Esri, USGS, NOAA

After downloading two Landsat 8 OLI tiles that show Iowa shortly after the peak flood stage, we ran our script on the imagery, generating our Flood Extent Map (Figure 2) and our Flood Impact Map (Figure 3). According to our analysis, more than 43 square miles within the two tiles flooded, including more than 4,700 acres of agriculture, and close to 1,000 people were impacted. Within just the zoomed-in area of the imagery, about 200 people were impacted by the flood waters and more than 2,200 acres of agricultural fields were detected as flooded.

Figure 3. Flood Impact Map. Zoomed-in northwest corner of the Impact Map produced by our flood mapping script. Image Credit: NASA DEVELOP; Service Layer Credit: Esri, USGS, NOAA

Following the conclusion of the Iowa flood analysis, we ran the automated flood mapping algorithm on 10 flood events that occurred from August 2013 to March 2017 throughout the Mississippi River Basin, all falling within five of the six physiographic regions within the basin. In total, according to our analysis, these floods impacted about 20,000 people, more than 74,000 acres of agriculture, and five major infrastructure facilities. The script, Flood Extent Maps, and Flood Impact Maps will help our project partners and other disaster response organizations identify areas of probable flooding within a given area, aiding their decision-making on flood monitoring, relief efforts, and damage assessments.

Author Biographies

Nicholas McVey is a recent graduate from Embry-Riddle Aeronautical University and current student at the University of Alabama in Huntsville working with DEVELOP at Marshall Space Flight Center as an independent research consultant on the Mississippi River Basin Disasters project. You can contact McVey at nam0014@uah.edu.

Mercedes Bartkovich graduated from the University of Georgia with a bachelor’s in forest resources and from Alabama A&M University with a master’s in biological and environmental science. She is working with DEVELOP at Marshall Space Flight Center as an independent research consultant on the Mississippi River Basin Disasters project. You can contact Bartkovich at bartkovichm@gmail.com.

Helen Baldwin-Zook is a recent graduate from Middlebury College and current student at the University of Alabama in Huntsville working with DEVELOP at Marshall Space Flight Center as an independent research consultant on the Mississippi River Basin Disasters project. You can contact Baldwin-Zook at helenbluebaldwin@gmail.com.

Dashiell Cruz is a recent graduate from the University of Alabama in Huntsville, where he majored in political science. He is working with DEVELOP at Marshall Space Flight Center as an Impact Analysis Fellow, an assistant center lead, and an independent research consultant on the Mississippi River Basin Disasters project. You can contact Cruz at cruzd@uah.edu.