A Global Search Engine For Geospatial Data

If you’re a scientist or engineer cobbling together a geospatial project (say, trying to figure out how many people would be threatened by a tsunami in the Indian Ocean), a truism holds that you spend 80 percent of your time hunting down usable data. The data, when they exist at all, are often archived in incompatible formats, have varying degrees of accuracy and precision, and sometimes require a good deal of political savvy to find.

Yuri Gorokhovich is an assistant professor at the State University of New York at Purchase who has been investigating tsunami damage in Southeast Asia. Getting what he needed meant negotiating with the Indonesian government, agreeing to pay US $4500 for the required data, and identifying the one and only person who could authorize the transfer. Even then, in order to develop a model identifying how many people lived in the areas directly hit by the December 2004 tsunami, Gorokhovich had to secretly get classified government data smuggled over by foreign colleagues.

Making such work as simple as a Web search is the central objective of the Global Earth Observation System of Systems (GEOSS), an endeavor taking its first baby steps this summer. The system’s architects are compiling what is essentially a search engine for environmental data, including not just data from Earth-observing satellites but also terrestrial sensor data, population figures, and regional health and ecosystem information. Formatting and indexing the data, designing a portal, and creating a standards repository are the fundamental, if humdrum, components upon which hinge the lofty goals of GEOSS: to improve environmental models and forecasting.

“The seismic community, the solid earth guys, the weather folks, the climate folks, they all speak different languages,” says Jay Pearlman, a chief engineer at Boeing, in Seattle, and chair of the IEEE Committee on Earth Observations. Finding ways to enable those disparate communities to use the same data has been a mammoth task since GEOSS was conceived in 2003. According to Pearlman, a GEOSS portal and data clearinghouse are expected to launch by November, just ahead of a ministerial summit that month in Cape Town, which will bring together high-level delegates from all 70 contributing countries.

The implications are not just humanitarian and scientific, but commercial as well. Right now, a Google Earth mash-up can locate, say, all ice cream carts in Moscow; with GEOSS online, it may soon be possible to identify the places where the best golfing conditions will prevail five days from now. “This really is a quantum difference, not a matter of degree,” says George Percivall, the chief architect for the Open Geospatial Consortium, in Wayland, Mass.

Still, the main thrust of GEOSS is human-oriented science. Geospatial coordinates alone can vary tremendously depending on how scientists in disparate disciplines record the locations of observations, sometimes to the point of rendering the data unusable. “Around the world there’s hundreds, if not thousands, of ways people use to specify location,” says Siri Jodha Singh Khalsa, the IEEE Committee on Earth Observations’ vice chairman for standards. Location coordinates are made relative to a particular model of the Earth, for example, and different scientific communities use different models.
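
To make the point about Earth models concrete, here is a minimal sketch in Python. It converts the same latitude and longitude to Earth-centered Cartesian coordinates under two reference ellipsoids and measures how far apart the resulting points are. The function names and the choice of ellipsoids (WGS 84 and Krassovsky 1940) are illustrative assumptions, not anything GEOSS prescribes, and the sketch ignores the origin shifts and rotations that a real datum conversion also involves.

```python
import math

# Reference ellipsoids: (semi-major axis in meters, inverse flattening).
# The two chosen here are illustrative; many others are in use.
ELLIPSOIDS = {
    "WGS84": (6378137.0, 298.257223563),
    "Krassovsky1940": (6378245.0, 298.3),
}


def geodetic_to_ecef(lat_deg, lon_deg, height_m, ellipsoid):
    """Convert latitude/longitude/height to Earth-centered Cartesian x, y, z."""
    a, inv_f = ELLIPSOIDS[ellipsoid]
    f = 1.0 / inv_f
    e2 = f * (2.0 - f)  # first eccentricity squared
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude.
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + height_m) * math.sin(lat)
    return x, y, z


def ellipsoid_mismatch_m(lat_deg, lon_deg, first, second):
    """Distance in meters between one lat/lon read on two different ellipsoids.

    Assumption: both ellipsoids share the same center and orientation,
    which real datums generally do not.
    """
    p = geodetic_to_ecef(lat_deg, lon_deg, 0.0, first)
    q = geodetic_to_ecef(lat_deg, lon_deg, 0.0, second)
    return math.dist(p, q)


if __name__ == "__main__":
    # A point on the equator, chosen only for illustration.
    print(ellipsoid_mismatch_m(0.0, 0.0, "WGS84", "Krassovsky1940"))
```

On the equator the gap is simply the difference between the two semi-major axes, 108 meters, which is already enough to put an observation on the wrong side of a shoreline.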

Although the more standardized observations from satellites help, taking data from space introduces other problems. The geographical coordinates for observations made on a moving platform are inherently less precise. Add to that the fact that many measurements are inferred from other properties (temperature data, for instance, come from infrared readings), and it’s easy to see why space data are considered less reliable until they have been validated by overlapping observations made on Earth.

In the case of the Indonesian tsunami estimates, Gorokhovich had developed a model from satellite observations of how far inland damage had gone, but he needed to verify it. Eventually, he says, he “got lucky” and met someone who had mapped the locations of displaced refugees while traveling through the country.

Data from ground-based sensors, for their part, are less likely to be well indexed or to use standardized representations such as those based on XML, the mark-up language commonly used on the Internet. That makes it harder for researchers to locate and use the data.

If all relevant sources of Earth-based information could be logically connected and recorded in well-documented formats, life would be a lot easier for modelers.

Another goal of GEOSS’s architects is to persuade national governments to make more data freely available. Some countries restrict access to their space data more tightly than others, and the availability of any one measurement can vary from country to country. NASA has made elevation data for the United States available at 30-meter resolution, but data for the rest of the world, generated by the same satellite mission, are released only at 90-meter resolution.
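
What the coarser release costs a modeler can be sketched in a few lines of Python. Each 90-meter cell covers the same ground as nine 30-meter cells, so the example below collapses every 3-by-3 block of a fine grid into one averaged value. Block averaging and the function name are assumptions made for illustration; the article does not say how the 90-meter product is actually derived.

```python
def block_average(grid, factor=3):
    """Coarsen a 2-D elevation grid by averaging factor-by-factor blocks.

    Illustrative only: rows and columns that do not fill a whole block
    are dropped, and no claim is made that any agency uses this method.
    """
    rows = len(grid) // factor
    cols = len(grid[0]) // factor
    coarse = []
    for r in range(rows):
        coarse_row = []
        for c in range(cols):
            # Gather the fine cells that fall inside this coarse cell.
            cells = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            coarse_row.append(sum(cells) / len(cells))
        coarse.append(coarse_row)
    return coarse


if __name__ == "__main__":
    # Made-up elevations in meters: a 10-meter ridge one cell wide.
    fine = [
        [0, 10, 0],
        [0, 10, 0],
        [0, 10, 0],
    ]
    print(block_average(fine))
```

In this made-up grid, a ridge one cell wide disappears into a single averaged value, which is the kind of detail a coastal flooding model would want to keep.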

In Europe, despite all the talk in Brussels of transparency, satellite data are even less freely available. According to Khalsa, the European Space Agency in Paris is very guarded with its satellite records, which it generally releases only to approved European Union researchers. “You have to go through special approval processes to get their data,” he says.

© 2007 IEEE. Reprinted with permission from IEEE Spectrum, Volume 44 Number 8.
