Earth sciences data is everywhere — but how do you put that data to work in a world that doesn’t know the boundaries between different scientific pursuits? That question has long been problematic for researchers in the Earth sciences. Many different Earth observation programs collect data, but collecting, sharing, and understanding that data in a multi-disciplinary way has been difficult, to say the least.
Understanding global environmental problems will take input from a myriad of sources, from social sciences to Earth science data.
“Nature doesn’t realize that fire, and water and drought are different disciplines,” said Dr. Jay Pearlman, an IEEE fellow. Now data gathered across disciplines is working harder than ever to erase boundaries, with online tools that make it easy to search and interpret multiple sources of information.
GEOSS, the Global Earth Observation System of Systems, is working to combat the inherent difficulties with gathering and interpreting data. GEOSS keeps an eye on what’s going on all over the Earth through information gathered by tens of thousands of sensors and satellites, and is supported by nations and scientific organizations. It links together existing systems and provides support for new ones.
Once that multitude of Earth data is analyzed, it needs to be shared with others. But in Europe alone, 23 languages and different sets of semantics have meant that sharing was difficult. That’s why EuroGEOSS was born three years ago.
EuroGEOSS builds on the achievements of the Infrastructure for Spatial Information in Europe (INSPIRE), a European directive providing the legal framework, technical guidelines, and specifications for shared data infrastructures that deal with environmental issues. The legislation requires all 27 nations of the European Union to “ensure that the spatial data infrastructures of the member states are compatible and usable in a community and transboundary context.”
Dr. Stefano Nativi, a coordinator for the Earth and Space Science Informatics Laboratory at the National Research Council of Italy, said that three areas of research were chosen specifically because they didn’t heavily interact with each other: biodiversity, drought and forestry. “With those, we wanted to show the potential a brokered solution could have to develop synergies,” said Nativi.
Each community in biodiversity, forestry and drought had created its own cyber infrastructure. “Our point was that we don’t have to push them towards any place technologically, but we have to ask them to implement all the technical artifacts they consider useful,” said Nativi. With three different cyber infrastructures in hand, the team realized that there were some real differences. For example, for historical and technological reasons, the communities used different data models shared through diverse Web-based protocols.
“So our question was, OK, how do we interconnect these systems, making them clearly interoperable?” asked Nativi. “The point was, let’s make some glue to stick everything together.”
The result was a 2009 project to create an online software platform called the EuroGEOSS Broker. There are several parts to the brokering platform. The first, the discovery broker, searches multiple shared infrastructures for specific terms, time, and geography. The semantic broker translates concepts across different disciplines. The access broker retrieves information and translates it into a common grid environment, matching up different systems of geographical coordinates. The beauty of brokering is that users and programmers can focus on what they do best, while the brokering layer takes care of the many protocols and encodings.
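To make the brokering idea concrete, here is a minimal sketch in Python. It is a toy illustration and not the EuroGEOSS Broker's actual code or API: the two catalogs, their field names, the `Record` shape, the `RELATED_TERMS` concept map, and the `discover` function are all hypothetical. It only shows the general pattern described above, in which small adapters hide each community's data model and a simple concept map stands in for the semantic broker, so that a single search by terms, time, and geography can span every catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Common record shape the (hypothetical) broker hands back to users."""
    title: str
    year: int
    lon: float
    lat: float
    source: str


# Two made-up catalogs, each with its own field names and conventions.
FOREST_CATALOG = [
    {"name": "Burnt area map, Spain", "yr": 2009, "x": -3.7, "y": 40.4},
    {"name": "Forest cover, Finland", "yr": 2005, "x": 25.0, "y": 62.0},
]
DROUGHT_CATALOG = [
    # This catalog stores an ISO date string and a (lat, lon) pair.
    {"title": "Precipitation deficit index, Iberia",
     "date": "2009-06-01", "pos": (40.0, -4.0)},
]


def forest_adapter(raw):
    """Map a forest-catalog entry onto the common Record shape."""
    return Record(raw["name"], raw["yr"], raw["x"], raw["y"], "forest")


def drought_adapter(raw):
    """Map a drought-catalog entry onto the common Record shape."""
    lat, lon = raw["pos"]
    return Record(raw["title"], int(raw["date"][:4]), lon, lat, "drought")


# Stand-in for the semantic broker: one concept, several community terms.
RELATED_TERMS = {
    "fire": {"fire", "burnt"},
    "drought": {"drought", "precipitation deficit"},
}

# Each source is a catalog paired with the adapter that understands it.
SOURCES = [
    (FOREST_CATALOG, forest_adapter),
    (DROUGHT_CATALOG, drought_adapter),
]


def expand(term):
    """Return the set of lowercase keywords related to a search term."""
    return RELATED_TERMS.get(term.lower(), {term.lower()})


def discover(terms, years, bbox):
    """Search every catalog by terms, year range, and bounding box.

    years is (first, last), inclusive; bbox is
    (min_lon, min_lat, max_lon, max_lat) in degrees.
    """
    keywords = set().union(*(expand(t) for t in terms))
    first, last = years
    min_lon, min_lat, max_lon, max_lat = bbox
    hits = []
    for catalog, adapter in SOURCES:
        for raw in catalog:
            rec = adapter(raw)
            in_time = first <= rec.year <= last
            in_area = (min_lon <= rec.lon <= max_lon
                       and min_lat <= rec.lat <= max_lat)
            on_topic = any(k in rec.title.lower() for k in keywords)
            if in_time and in_area and on_topic:
                hits.append(rec)
    return hits


if __name__ == "__main__":
    # One query over both catalogs: fire and drought data for Iberia.
    for rec in discover(["fire", "drought"], (2008, 2010),
                        (-10.0, 35.0, 5.0, 45.0)):
        print(rec.source, "|", rec.title)
```

In this sketch, adding a new community means writing one more adapter and registering it in `SOURCES`. Neither the data provider nor the person searching has to change anything, which is the division of labor the brokering approach aims for.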
Pearlman said that developing common language and common geographic descriptions was challenging. “We found that the coordinate systems used to describe forests vary across Europe so overlay of data takes extra work. Even the use of words is not the same in different communities,” he said. The team was able to develop a translation between coordinate systems and implement access to ontologies that allow diverse research communities to understand each other and work more easily together.
Pearlman says that the EuroGEOSS broker is different from other search tools because it reduces the burdens on users and providers while facilitating interoperability and access to data. This allows science teams to spend more time on their research and less on manipulating infrastructure.
Max Craglia, EuroGEOSS technical lead, said that the brokering platform enabled new interdisciplinary research findings. For example, an analysis of the relationship between drought and forest fires in Spain shows that when researchers were able to access and interpret many sets of data, they could make a non-intuitive observation: Short-term droughts don’t have an impact on forest fires, but droughts lasting two years cause greater stresses on vegetation, which makes it fire-prone. Likewise, the broker has helped researchers understand biodiversity problems in Africa without having to download data or models.
Users have flocked to the online tool. The broker was praised as the most significant innovation introduced in the GEOSS Common Infrastructure in 2011, making it possible to increase the number of data sets that are searchable from a few hundred in 2010 to more than 25 million in 2011.
Ben Domenico, outreach coordinator at Unidata Program Center in Colorado, said the brokering platform has helped his group to incorporate metadata into general data search without having to develop or implement a discovery system on their own. “We can focus on our area of expertise, providing real-time datasets, and let the brokering layer connect us into data discovery systems so that users can find our data,” said Domenico.
Moving forward, globally
Although the three-year term of EuroGEOSS ended on April 30, the tools live on and are sparking other global projects. The EuroGEOSS Broker became part of the GEOSS Common Infrastructure (GCI) and the National Research Council of Italy has committed to maintain it until the end of 2015. The broker is being used in a series of demonstrations with different communities, such as the United Nations Environment Programme.
The broker is also being incorporated into two new projects, EarthCube and GEOWOW. EarthCube was launched in 2011 by the National Science Foundation to create a comprehensive cyberinfrastructure across the geosciences. EarthCube is examining ways for multiple data systems to work together as one from a user perspective, not only in accessing data but also in providing models, forecasts, and possible scenarios. Likewise, ocean ecosystems and water runoff are the topics of the European Union’s GEOWOW project, which was launched in September 2011. The EuroGEOSS broker is the core of the first concept of brokering for complex (global and multidisciplinary) environments, said Nativi.
The team is now expanding the project’s range by integrating social media tools like Twitter and Facebook into data search and access. There’s even a YouTube video about biodiversity and GEOSS that shows how the broker’s processing of information can get data into the hands of those who need it. “If you get a result in Europe, colleagues in Africa can immediately replicate the results without having to download the data,” Craglia said.
Nativi points to areas where the platform could be expanded further. “I really would like to see a brokering service layer that could be developed by cloud technology so that scientists running their own systems wouldn’t have to think at all about interoperability,” he says, explaining that diverse researchers could plug their system into the cloud, and then the cloud can take over the role of interconnecting and translating.
Nativi says he has been surprised by the interest in sharing that has emerged in the past few years. “The new approaches to sharing data are encouraging. The concept of working together is much more accepted than in past years,” he says, adding that the research community also understands that sharing makes everyone’s science better. The natural world doesn’t recognize the boundaries of scientific disciplines, and now the data-rich research world is getting closer to erasing those boundaries as well.
Katharine Gammon is a freelance science writer living in Santa Monica, California. She has written about data-driven research, three-dimensional printing in ice, and elevators for fish.