By Rebekah Esmaili
University of Maryland, College Park, U.S.
The lone genius and the mad scientist are archetypes that define scientists in popular culture. In reality, scientists are international and interdisciplinary; we are at our best when we combine individual talents. Easy access to data makes these collaborations possible. Earth science is well within the “fourth paradigm” of research: an era driven by data, not equipment [1]. However, there are still challenges to data availability.
I became an “open data” advocate after a training event for young scientists in Brazil. I live in the U.S., but most attendees were from Central and South America. Over the first days, students were mostly spectators in the seminars, asking few questions. However, the tone changed dramatically during a data downloading tutorial. Students were abuzz with inquiries relevant to their research: “Can I download regional data instead of full resolution? What’s the difference between versions 6 and 7 of the dataset?”
Looking back, I realized I had taken my data access and resources for granted. My research would be more challenging (impossible!) without my computing resources at NASA. Limiting data access silences Earth scientists around the world. While this is not the sole reason for promoting open data, it is an overlooked one [2]. Improving the dissemination of data and analysis tools can advance science as much as innovations in satellite technology.
Though it is an oversimplification, researchers can be grouped into core and periphery countries (Figure 1). Core countries (with more than $150,000 of annual funding per researcher) have research budgets that tower over the rest of the world. They lead large remote sensing missions, like the Megha-Tropiques (France and India) satellite mission that is part of the Global Precipitation Measurement (U.S., Japan, European Union, Brazil, and more) constellation. Shared access to data and hardware results from international collaborations on these projects and can also spill over into neighboring nations.
On the flip side, most nations are on the periphery of these collaborations, cut off by low research and development expenditures (Figure 1). From available data, marginalized scientists (with less than $150,000 per researcher) make up 16 percent of the research population. Computing costs can be prohibitive for countries on the fringe (with less than $75,000 per researcher, 6 percent of the population). Periphery countries do not lack scientists. Rather, their projects are underfunded, which hinders data access, conference travel, project exposure, and collaboration opportunities. Do we truly have a good sampling of periphery research results, or even know what topics to investigate?
At the training event in Brazil, scientists advocated not only open data but also the distribution of processing tools [4, 5]. Periphery nations independently validate satellite data and develop regional models to address local concerns. Much of their expenditure goes to computing hardware. Being able to download smaller, regional subsets of spatial data would lower hardware requirements. Including tutorials, documentation, control files, or scripts to display data would prevent needless reinvention of the wheel. A unified Earth science repository could keep this information together and organized. Core research scientists should solicit help from periphery nations. Additionally, we should review research from periphery regions, not just articles in high-impact journals.
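To make the point about regional subsets concrete, here is a minimal sketch of what subsetting could look like on the user's side. Everything in it is an assumption for illustration: the 1-degree global grid, the rough South America bounding box, and the helper names `index_range` and `regional_subset` are hypothetical and do not come from any particular data center or dataset.

```python
import math


def index_range(first_center, step, count, lo, hi):
    """Half-open index range of cell centers that fall inside [lo, hi]."""
    start = max(0, math.ceil((lo - first_center) / step))
    stop = min(count, math.floor((hi - first_center) / step) + 1)
    return start, max(start, stop)


def regional_subset(grid, first_lat, first_lon, step, bbox):
    """Cut a lat-major grid (list of rows) down to bbox = (south, north, west, east)."""
    south, north, west, east = bbox
    r0, r1 = index_range(first_lat, step, len(grid), south, north)
    c0, c1 = index_range(first_lon, step, len(grid[0]), west, east)
    return [row[c0:c1] for row in grid[r0:r1]]


# Hypothetical 1-degree global grid: 180 rows by 360 columns,
# with cell centers starting at 89.5 S, 179.5 W.
global_grid = [[0.0] * 360 for _ in range(180)]

# Assumed, approximate bounding box around South America.
south_america_box = (-35.0, 6.0, -75.0, -33.0)

region = regional_subset(global_grid, -89.5, -179.5, 1.0, south_america_box)
rows = len(region)
cols = len(region[0]) if region else 0
fraction = rows * cols / (180 * 360)
print(f"{rows} x {cols} cells, {fraction:.1%} of the global grid")
```

Under these assumptions the regional file holds only a few percent of the cells in the global one. That is the kind of saving in download size and memory that matters when hardware is the main expense.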
It’s not all doom and gloom: Scientists in core research countries do take up projects concerning regions that are financially disadvantaged, especially those troubled by natural disasters. Many scientists disseminate source data [6], documentation [7], interactive visualizations, and scripts relevant to their research [8]. This is not simply a grassroots effort; big-budget research institutions are on board too. One progressive example is NASA’s Goddard Earth Sciences Data and Information Services Center. Users can download data (although only in full resolution) or explore it using the Giovanni online visualization tool [9]. This is a great step forward.
In Brazil, I observed that all scientists want to make a difference, but we often overlook assisting our own. We must help periphery scientists join those in core research countries in lending their voices to global debates, so that issues are examined from within rather than by outsiders. To make this possible in an unfair world, scientists on projects big and small must publish data and relevant tools. As satellites continue to be launched, we’ll have more data to work with. Giving everyone access can open up areas of research not yet envisioned and improve our response to flooding, droughts, and agricultural shifts under climate change. Our research community is like our beloved Earth: it is a system, and by examining all parts we better understand the whole.
References
[1] Hey, Anthony J. G., Stewart Tansley, and Kristin Michele Tolle (2009). The Fourth Paradigm: Data-intensive Scientific Discovery. Microsoft Research. Redmond, WA.
[2] McKee, Lance. 18 Reasons for Open Publication of Geoscience Data. Earthzine. https://earthzine.org/2010/08/04/18-reasons-for-open-publication-of-geoscience-data/.
[3] UNESCO Institute for Statistics. Data Centre: Public Reports on Science, Technology, and Education. [Online]. Available: http://www.uis.unesco.org/
[4] Dinku, Tufa. Use of Satellite Rainfall Estimates for Improving Climate Services in Africa. Presented at the 6th meeting of the International Precipitation Working Group, Sao Jose dos Campos, Brazil, October 2012.
[5] de Coning, Estelle. Satellite Products for Weather Analysis and Nowcasting in Africa. Presented at the 6th meeting of the International Precipitation Working Group, Sao Jose dos Campos, Brazil, October 2012.
[6] Dalia Kirschbaum Talks About Making a Global Landslide Inventory. NASA Earth Observatory. [Online]. Available: http://earthobservatory.nasa.gov/Features/Interviews/kirschbaum_20100623.php.
[7] Huffman, George. Data and Products. CGMS-IPWG. 03 Sept. 2013. [Online]. Available: http://www.isac.cnr.it/~ipwg/data.html.
[8] Guan, Bin. Bin Guan’s GrADS Script Library. Univ. of Maryland. [Online]. Available: http://www.atmos.umd.edu/~bguan/grads/GrADS_Scripts.htm
[9] Goddard Earth Sciences Data and Information Center. NASA. [Online]. Available: http://daac.gsfc.nasa.gov.