Open Access Data Descriptor
Technical Guidelines to Extract and Analyze VGI from Different Platforms
Data 2016, 1(3), 15; doi:10.3390/data1030015 (registering DOI)
Abstract
A growing number of Volunteered Geographic Information (VGI) and social media platforms continue to expand, providing massive amounts of georeferenced data in many forms, including textual information, photographs, and geoinformation. These georeferenced data have either been actively contributed (e.g., adding data to OpenStreetMap (OSM) or Mapillary) or collected in a more passive fashion by enabling geolocation whilst using an online platform (e.g., Twitter, Instagram, or Flickr). The benefit of scraping and streaming these data in stand-alone applications is evident; however, it is difficult for many users to script and scrape the diverse types of these data. On 14 June 2016, a pre-conference workshop called “LINK-VGI: LINKing and analyzing VGI across different platforms” was held at the AGILE 2016 conference in Helsinki, Finland. The workshop provided an opportunity for interested researchers to share ideas and findings on cross-platform data contributions. One portion of the workshop was dedicated to a hands-on session, which illustrated the basics of spatial data access through selected Application Programming Interfaces (APIs) and the extraction of summary statistics from the results. This paper presents the content of the hands-on session, including the scripts and guidelines for extracting VGI data. Researchers, planners, and interested end-users can use this paper to develop their own applications for any region of the world. Full article
Open Access Data Descriptor
688,112 Statistical Results: Content Mining Psychology Articles for Statistical Test Results
Data 2016, 1(3), 14; doi:10.3390/data1030014
Abstract
In this data deposit, I describe a dataset that is the result of content mining 167,318 published articles for statistical test results reported according to the standards prescribed by the American Psychological Association (APA). Articles published by the APA, Springer, Sage, and Taylor & Francis were included (mining from Wiley and Elsevier was actively blocked). As a result of this content mining, 688,112 results from 50,845 articles were extracted. In order to provide a comprehensive set of data, the statistical results are supplemented with metadata from the article they originate from. The dataset is provided as a comma-separated values (CSV) file in long format. For each of the 688,112 results, 20 variables are included, of which seven are article metadata and 13 pertain to the individual statistical results (e.g., reported and recalculated p-value). A five-pronged approach was taken to generate the dataset: (i) collect journal lists; (ii) spider journal pages for articles; (iii) download articles; (iv) add article metadata; and (v) mine articles for statistical results. All materials, scripts, etc. are available at https://github.com/chartgerink/2016statcheck_data and preserved at http://dx.doi.org/10.5281/zenodo.59818. Full article
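Step (v) above, mining article text for APA-style results, can be illustrated with a minimal sketch. The regular expression, the function name `extract_t_tests`, and the sample sentence are all hypothetical; this is not the pattern used by the dataset's own scripts, and it covers only t-tests of the form "t(28) = 2.20, p = .03".

```python
import re

# Hypothetical pattern for APA-style t-test reports, e.g. "t(28) = 2.20, p = .03".
# Illustration only; not the pattern used by the dataset's scripts.
APA_T = re.compile(
    r"\bt\s*\(\s*(?P<df>\d+(?:\.\d+)?)\s*\)\s*=\s*(?P<value>-?\d*\.?\d+)\s*,\s*"
    r"p\s*(?P<cmp>[<>=])\s*(?P<p>\d?\.\d+)"
)


def extract_t_tests(text):
    """Return one dict per APA-style t-test result found in text."""
    results = []
    for m in APA_T.finditer(text):
        results.append({
            "statistic": "t",
            "df": float(m.group("df")),
            "value": float(m.group("value")),
            "comparison": m.group("cmp"),
            "reported_p": float(m.group("p")),
        })
    return results


# Made-up sentence used only to exercise the pattern.
sample = "The effect was significant, t(28) = 2.20, p = .03, but not t(14) = -0.85, p > .05."
rows = extract_t_tests(sample)
for row in rows:
    print(row)
```

Each extracted row corresponds to one line of a long-format table, to which article metadata could then be joined.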
Open Access Data Descriptor
A New Integrated High-Latitude Thermal Laboratory for the Characterization of Land Surface Processes in Alaska’s Arctic and Boreal Regions
Data 2016, 1(2), 13; doi:10.3390/data1020013
Abstract
Alaska’s Arctic and boreal regions, largely dominated by tundra and boreal forest, are witnessing unprecedented changes in response to climate warming. However, the intensity of feedbacks between the hydrosphere and vegetation changes is not yet well quantified in Arctic regions. This lends considerable uncertainty to the prediction of how much, how fast, and where Arctic and boreal hydrology and ecology will change. With a very sparse network of observations (meteorological, flux towers, etc.) in the Alaskan Arctic and boreal regions, remote sensing is the only technology capable of providing the necessary quantitative measurements of land–atmosphere exchanges of water and energy at regional scales in an economically feasible way. Over the last decades, the University of Alaska Fairbanks (UAF) has become the research hub for high-latitude research. UAF’s newly established Hyperspectral Imaging Laboratory (HyLab) currently provides multiplatform data acquisition, processing, and analysis capabilities spanning microscale laboratory measurements to macroscale analysis of satellite imagery. The specific emphasis is on acquiring and processing satellite and airborne thermal imagery, one of the most important sources of input data in models for the derivation of surface energy fluxes. In this work, we present a synergistic modeling framework that combines multiplatform remote sensing data and calibration/validation (CAL/VAL) activities for the retrieval of land surface temperature (LST). The LST Arctic Dataset will contribute to ecological modeling efforts to help unravel seasonal and spatio-temporal variability in land surface processes and vegetation biophysical properties in Alaska’s Arctic and boreal regions. This dataset will be expanded to other Alaskan Arctic regions, and is expected to have more than 500 images spanning from 1984 to 2012. Full article
Open Access Data Descriptor
A Spectral Emissivity Library of Spoil Substrates
Data 2016, 1(2), 12; doi:10.3390/data1020012
Abstract
Post-mining sites have a significant impact on surrounding ecosystems. Afforestation can restore these ecosystems, but its success and speed depend on the properties of the excavated spoil substrates. Thermal infrared remote sensing brings advantages to the mapping and classification of spoil substrates, resulting in the determination of their properties. A library of spoil substrates containing spectral emissivity and chemical properties can facilitate remote sensing activities. This study presents a spectral library of the emissivities of spoil substrates extracted from brown coal mining sites in the Czech Republic. Extracted samples were homogenized by drying and sieving. The spectral emissivity of each sample was determined by a spectral smoothing algorithm applied to data measured by a Fourier transform infrared (FTIR) spectrometer. A set of chemical parameters (pH, conductivity, Na, K, Al, Fe, loss on ignition, and polyphenol content) and toxicity were determined for each sample as well. The spectral library presented in this paper also provides the geographical coordinates of the locations where samples were obtained. The presented data are unique in nature and can serve many remote sensing activities in the longwave infrared region of the electromagnetic spectrum. Full article
Open Access Article
Data Always Getting Bigger—A Scalable DOI Architecture for Big and Expanding Scientific Data
Data 2016, 1(2), 11; doi:10.3390/data1020011
Abstract
The Atmospheric Radiation Measurement (ARM) Data Archive established a data citation strategy based on Digital Object Identifiers (DOIs) for the ARM datasets in order to facilitate citing continuous and diverse ARM datasets in articles and other papers. This strategy eases the tracking of data provided as supplements to articles and papers. Additionally, it allows future data users and the ARM Climate Research Facility to easily locate the exact data used in various articles. Traditionally, DOIs are assigned to individual digital objects (a report or a data table), but for ARM datasets, these DOIs are assigned to an ARM data product. This eliminates the need for creating DOIs for numerous components of the ARM data product, in turn making it easier for users to manage and cite the ARM data with fewer DOIs. In addition, the ARM data infrastructure team, with input from scientific users, developed a citation format and an online data citation generation tool for continuous data streams. This citation format includes DOIs along with additional details such as spatial and temporal information. Full article
Open Access Article
Permanent Stations for Calibration/Validation of Thermal Sensors over Spain
Data 2016, 1(2), 10; doi:10.3390/data1020010
Abstract
The Global Change Unit (GCU) at the University of Valencia has been involved in several calibration/validation (cal/val) activities carried out in dedicated field campaigns organized by ESA and other organizations. However, permanent stations are required in order to ensure long-term and continuous calibration of on-orbit sensors. In the framework of the CEOS-Spain project, the GCU has managed the set-up and launch of experimental sites in Spain for the calibration of thermal infrared sensors and the validation of Land Surface Temperature (LST) products derived from those data. Currently, three sites have been identified and equipped: the agricultural area of Barrax (39.05 N, 2.1 W), the marshland area in the National Park of Doñana (36.99 N, 6.44 W), and the semi-arid area of the National Park of Cabo de Gata (36.83 N, 2.25 W). This work presents the performance of the permanent stations installed over the different test areas, as well as the cal/val results obtained for a number of Earth Observation sensors: SEVIRI, MODIS, and TIRS/Landsat-8. Full article
Open Access Data Descriptor
MODIS-Based Monthly LST Products over Amazonia under Different Cloud Mask Schemes
Data 2016, 1(2), 2; doi:10.3390/data1020002
Abstract
One of the major problems in the monitoring of tropical rainforests using satellite imagery is their persistent cloud coverage. The use of daily observations derived from high temporal resolution sensors, such as Moderate Resolution Imaging Spectroradiometer (MODIS), could potentially help to mitigate this issue, increasing the number of clear-sky observations. However, the cloud contamination effect should be removed from these results in order to provide a reliable description of these forests. In this study the available MODIS Land Surface Temperature (LST) products have been reprocessed over the Amazon Basin (10 N–20 S, 80 W–45 W) by introducing different cloud masking schemes. The monthly LST datasets can be used for the monitoring of thermal anomalies over the Amazon forests and the analysis of spatial patterns of warming events at higher spatial resolutions than other climatic datasets. Full article
Open Access Data Descriptor
A 1973–2008 Archive of Climate Surfaces for NW Maghreb
Data 2016, 1(2), 8; doi:10.3390/data1020008
Abstract
Climate archives are time series. They are used to assess temporal trends of a climate-dependent target variable, and to make climate atlases. A high-resolution gridded dataset with 1728 layers of monthly mean maximum, mean and mean minimum temperatures and precipitation for the NW Maghreb (28°N–37.3°N, 12°W–12°E, ~1-km resolution) from 1973 through 2008 is presented. The surfaces were spatially interpolated by ANUSPLIN, a thin-plate smoothing spline technique approved by the World Meteorological Organization (WMO), from georeferenced climate records drawn from the Global Surface Summary of the Day (GSOD) and the Global Historical Climatology Network-Monthly (GHCN-Monthly version 3) products. Absolute errors for surface temperatures are approximately 0.5 °C for mean and mean minimum temperatures, and peak up to 1.76 °C for mean maximum temperatures in summer months. For precipitation, the mean absolute error ranged from 1.2 to 2.5 mm, but very low summer precipitation caused relative errors of up to 40% in July. The archive successfully captures climate variations associated with large to medium geographic gradients. This includes the main aridity gradient which increases in the S and SE, as well as its breaking points, marked by the Atlas mountain range. It also conveys topographic effects linked to kilometric relief mesoforms. Full article
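The figure of 1728 layers is consistent with one layer per variable per month over the stated period. The short check below assumes exactly that layout; the variable labels are illustrative and are not names taken from the dataset.

```python
# Inclusive year range stated in the abstract (1973 through 2008).
years = range(1973, 2008 + 1)
months_per_year = 12

# Four variables named in the abstract; these labels are illustrative only.
variables = ["tmax_mean", "t_mean", "tmin_mean", "precipitation"]

# Assumption: one gridded layer per variable per month.
layers = len(years) * months_per_year * len(variables)
print(layers)  # 1728
```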
Open Access Data Descriptor
The LAB-Net Soil Moisture Network: Application to Thermal Remote Sensing and Surface Energy Balance
Data 2016, 1(1), 6; doi:10.3390/data1010006
Abstract
A set of Essential Climate Variables (ECV) has been defined to be monitored by current and new remote sensing missions. The ECV retrieved at global scale need to be validated in order to provide reliable products for remote sensing applications. For this, test sites are required for the calibration and validation of remote sensing approaches in order to improve the ECV retrievals at global scale. The southern hemisphere has few test sites for calibration and validation field campaigns that focus on soil moisture and land surface temperature retrievals. In Chile, remote sensing applications related to soil moisture estimates have increased during the last decades because drought and water use conflicts have generated a strong interest in improved water demand estimates. This work describes the Laboratory for Analysis of the Biosphere (LAB)-NETwork, hereinafter called ‘LAB-net’, which was designed to be the first network in Chile for remote sensing applications. The network comprises four test sites with different cover types: vineyards and olive orchards located in the semi-arid region of Atacama, an irrigated raspberry crop in the Mediterranean climate zone of Chimbarongo, and a rainfed pasture in the south of Chile. At each site, meteorological and radiative flux instrumentation was installed and continuously recorded the following parameters: soil moisture and temperature at two ground levels (10 and 20 cm), air temperature and relative humidity, net radiation, global radiation, radiometric temperature (8–14 µm), rainfall, and soil heat flux. The LAB-net database post-processing procedure is also described here. As an application, surface remote sensing products such as soil moisture data derived from the Soil Moisture Ocean Salinity (SMOS) mission, Land Surface Temperature (LST) extracted from MODIS-MOD11A1, and GOES LST from Copernicus products were compared to in situ data at the Oromo LAB-net site. Moreover, land surface energy flux estimation is also shown as an application of the LAB-net database. These applications revealed good agreement between in situ and remote sensing data. The LAB-net database also provides suitable information for the land surface energy budget and, therefore, for water resources management at cultivar scale. The database generated by LAB-net is freely available for any research or scientific purpose related to current and future remote sensing applications. Full article
Open Access Data Descriptor
A MODIS/ASTER Airborne Simulator (MASTER) Imagery for Urban Heat Island Research
Data 2016, 1(1), 7; doi:10.3390/data1010007
Abstract
Thermal imagery is widely used to quantify land surface temperatures to monitor the spatial extent and thermal intensity of the urban heat island (UHI) effect. Previous research has applied Landsat images, Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) images, Moderate Resolution Imaging Spectroradiometer (MODIS) images, and other coarse- to medium-resolution remotely sensed imagery to estimate surface temperature. These data are frequently correlated with vegetation, impervious surfaces, and temperature to quantify the drivers of the UHI effect. Because of the coarse to medium resolution of the thermal imagery, researchers are unable to correlate these temperature data with the more generally available high-resolution land cover classifications, which are derived from high-resolution multispectral imagery. The development of advanced thermal sensors with very high-resolution thermal imagery, such as the MODIS/ASTER airborne simulator (MASTER), allows investigators to quantify the relationship between detailed land cover and land surface temperature. While this is an obvious next step, in the published literature MASTER data are often used to discriminate burned areas, assess fire severity, and classify urban land cover. Considerably less attention has been given to using MASTER data in UHI research. We demonstrate here that MASTER data, in combination with high-resolution multispectral data, make it possible to monitor and model the relationship between temperature and detailed land cover such as building rooftops, residential street pavements, and parcel-based landscaping. Here, we report on data sources for conducting this type of UHI research and aim to encourage researchers and scientists to use high-resolution airborne thermal imagery to further explore the UHI effect. Full article
Open Access Data Descriptor
Open-Access Geographic Data for the Argali Habitat in the Southeastern Tajik Pamirs
Data 2016, 1(1), 5; doi:10.3390/data1010005
Abstract
Seven Geographic Information System (GIS) layers comprise this dataset intended for understanding the Marco Polo argali habitat in the southeastern Tajikistan Pamirs (37°33′ N, 74°09′ E). Extensive remote sensing habitat data processing and field data analysis of the Marco Polo sheep study area have yielded these layers that are now available online to download and for use by other researchers interested in studying the argali patterns and habitat suitability in the southeastern Tajik Pamirs. It is important to note that the layers were generated using a 30-m Landsat ETM image and field data from 2012. Full article
Open Access Article
Open Access Article Processing Charges (OA APC) Longitudinal Study 2015 Preliminary Dataset
Data 2016, 1(1), 4; doi:10.3390/data1010004
Abstract
This article documents the Open access article processing charges (OA APC) longitudinal study 2015 preliminary dataset, available for download from the OA APC dataverse [1]. This dataset was gathered as part of Sustaining the Knowledge Commons (SKC), a research program funded by Canada’s Social Sciences and Humanities Research Council. The overall goal of SKC is to advance our collective knowledge about how to transition scholarly publishing from a system dependent on subscriptions and purchase to one that is fully open access. The OA APC preliminary data 2015 Version 12 dataset was developed as one of the lines of research of SKC, a longitudinal study of the minority (about a third) of the fully open access journals that use this business model. The original idea was to gather data during an annual two-week census period. The volume of data and growth in this area make this an impractical goal. For this reason, we are posting this preliminary dataset in case it might be helpful to others working in this area. Future data gathering and analyses will be conducted on an ongoing basis. We encourage others to share their data as well. Note that the two most critical elements for matching data and merging datasets are the journal title and ISSN. Full article
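As a minimal sketch of matching on journal title and ISSN, the snippet below joins two small record lists on a normalized (title, ISSN) key. The field names (`title`, `issn`, `apc_2014`, `apc_2015`), the normalization rules, and the sample rows are all hypothetical placeholders; they are not taken from the OA APC dataset.

```python
def normalize_key(title, issn):
    """Build a matching key from journal title and ISSN (normalization rules are assumptions)."""
    clean_title = " ".join(title.lower().split())       # lowercase, collapse whitespace
    clean_issn = issn.replace("-", "").strip().upper()  # drop hyphen, keep a trailing X uppercase
    return (clean_title, clean_issn)


def merge_by_journal(left, right):
    """Inner-join two lists of dicts on the normalized (title, ISSN) key."""
    index = {normalize_key(r["title"], r["issn"]): r for r in right}
    merged = []
    for row in left:
        match = index.get(normalize_key(row["title"], row["issn"]))
        if match is not None:
            # Fields from `left` take precedence when both sides share a name.
            merged.append({**match, **row})
    return merged


# Placeholder records, invented purely for illustration.
left = [
    {"title": "Example Journal A", "issn": "1234-5678", "apc_2015": 1000},
    {"title": "Example Journal B", "issn": "0000-0000", "apc_2015": 500},
]
right = [
    {"title": "example  journal a", "issn": "12345678", "apc_2014": 900},
]

merged = merge_by_journal(left, right)
print(merged)
```

Requiring both fields to match is the stricter choice; matching on ISSN alone would tolerate title variants but would miss journals whose ISSN is absent or has changed.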
Open Access Data Descriptor
A Unified Cropland Layer at 250 m for Global Agriculture Monitoring
Data 2016, 1(1), 3; doi:10.3390/data1010003
Abstract
Accurate and timely information on the global cropland extent is critical for food security monitoring, water management, and earth system modeling. Principally, it allows for analyzing satellite image time-series to assess crop conditions and permits isolation of the agricultural component to focus on food security and the impacts of various climatic scenarios. However, despite the critical importance of accurate information on spatial extent, cropland mapping with remote sensing imagery remains a major challenge. Following an exhaustive identification and collection of existing land cover maps, a multi-criteria analysis was designed at the country level to evaluate the fitness of a cropland map with regard to four dimensions: its timeliness, its legend, its resolution adequacy, and its confidence level. As a result, a Unified Cropland Layer that combines the fittest products into a 250 m global cropland map was assembled. With an evaluated accuracy ranging from 82% to 95%, the Unified Cropland Layer successfully improved the accuracy compared to single global products. Full article
Open Access Editorial
Journal Data: A New Platform for Data Research
Data 2016, 1(1), 9-10; doi:10.3390/data1010009
Abstract There is no doubt that data is of paramount importance to scientific progress. With the enormous development of science and technology, a huge amount of data fragments is produced every day. [...] Full article
Open Access Data Descriptor
Relational Data on Members of Portuguese Governments (1976–2014)
Data 2016, 1(1), 1-8; doi:10.3390/data1010001
Abstract
A data set containing information on the explicit connections concerning all members of Portuguese governments from 1976 until July 2013 is presented. This information was collected through one year of research carried out by the authors using public records and official information (public and private institutions). The data set was collected while writing a book [1]. This database is the first open-access source of information on a specific type of community, and it enables a wide range of research in areas such as social and political sciences and economics. Full article