Table of Contents

Data, Volume 4, Issue 3 (September 2019)

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
Open Access Data Descriptor
NILMPEds: A Performance Evaluation Dataset for Event Detection Algorithms in Non-Intrusive Load Monitoring
Data 2019, 4(3), 127; https://doi.org/10.3390/data4030127 (registering DOI)
Received: 25 June 2019 / Revised: 27 July 2019 / Accepted: 21 August 2019 / Published: 24 August 2019
PDF Full-text (382 KB) | HTML Full-text | XML Full-text
Abstract
Datasets are important for researchers to build models and test how these perform, as well as to reproduce research experiments from others. This data paper presents the NILM Performance Evaluation dataset (NILMPEds), which is aimed primarily at research reproducibility in the field of non-intrusive load monitoring (NILM). This initial release of NILMPEds is dedicated to event detection algorithms and comprises ground-truth data for four test datasets, the specification of 47,950 event detection models, the power events returned by each model in the four test datasets, and the performance of each individual model according to 31 performance metrics. Full article
Open Access Article
A Novel Ensemble Neuro-Fuzzy Model for Financial Time Series Forecasting
Data 2019, 4(3), 126; https://doi.org/10.3390/data4030126 (registering DOI)
Received: 30 June 2019 / Revised: 15 August 2019 / Accepted: 20 August 2019 / Published: 23 August 2019
Viewed by 108 | PDF Full-text (1150 KB)
Abstract
Neuro-fuzzy models have a proven record of successful application in finance. Forecasting future values is a crucial element of successful decision making in trading. In this paper, a novel ensemble neuro-fuzzy model is proposed to overcome the limitations of, and improve on, the previously successfully applied five-layer multidimensional Gaussian neuro-fuzzy model and its learning. The proposed solution allows skipping the error-prone hyperparameter selection process and shows better accuracy on real-life financial data. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
Open Access Data Descriptor
Google Web and Image Search Visibility Data for Online Store
Data 2019, 4(3), 125; https://doi.org/10.3390/data4030125 (registering DOI)
Received: 17 July 2019 / Revised: 5 August 2019 / Accepted: 15 August 2019 / Published: 22 August 2019
Viewed by 219 | PDF Full-text (365 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
This data descriptor describes Google search engine visibility data. The visibility of a domain name in a search engine comes from search engine optimization and can be evaluated based on four data metrics and five data dimensions. The data metrics are the following: clicks volume (1), impressions volume (2), click-through ratio (3), and ranking position (4). The data dimensions are as follows: queries entered into search engines that trigger results with the researched domain name (1), page URLs from the researched domain that appear in the search engine results page (2), country of origin of search engine visitors (3), type of device used for the search (4), and date of the search (5). Search engine visibility data were obtained from the Google Search Console for an international online store that is visible in 240 countries and territories, over a period of 15 months. The data contain 123 K clicks and 4.86 M impressions for the web search and 22 K clicks and 9.07 M impressions for the image search. The proposed method for obtaining data can be applied in any other area, not only in the e-commerce industry. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
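The click-through ratio named in the abstract above is conventionally the number of clicks divided by the number of impressions. The following is a minimal sketch of that arithmetic applied to the rounded totals quoted in the abstract; the function name is hypothetical, and the dataset itself may report the ratio per query, page, country, device, or date rather than in aggregate.

```python
def click_through_ratio(clicks: int, impressions: int) -> float:
    """Return clicks divided by impressions (hypothetical helper)."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


# Rounded aggregate totals as quoted in the abstract (123 K, 4.86 M, 22 K, 9.07 M).
web_ctr = click_through_ratio(123_000, 4_860_000)
image_ctr = click_through_ratio(22_000, 9_070_000)

print(f"web search CTR:   {web_ctr:.2%}")
print(f"image search CTR: {image_ctr:.2%}")
```

Because the inputs are rounded, the resulting ratios are only approximate.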
Open Access Data Descriptor
A Dataset of Students’ Mental Health and Help-Seeking Behaviors in a Multicultural Environment
Received: 7 June 2019 / Revised: 14 August 2019 / Accepted: 17 August 2019 / Published: 21 August 2019
Viewed by 433 | PDF Full-text (3264 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
University students, especially international students, face a higher risk of mental health problems than the general population. However, the literature regarding the prevalence and determinants of mental health problems as well as help-seeking behaviors of international and domestic students in Japan seems to be limited. This dataset contains 268 records of depression, acculturative stress, social connectedness, and help-seeking behaviors reported by international and domestic students at an international university in Japan. One of the main findings that can be drawn from this dataset is how the levels of social connectedness and acculturative stress are predictive of the reported depression among international as well as domestic students. The dataset is expected to provide reliable material for further cross-cultural public health studies and policy-making in higher education. Full article
(This article belongs to the Special Issue Big Data and Digital Health)
Open Access Technical Note
dsCleaner: A Python Library to Clean, Preprocess and Convert Non-Intrusive Load Monitoring Datasets
Received: 1 July 2019 / Revised: 7 August 2019 / Accepted: 8 August 2019 / Published: 12 August 2019
Viewed by 330 | PDF Full-text (1284 KB) | HTML Full-text | XML Full-text
Abstract
Datasets play a vital role in data science and machine learning research as they serve as the basis for the development, evaluation, and benchmarking of new algorithms. Non-Intrusive Load Monitoring is one of the fields that has been benefiting from the recent increase in the number of publicly available datasets. However, there is a lack of consensus concerning how datasets should be made available to the community, which results in considerable structural differences between the publicly available datasets. This technical note presents DSCleaner, a Python library to clean, preprocess, and convert time series datasets to a standard file format. Two application examples using real-world datasets are also presented to show the technical validity of the proposed library. Full article
Open Access Data Descriptor
Sea Ice Climate Normals for Seasonal Ice Monitoring of Arctic and Sub-Regions
Received: 28 June 2019 / Revised: 26 July 2019 / Accepted: 6 August 2019 / Published: 10 August 2019
Viewed by 442 | PDF Full-text (3182 KB) | HTML Full-text | XML Full-text
Abstract
The climate normal, that is, the latest three full-decade average, of Arctic sea ice parameters is useful for baselining the sea ice state. A baseline ice state on both regional and local scales is important for monitoring how the current regional and local states depart from their normal to understand the vulnerability of marine and sea ice-based ecosystems to the changing climate conditions. Combined with up-to-date observations and reliable projections, normals are essential to business strategic planning, climate adaptation and risk mitigation. In this paper, monthly and annual climate normals of sea ice parameters (concentration, area, and extent) of the whole Arctic Ocean and 15 regional divisions are derived for the period of 1981–2010 using monthly satellite sea ice concentration estimates from a climate data record (CDR) produced by NOAA and the National Snow and Ice Data Center (NSIDC). Basic descriptions and characteristics of the normals are provided. Empirical Orthogonal Function (EOF) analysis has been used to describe spatial modes of sea ice concentration variability and how the corresponding principal components change over time. To provide users with basic information on data product accuracy and uncertainty, the climate normal values of Arctic sea ice extents (SIE) are compared with those of other products, including a product from NSIDC and two products from the Copernicus Climate Change Service (C3S). The SIE differences between the products are in the range of 2.3–4.5% of the CDR SIE mean. Additionally, data uncertainty estimates are represented by the range (the difference between the maximum and minimum), standard deviation, 10th and 90th percentiles, and the first, second, and third quartiles of all monthly values, a distinct feature of these sea ice normal products. Full article
(This article belongs to the Special Issue Open Data and Robust & Reliable GIScience)
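The abstract above describes a monthly climate normal as an average over the three full decades 1981–2010, accompanied by spread statistics (range, standard deviation, percentiles). The sketch below illustrates that idea on synthetic data only; the function name, the array layout (one row per year, one column per calendar month), and the generated values are assumptions for illustration and are not taken from the published product.

```python
import numpy as np


def monthly_normals(years, values, start=1981, end=2010):
    """Per-calendar-month statistics over a baseline period (hypothetical helper).

    years  : 1-D sequence of calendar years, one per row of `values`
    values : array of shape (n_years, 12), e.g. monthly sea ice extent
    """
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    base = values[(years >= start) & (years <= end)]  # rows inside the baseline
    return {
        "mean": base.mean(axis=0),                     # the climate normal
        "std": base.std(axis=0, ddof=1),               # sample standard deviation
        "range": base.max(axis=0) - base.min(axis=0),  # maximum minus minimum
        "p10": np.percentile(base, 10, axis=0),
        "median": np.percentile(base, 50, axis=0),
        "p90": np.percentile(base, 90, axis=0),
    }


# Synthetic stand-in data: a seasonal cycle plus noise, NOT real observations.
years = np.arange(1979, 2019)
rng = np.random.default_rng(0)
seasonal = 11.0 + 4.0 * np.cos(2 * np.pi * (np.arange(12) - 2) / 12)
values = seasonal + rng.normal(0.0, 0.3, size=(years.size, 12))

normals = monthly_normals(years, values)
anomaly_2018 = values[years == 2018][0] - normals["mean"]  # departure from normal
```

The last line shows the typical use of a normal: subtracting it from a given year's monthly values to obtain that year's departure from the baseline.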
Open Access Article
Aspect Extraction from Bangla Reviews Through Stacked Auto-Encoders
Received: 30 June 2019 / Revised: 2 August 2019 / Accepted: 5 August 2019 / Published: 9 August 2019
Viewed by 327 | PDF Full-text (590 KB) | HTML Full-text | XML Full-text
Abstract
Interactions between online users have been growing steadily in recent years, due to the latest developments of the web. People share online comments, opinions, and reviews about many topics. Aspect extraction is the automatic process of understanding the topic (the aspect) of such comments, which has attracted huge interest from commercial and academic points of view. For instance, reviews available in webshops (like eBay, Amazon, Aliexpress, etc.) can help customers in purchasing products, and automatic analysis of reviews would be useful, as it is sometimes almost impossible to read all the available ones. In recent years, aspect extraction in the Bangla language has been regarded as a task of growing importance. In the previous literature, a few methods have been introduced to classify Bangla texts according to the aspect they were focused on. This kind of research is limited mainly due to the lack of publicly available datasets for aspect extraction in the Bangla language. We take into account the only two publicly available, recently published datasets collected for the task of aspect extraction in the Bangla language. Then, we introduce several classification methods based on stacked auto-encoders, which as far as we know have never been exploited for aspect extraction in Bangla, and we achieve better aspect classification performance than the state of the art: the experiments show an average improvement (across the two datasets) of 0.17, 0.31 and 0.30 in precision, recall and F1-score, respectively, over the values reported in the state-of-the-art works that tackled the problem. Full article
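For readers unfamiliar with the three metrics quoted in the abstract above, the following minimal sketch computes precision, recall and F1-score from true-positive, false-positive and false-negative counts for a single class. The counts are made up for illustration, and how the paper averages the scores across aspect classes is not stated in this listing.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard single-class definitions; returns 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts for one aspect class.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```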
Open Access Data Descriptor
Satellite-Based Reconstruction of the Volcanic Deposits during the December 2015 Etna Eruption
Received: 28 June 2019 / Revised: 5 August 2019 / Accepted: 6 August 2019 / Published: 8 August 2019
Viewed by 292 | PDF Full-text (2647 KB)
Abstract
Satellite-derived data, including an estimation of the eruption rate, proximal volcanic deposits and lava flow morphometric parameters (area, maximum length, thickness, and volume) are provided for the eruption that occurred at Mt Etna on 6–8 December 2015. This eruption took place at the New Southeast Crater (NSEC), the youngest of the summit craters of Etna, shortly after a sequence of four violent paroxysmal events took place in 65 h (3–5 December) at “Voragine”, the oldest summit crater. Multispectral SEVIRI images at 15 min sampling time have been used to compute time-averaged eruption rate curves, while tri-stereo Pléiades images, at 50 cm spatial resolution, provided the pre-eruptive topography and topographic changes due to volcanic deposits. In addition to the two types of satellite data, other parameters have been inferred, such as probable vesicularity and pyroclastic deposits. Full article
Open Access Article
Gifted and Talented Services for EFL Learners in China: A Step-by-Step Guide to Propensity Score Matching Analysis in R
Received: 1 July 2019 / Revised: 31 July 2019 / Accepted: 1 August 2019 / Published: 3 August 2019
Viewed by 366 | PDF Full-text (2706 KB) | HTML Full-text | XML Full-text
Abstract
We sought to quantify the effectiveness of a gifted and talented (GT) program provided to university students who demonstrated a talent for learning English as a foreign language (EFL) in China. To do so, we used propensity score matching (PSM) techniques to analyze data collected from a tier-1 university where an English talent (ET) program was provided. Specifically, we provided (a) a step-by-step guide to PSM analysis using the R analytical package, (b) the code for PSM analysis and visualization, and (c) the final analysis of baseline equivalence and treatment effect based on the matched sample. Collectively, the results of descriptive statistics, visualization, and baseline equivalence indicate that PSM is an effective matching technique for generating an unbiased counterfactual analysis. Moreover, the ET program yields a statistically significant, positive effect on ET students’ English language proficiency. Full article
Open Access Data Descriptor
A Rainfall Data Intercomparison Dataset of RADKLIM, RADOLAN, and Rain Gauge Data for Germany
Received: 29 June 2019 / Revised: 23 July 2019 / Accepted: 29 July 2019 / Published: 2 August 2019
Viewed by 399 | PDF Full-text (1943 KB)
Abstract
Quantitative precipitation estimates (QPE) derived from weather radars provide spatially and temporally highly resolved rainfall data. However, they are also subject to systematic and random bias and various potential uncertainties and therefore require thorough quality checks before usage. The dataset described in this paper is a collection of precipitation statistics calculated from the hourly nationwide German RADKLIM and RADOLAN QPEs provided by the German Weather Service (Deutscher Wetterdienst (DWD)), which were combined with rainfall statistics derived from rain gauge data for intercomparison. Moreover, additional information on parameters that can potentially influence radar data quality, such as the height above sea level, information on wind energy plants and the distance to the nearest radar station, was included in the dataset. The resulting two point shapefiles are readable with all common GIS and constitute a spatially highly resolved rainfall statistics geodataset for the period 2006 to 2017, which can be used for statistical rainfall analyses or for the derivation of model inputs. Furthermore, the publication of this data collection has the potential to benefit other users who intend to use precipitation data for any purpose in Germany, helping them identify the rainfall dataset best suited to their application through a straightforward comparison of three rainfall datasets without any tedious data processing and georeferencing. Full article
Open Access Article
Paving the Way towards an Armenian Data Cube
Received: 14 June 2019 / Revised: 16 July 2019 / Accepted: 23 July 2019 / Published: 2 August 2019
Viewed by 354 | PDF Full-text (2455 KB) | HTML Full-text | XML Full-text
Abstract
Environmental issues are becoming an increasing global concern because of the continuous pressure on natural resources. Earth observations (EO), which include both satellite/UAV and in-situ data, can provide robust monitoring for various environmental concerns. Realizing the full information potential of EO data requires innovative tools that minimize the time and scientific knowledge needed to access, prepare and analyze a large volume of data. The EO Data Cube (DC) is a new paradigm that aims to achieve this. The article presents the Swiss-Armenian joint initiative on the deployment of an Armenian DC, which is anchored on the best practices of the Swiss model. The Armenian DC is a complete and up-to-date archive of EO data (e.g., Landsat 5, 7, 8, Sentinel-2) that benefits from Switzerland’s expertise in implementing the Swiss DC. A use case on delineating Lake Sevan with the McFeeters band ratio algorithm is discussed. The validation shows that the results are sufficiently reliable. The transfer of the necessary knowledge from Switzerland to Armenia for developing and implementing the first version of an Armenian DC should be considered a first step in a permanent collaboration paving the way towards continuous remote environmental monitoring in Armenia. Full article
(This article belongs to the Special Issue Earth Observation Data Cubes)
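The McFeeters band ratio mentioned in the abstract above is commonly known as the Normalized Difference Water Index, NDWI = (Green − NIR) / (Green + NIR), with positive values usually interpreted as open water. The sketch below is a generic NumPy illustration of that formula on made-up reflectance values; the function names and the zero threshold are assumptions and do not reproduce the article's actual Data Cube workflow.

```python
import numpy as np


def mcfeeters_ndwi(green, nir):
    """NDWI = (green - nir) / (green + nir); NaN where the denominator is zero."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = green + nir
    ndwi = np.full(denom.shape, np.nan)
    np.divide(green - nir, denom, out=ndwi, where=denom != 0)
    return ndwi


def water_mask(ndwi, threshold=0.0):
    """Flag pixels whose NDWI exceeds the threshold (0 is a common default)."""
    return ndwi > threshold


# Made-up surface reflectances for four pixels: water, vegetation, soil, no data.
green = np.array([0.30, 0.10, 0.20, 0.0])
nir = np.array([0.10, 0.30, 0.20, 0.0])

ndwi = mcfeeters_ndwi(green, nir)
mask = water_mask(ndwi)
```

In practice the threshold is often tuned per scene, since mixed pixels along a shoreline can fall on either side of zero.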
Open Access Data Descriptor
A High-Resolution Map of Singapore’s Terrestrial Ecosystems
Received: 13 June 2019 / Revised: 30 July 2019 / Accepted: 31 July 2019 / Published: 1 August 2019
Viewed by 612 | PDF Full-text (2771 KB) | Supplementary Files
Abstract
The natural and semi-natural areas within cities provide important refuges for biodiversity, as well as many benefits to people. To study urban ecology and quantify the benefits of urban ecosystems, we need to understand the spatial extent and configuration of different types of vegetated cover within a city. It is challenging to map urban ecosystems because they are typically small and highly fragmented, and thus require high-resolution satellite images. This article describes a new high-resolution map of land cover for the tropical city-state of Singapore. We used images from WorldView and QuickBird satellites, and classified these images using random forest machine learning and supplementary datasets into 12 terrestrial land classes. Close to 50% of Singapore’s land cover is vegetated while freshwater fills about 6%, and the rest is bare or built up. The overall accuracy of the map was 79% and the class-specific errors are described in detail. Tropical regions such as Singapore have a lot of cloud cover year-round, complicating the process of mapping using satellite imagery. The land cover map provided here will have applications for urban biodiversity studies, ecosystem service quantification, and natural capital assessment. Full article
Open Access Article
Dynamic Data Citation Service—Subset Tool for Operational Data Management
Received: 31 May 2019 / Revised: 29 July 2019 / Accepted: 30 July 2019 / Published: 1 August 2019
Viewed by 408 | PDF Full-text (1685 KB)
Abstract
In earth observation and climatological sciences, data and their data services grow daily and over a large spatial extent, owing to the high coverage rate of satellite sensors, model calculations, and continuous meteorological in situ observations. To reuse such data, especially data fragments and their data services, in a collaborative and reproducible manner by citing the original source, data analysts (e.g., researchers or impact modelers) need a way to identify the exact version, precise time information, parameters, and names of the dataset used. A manual process would make the citation of data fragments as a subset of an entire dataset rather complex and imprecise. Data in climate research are in most cases multidimensional, structured grid data that can change partially over time. The citation of such evolving content requires the approach of “dynamic data citation”. The applied approach is based on associating queries with persistent identifiers. These queries contain the subsetting parameters, e.g., the spatial coordinates of the desired study area or the time frame with a start and end date, which are automatically included in the metadata of the newly generated subset and thus represent the information about the data history, the data provenance, which has to be established in data repository ecosystems. The Research Data Alliance Data Citation Working Group (RDA Data Citation WG) summarized the scientific status quo as well as the state of the art from existing citation and data management concepts and developed the scalable dynamic data citation methodology for evolving data. The Data Centre at the Climate Change Centre Austria (CCCA) has implemented the given recommendations and has offered an operational service for dynamic data citation of climate scenario data since 2017. Aware that this topic depends heavily on bibliographic citation research that is still under discussion, the CCCA service on Dynamic Data Citation focused on issues specific to the climate domain, such as characteristics of data, formats, software environment, and usage behavior. Beyond sharing the experience gained, the current effort concerns the scalability of the implementation, e.g., towards the potential of an Open Data Cube solution. Full article
(This article belongs to the Special Issue Earth Observation Data Cubes)
Open Access Data Descriptor
A New Multi-Temporal Forest Cover Classification for the Xingu River Basin, Brazil
Received: 26 June 2019 / Revised: 30 July 2019 / Accepted: 30 July 2019 / Published: 1 August 2019
Viewed by 325 | PDF Full-text (4499 KB) | HTML Full-text | XML Full-text
Abstract
We describe a new multi-temporal classification for forest/non-forest classes for a 1.3 million square kilometer area encompassing the Xingu River basin, Brazil. This region is well known for its exceptionally high biodiversity, especially in terms of the ichthyofauna, with approximately 600 known species, 10% of which are endemic to the river basin. Global and regional scale datasets do not adequately capture the rapidly changing land cover in this region. Accurate forest cover and forest cover change data are important for understanding the anthropogenic pressures on the aquatic ecosystems. We developed the new classifications with a minimum mapping unit of 0.8 ha from cloud free mosaics of Landsat TM5 and OLI 8 imagery in Google Earth Engine using a classification and regression tree (CART) aided by field photographs for the selection of training and validation points. Full article
Open Access Feature Paper Article
Paving the Way to Increased Interoperability of Earth Observations Data Cubes
Received: 14 June 2019 / Revised: 26 July 2019 / Accepted: 27 July 2019 / Published: 30 July 2019
Viewed by 590 | PDF Full-text (4465 KB) | HTML Full-text | XML Full-text
Abstract
Earth observations data cubes (EODCs) are a paradigm transforming the way users interact with large spatio-temporal Earth observation (EO) data. They enhance connections between data, applications and users, facilitating management, access and use of analysis ready data (ARD). The ambition is to allow users to harness big EO data at minimum cost and effort. This significant interest is illustrated by the various implementations that exist. The novelty of the approach has resulted in different innovative solutions and in the lack of a commonly agreed definition of EODC. Consequently, their interoperability has been recognized as a major challenge for the global change and Earth system science domains. The objectives of this paper are to prevent EODCs from becoming silos of information, to present how interoperability can be enabled using widely adopted geospatial standards, and to contribute to the debate on enhanced interoperability of EODCs. We demonstrate how standards can be used, profiled and enriched to pave the way to increased interoperability of EODCs and can help deliver and leverage the power of EO data by building efficient discovery, access and processing services. Full article
(This article belongs to the Special Issue Earth Observation Data Cubes)
Open Access Article
Catastrophic Household Expenditure for Healthcare in Turkey: Clustering Analysis of Categorical Data
Received: 24 June 2019 / Revised: 21 July 2019 / Accepted: 27 July 2019 / Published: 29 July 2019
Viewed by 350 | PDF Full-text (1765 KB) | HTML Full-text | XML Full-text
Abstract
The amount of health expenditure at the household level is one of the most basic indicators of development in countries. In many countries, health expenditure increases relative to national income. If out-of-pocket health spending exceeds income or is too high relative to it, this signals an economic alarm that leads to a lower standard of living, called catastrophic health expenditure. Catastrophic expenditure may be affected by many factors such as household type, property status, smoking and alcohol drinking habits, being active in sports, and having private health insurance. The study aims to investigate households with respect to catastrophic health expenditure using a clustering method. Clustering makes it possible to see the main similarities and differences between the groups. The results show that there are significant and interesting differences between the five groups. C4 households earn more but spend less on health, at a rate of 3.10%, because people who exercise regularly have fewer health problems. A household in cluster C5, consisting of one adult who is a landlord and three people in total (mother or father and two children), earns a high income and spends larger amounts on health than households in other clusters. C1 households, which are elementary families with three children who do not pay rent although they are not landlords, have the highest catastrophic health expenditure. Households in C3 have an average health expenditure rate of 3.83%, which is higher than in other clusters. Households in the cluster C2 make the most catastrophic health expenditure. Full article
(This article belongs to the Special Issue Data-Driven Healthcare Tasks: Tools, Frameworks, and Techniques)
Open Access Data Descriptor
TIRF Microscope Image Sequences of Fluorescent IgE-FcεRI Receptor Complexes inside a FcεRI-Centric Synapse in RBL-2H3 Cells
Received: 23 May 2019 / Revised: 23 July 2019 / Accepted: 24 July 2019 / Published: 28 July 2019
Viewed by 358 | PDF Full-text (2655 KB) | HTML Full-text | XML Full-text
Abstract
Total internal reflection fluorescence (TIRF) microscope image sequences are commonly used to study receptors in live cells. The dataset presented herein facilitates the study of the IgE-FcεRI receptor signaling complex (IgE-RC) in rat basophilic leukemia (RBL-2H3) cells coming into contact with a supported lipid bilayer with 25 mol% N-dinitrophenyl-aminocaproyl phosphatidylethanolamine, modeling an immunological synapse. TIRF microscopy was used to image IgE-RCs within this FcεRI-centric synapse by loading RBL-2H3 cells with fluorescent anti-dinitrophenyl (anti-DNP) immunoglobulin E (IgE) in suspension for 24 h. Fluorescent anti-DNP IgE (IgE488) concentrations of this suspension increased from 10% to 100% and corresponding non-fluorescent anti-DNP IgE concentrations decreased from 90% to 0%. After the removal of unbound anti-DNP IgE, multiple image sequences were taken for each of these ten conditions. Prior to imaging, anti-DNP IgE-primed RBL-2H3 cells were either kept for a few minutes, for about 30 min, or for about one hour in Hanks buffer. The dataset contains 482 RBL-2H3 model synapse image stacks, dark images to correct for background intensity, and TIRF illumination profile images to correct for non-uniform TIRF illumination. After background subtraction, non-uniform illumination correction, and conversion of pixel units from analog-to-digital units to photo electrons, the average pixel intensity was calculated. The average pixel intensity within FcεRI-centric synapses for all three Hanks buffer conditions increased linearly at a rate of 0.42 ± 0.02 photo electrons per pixel per % IgE488 in suspension. RBL-2H3 cell degranulation was tested by detecting β-hexosaminidase activity. Prolonged RBL-2H3 cell exposure to Hanks buffer inhibited exocytosis in RBL-2H3 cells. Full article

Open Access Review
Reinforcement Learning in Financial Markets
Received: 30 June 2019 / Revised: 23 July 2019 / Accepted: 26 July 2019 / Published: 28 July 2019
Viewed by 448 | PDF Full-text (575 KB) | HTML Full-text | XML Full-text
Abstract
Recently there has been an exponential increase in the use of artificial intelligence for trading in financial markets such as stock and forex. Reinforcement learning has become of particular interest to financial traders ever since the program AlphaGo defeated the strongest human contemporary Go board game player Lee Sedol in 2016. We systematically reviewed all recent stock/forex prediction or trading articles that used reinforcement learning as their primary machine learning method. All reviewed articles had some unrealistic assumptions such as no transaction costs, no liquidity issues and no bid or ask spread issues. Transaction costs had significant impacts on the profitability of the reinforcement learning algorithms compared with the baseline algorithms tested. Despite showing statistically significant profitability when reinforcement learning was used in comparison with baseline models in many studies, some showed no meaningful level of profitability, in particular with large changes in the price pattern between the system training and testing data. Furthermore, few performance comparisons between reinforcement learning and other sophisticated machine/deep learning models were provided. The impact of transaction costs, including the bid/ask spread on profitability has also been assessed. In conclusion, reinforcement learning in stock/forex trading is still in its early development and further research is needed to make it a reliable method in this domain. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)

Open Access Article
Prediction of Fault Fix Time Transition in Large-Scale Open Source Project Data
Received: 28 June 2019 / Revised: 23 July 2019 / Accepted: 24 July 2019 / Published: 27 July 2019
Viewed by 355 | PDF Full-text (1086 KB)
Abstract
Open source software (OSS) programs are adopted as embedded systems regarding their server usage, due to their quick delivery, cost reduction, and standardization of systems. Many OSS programs are developed using the peculiar style known as the bazaar method, in which faults are detected and fixed by developers around the world, and the result is then reflected in the next release. Furthermore, the fix time of faults tends to be shorter as the development of the OSS progresses. However, several large-scale open source projects encounter the problem that fault fixing takes much time because the fault corrector cannot handle many fault reports. Therefore, OSS users and project managers need to know the stability degree of open source projects by determining the fault fix time. In this paper, we predict the transition of the fix time in large-scale open source projects. To make the prediction, we use the software reliability growth model based on the Wiener process considering that the fault fix time in open source projects changes depending on various factors such as the fault reporting time and the assignees to fix the faults. In addition, we discuss the assumption that fault fix time data depend on the prediction of the transition in fault fixing time. Full article
Open Access Article
Urban Mobility Demand Profiles: Time Series for Cars and Bike-Sharing Use as a Resource for Transport and Energy Modeling
Received: 20 June 2019 / Revised: 19 July 2019 / Accepted: 25 July 2019 / Published: 26 July 2019
Viewed by 351 | PDF Full-text (1012 KB) | Supplementary Files
Abstract
The transport sector is currently facing a significant transition, with strong drivers including decarbonization and digitalization trends, especially in urban passenger transport. The availability of monitoring data underpins the development of optimization models supporting enhanced urban mobility, with multiple benefits including lower pollutant and CO2 emissions, lower energy consumption, and better transport management and land use. This paper presents two datasets that represent time series with a high temporal resolution (five-minute time step) for both vehicle and bike-sharing use in the city of Turin, located in Northern Italy. These high-resolution profiles have been obtained by collecting and processing available online resources that provide live information on traffic monitoring and bike-sharing docking stations. The data are provided for the entire year 2018, and they represent an interesting basis for the evaluation of seasonal and daily variability patterns in urban mobility. These data may be used for different applications, ranging from the chronological distribution of mobility demand to the estimation of passenger transport flows for the development of transport models in urban contexts. Moreover, traffic profiles form the basis for modeling electric vehicle charging strategies and their interaction with the power grid. Full article
Open Access Data Descriptor
Internal Seed Structure of Alpine Plants and Extreme Cold Exposure
Received: 30 June 2019 / Revised: 20 July 2019 / Accepted: 22 July 2019 / Published: 24 July 2019
Viewed by 547 | PDF Full-text (1145 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
Cold tolerance in seeds is not well understood compared to mechanisms in aboveground plant tissue but is crucial to understanding how plant populations persist in extreme cold conditions. Counter-intuitively, the ability of seeds to survive extreme cold may become more important in the future due to climate change projections. This is due to the loss of the insulating snow bed resulting in the actual temperatures experienced at soil surface level being much colder than without snow cover. Seed survival in extremely low temperatures is conferred by mechanisms that can be divided into freezing avoidance and freezing tolerance depending on the location of ice crystal formation within the seed. We present a dataset of alpine angiosperm species with seed mass and seed structure defined as endospermic and non-endospermic. This is presented alongside the locations of temperature minima per species which can be used to examine the extent to which different seed structures are associated with snow cover. We hope that the dataset can be used by others to demonstrate if certain seed structures and sizes are associated with snow cover, and if so, would they be negatively impacted by the loss of snow resulting from climate change. Full article

Open Access Data Descriptor
Scots Pine Seedlings Growth Dynamics Data Reveals Properties for the Future Proof of Seed Coat Color Grading Conjecture
Received: 1 July 2019 / Revised: 22 July 2019 / Accepted: 22 July 2019 / Published: 23 July 2019
Viewed by 417 | PDF Full-text (868 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
Seed coat color grading conjecture is also known as Pravdin’s conjecture. To verify the conjecture, we established a long-term field experiment. This data set included unique empirical data of Scots pine (Pinus sylvestris L.) container-grown seedlings produced from different seed color grades, outplanted on a post fire site in the Voronezh region, Russia. Variables were provided for 10 rows of 90 samples in each row. These data contribute to our understanding of seed germination and seedlings growth dynamics from size and color gradings of seeds. This structure is the future basis of the Forest Reproductive Material Library (FRMLib) and will be used for assisted migration and forest seed transfer. Full article

Open Access Data Descriptor
Building Stock and Building Typology of Kigali, Rwanda
Received: 5 June 2019 / Revised: 9 July 2019 / Accepted: 19 July 2019 / Published: 21 July 2019
Viewed by 446 | PDF Full-text (625 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
This study uses very high-resolution Pléiades imagery for the densely built-up central part of the City of Kigali for the year 2015 in order to derive urban morphology data on building footprints, building archetypes and building heights. Aerial images of the study area from 2008–2009 were used in combination with the 2015 dataset to create a change monitoring dataset on a single building basis. A semi-automated approach was chosen which combined an object-based image analysis with an expert-based revision. The result is a geospatial dataset that detects 165,625 buildings for 2008–2009 and 211,458 for 2015. The dataset includes information on the type of changes between the two dates. Analysis of this geospatial dataset can be used for a range of research applications in economics and the social sciences, as well as a range of policy applications in urban planning and municipal finance administration. Full article

Open Access Data Descriptor
Towards the Fulfillment of a Knowledge Gap: Wood Densities for Species of the Subtropical Atlantic Forest
Received: 27 June 2019 / Revised: 16 July 2019 / Accepted: 18 July 2019 / Published: 20 July 2019
Viewed by 431 | PDF Full-text (2649 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
Wood density (ρ) is a trait involved in forest biomass estimates, forest ecology, prediction of stand stability, wood science, and engineering. Regardless of its importance, data on ρ are scarce for a substantial number of species of the vast Atlantic Forest phytogeographic domain. Given that, the present paper describes a dataset composed of three data tables: (i) determinations of ρ (kg m⁻³) for 153 species growing in three forest types within the subtropical Atlantic Forest, based on wood samples collected throughout the state of Santa Catarina, southern Brazil; (ii) a list of 719 tree/shrub species observed by a state-level forest inventory and a ρ value assigned to each one of them based on local determinations and on a global database; (iii) the means and standard deviations of ρ for 477 permanent sample plots located in the subtropical Atlantic Forest, covering ∼95,000 km². The mean ρ over the 153 sampled species is 538.6 kg m⁻³ (standard deviation = 120.5 kg m⁻³), and the mean ρ per sample plot, considering the three forest types, is 525.0 kg m⁻³ (standard error = 1.8 kg m⁻³). The described dataset has potential to underpin studies on forest biomass, forest ecology, alternative uses of timber resources, as well as to enlarge the coverage of global datasets. Full article
(This article belongs to the Special Issue Forest Monitoring Systems and Assessments at Multiple Scales)
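The wood density abstract above quotes both a standard deviation and a standard error, two statistics that are easy to confuse. The sketch below shows how they relate, assuming the usual definition of the standard error of the mean (sample standard deviation divided by the square root of the sample size). The abstract does not state how its standard error was computed, so that formula, the function name, and the plot values are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical per-plot mean densities in kg m^-3, not dataset values.
    plot_means = [500.0, 520.0, 540.0, 560.0]
    print(mean(plot_means), stdev(plot_means), standard_error(plot_means))

    # Under the same assumption, a standard error of 1.8 over 477 plots
    # would imply a between-plot SD of 1.8 * sqrt(477).
    print(1.8 * sqrt(477))
```

If that assumption holds, the implied between-plot standard deviation is roughly 39 kg m⁻³, much smaller than the 120.5 kg m⁻³ spread across species, which is expected when each plot averages over many trees.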

Open Access Data Descriptor
Correlations between Environmental Factors and Milk Production of Holstein Cows
Received: 5 July 2019 / Revised: 14 July 2019 / Accepted: 16 July 2019 / Published: 19 July 2019
Viewed by 448 | PDF Full-text (355 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
Global climate change is a challenge for dairy farming. In this regard, identifying reliable correlations between environmental parameters and animals’ physiological responses is a starting point for the mathematical modeling of their effects on the future welfare and milk production of cows. The aim of the study was to examine the relationship between environmental parameters and the milk production of cows during the hot period. Archival data from the Ukrainian Hydrometeorological Center were used to study insolation conditions (IC), wind direction (WD), wind strength (WS), air temperature (AT), and relative humidity (RH). The temperature–humidity index (THI) (Kibler, 1964) and the temperature–humidity index in the hangar-type cowshed (THICHT) (Mylostyvyi et al., 2019) served as integral indicators of the state of the cowshed’s microclimate. The daily milk yield (DMY), yield of milk fat (MF) and milk protein (MP), and percentage of milk fat (PMF) and protein (PMP) were recorded using the DairyComp 305 herd management system (VAS, USA). Statistical data processing was performed using the mathematical functions of Microsoft Excel (Microsoft Inc.) and Statistica 10 (StatSoft Inc.). There was a weak correlation between IC and DMY (r = −0.2), between RH and DMY (r = +0.4), and between RH and MF (r = +0.2). Correlations of DMY, MF, and MP with WS ranged from r = −0.2 to 0.4. Those with AT ranged from r = −0.2 to 0.5 (p < 0.05). The effects of weather factors on animal productivity will be the subject of further research. Full article
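The r values in this abstract are correlation coefficients. As a minimal sketch, assuming they are Pearson coefficients (the abstract does not name the method), the calculation looks like this in Python. The function name and the sample values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Synthetic example: relative humidity (%) and daily milk yield (kg).
    humidity = [40, 50, 60, 70, 80]
    milk_yield = [28, 30, 29, 31, 32]
    print(pearson_r(humidity, milk_yield))
```

A coefficient near 0 indicates little linear association and values near ±1 indicate a strong one, which is why the abstract describes |r| = 0.2 to 0.4 as weak.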

Open Access Article
Semantic Earth Observation Data Cubes
Received: 15 June 2019 / Revised: 12 July 2019 / Accepted: 15 July 2019 / Published: 17 July 2019
Viewed by 462 | PDF Full-text (10075 KB) | HTML Full-text | XML Full-text
Abstract
There is an increasing amount of free and open Earth observation (EO) data, yet more information is not necessarily being generated from them at the same rate despite high information potential. The main challenge in the big EO analysis domain is producing information from EO data, because numerical, sensory data have no semantic meaning; they lack semantics. We are introducing the concept of a semantic EO data cube as an advancement of state-of-the-art EO data cubes. We define a semantic EO data cube as a spatio-temporal data cube containing EO data, where for each observation at least one nominal (i.e., categorical) interpretation is available and can be queried in the same instance. Here we clarify and share our definition of semantic EO data cubes, demonstrating how they enable different possibilities for data retrieval, semantic queries based on EO data content and semantically enabled analysis. Semantic EO data cubes are the foundation for EO data expert systems, where new information can be inferred automatically in a machine-based way using semantic queries that humans understand. We argue that semantic EO data cubes are better positioned to handle current and upcoming big EO data challenges than non-semantic EO data cubes, while facilitating an ever-diversifying user-base to produce their own information and harness the immense potential of big EO data. Full article
(This article belongs to the Special Issue Earth Observation Data Cubes)
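The definition in the abstract above, that every observation carries at least one nominal interpretation which can be queried in the same instance, can be illustrated with a toy data structure. This is a hedged sketch, not the authors' implementation: the class names, fields, and category labels are hypothetical, and a real data cube would use array storage rather than a Python list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    x: int                # grid column
    y: int                # grid row
    day: date             # acquisition date
    reflectance: float    # numerical EO measurement
    category: str         # nominal interpretation, e.g. "water"


class SemanticCube:
    """Toy spatio-temporal cube storing data and categories together."""

    def __init__(self, observations):
        self._obs = list(observations)

    def query(self, category, start, end):
        """Observations of one category within an inclusive date range."""
        return [
            o for o in self._obs
            if o.category == category and start <= o.day <= end
        ]

    def count_by_cell(self, category, start, end):
        """Number of matching observations per (x, y) cell."""
        counts = {}
        for o in self.query(category, start, end):
            counts[(o.x, o.y)] = counts.get((o.x, o.y), 0) + 1
        return counts


if __name__ == "__main__":
    cube = SemanticCube([
        Observation(0, 0, date(2019, 1, 1), 0.10, "water"),
        Observation(0, 0, date(2019, 2, 1), 0.30, "vegetation"),
        Observation(1, 0, date(2019, 1, 15), 0.12, "water"),
    ])
    print(cube.count_by_cell("water", date(2019, 1, 1), date(2019, 12, 31)))
```

Because the category is stored alongside the measurement, a question such as "where was water observed in 2019" becomes a filter on content rather than on raw numbers.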

Open Access Article
Feedforward Neural Network-Based Architecture for Predicting Emotions from Speech
Received: 21 May 2019 / Revised: 9 July 2019 / Accepted: 11 July 2019 / Published: 15 July 2019
Viewed by 467 | PDF Full-text (4815 KB) | HTML Full-text | XML Full-text
Abstract
We propose a novel feedforward neural network (FFNN)-based speech emotion recognition system built on three layers: a base layer where a set of speech features is evaluated and classified; a middle layer where a speech matrix is built based on the classification scores computed in the base layer; and a top layer where an FFNN- and a rule-based classifier are used to analyze the speech matrix and output the predicted emotion. The system offers 80.75% accuracy for predicting the six basic emotions and surpasses other state-of-the-art methods when tested on emotion-stimulated utterances. The method is robust and the fastest in the literature, computing a stable prediction in less than 78 s and proving attractive for replacing questionnaire-based methods and for real-time use. A set of correlations between several speech features (intensity contour, speech rate, pause rate, and short-time energy) and the evaluated emotions is determined, which extends previous similar studies that did not analyze these speech features. Using these correlations to improve the system leads to a 6% increase in accuracy. The proposed system can be used to improve human–computer interfaces, in computer-mediated education systems, for accident prevention, and for predicting mental disorders and physical diseases. Full article

Open Access Article
Building a SAR-Enabled Data Cube Capability in Australia Using SAR Analysis Ready Data
Received: 28 May 2019 / Revised: 10 July 2019 / Accepted: 12 July 2019 / Published: 15 July 2019
Viewed by 415 | PDF Full-text (4525 KB) | HTML Full-text | XML Full-text
Abstract
A research alliance between the Commonwealth Scientific and Industrial Research Organization and Geoscience Australia was established in relation to Digital Earth Australia, to develop a Synthetic Aperture Radar (SAR)-enabled Data Cube capability for Australia. This project has been developing SAR analysis ready data (ARD) products, including normalized radar backscatter (gamma nought, γ0), eigenvector-based dual-polarization decomposition and interferometric coherence, all generated from the European Space Agency (ESA) Sentinel-1 interferometric wide swath mode data available on the Copernicus Australasia Regional Data Hub. These are produced using the open source ESA SNAP toolbox. The processing workflows are described, along with a comparison of the γ0 backscatter and interferometric coherence ARD produced using SNAP and the proprietary software GAMMA. This comparison also evaluates the effects on γ0 backscatter of variations related to near- and far-range look angles, SNAP’s default Shuttle Radar Topography Mission (SRTM) DEM versus a refined Australia-wide DEM, and terrain. The agreement between SNAP and GAMMA is generally good, although there are some systematic geometric and radiometric differences. The comparison between SNAP’s default SRTM DEM and the refined DEM showed a small geometric shift along the radar view direction. The systematic geometric and radiometric issues detected can, however, be expected to have negligible effects on analysis, provided products from the two processors and two DEMs are used separately and not mixed within the same analysis. The results lead to the conclusion that the SNAP toolbox is suitable for producing the Sentinel-1 ARD products. Full article
(This article belongs to the Special Issue Earth Observation Data Cubes)

Open Access Data Descriptor
System for Collecting, Processing, Visualization, and Storage of the MT-Monitoring Data
Received: 20 May 2019 / Revised: 3 July 2019 / Accepted: 12 July 2019 / Published: 14 July 2019
Viewed by 398 | PDF Full-text (4820 KB) | HTML Full-text | XML Full-text
Abstract
At the Research Station of the Russian Academy of Sciences in Bishkek, a unique scientific infrastructure, a complex geophysical station, monitors geodynamic processes. The monitoring covers a network of seismological, geodetic, and electromagnetic observation points on the territory of the Bishkek Geodynamic Proving Ground, located in the seismically active zone of the Northern Tien Shan. Monitoring the geodynamic activity of the Earth’s crust is of scientific and practical importance not only in seismically active regions, but also in areas where particularly important facilities, mining, and hazardous industries are located. It is therefore highly relevant to create new software and hardware for studying geodynamic processes in the Earth’s crust in seismically active zones, based on integrated monitoring of the geological environment over the widest possible depth range. The use of modern information technology in such studies provides an effective data management tool. The system considered here for collecting, processing, and storing electromagnetic monitoring data from the Bishkek Geodynamic Proving Ground can help overcome the scarcity of experimental data in the field of Earth sciences. Full article
(This article belongs to the Special Issue Overcoming Data Scarcity in Earth Science)

Open Access Data Descriptor
Wild Bee Toxicity Data for Pesticide Risk Assessments
Received: 20 June 2019 / Revised: 6 July 2019 / Accepted: 9 July 2019 / Published: 11 July 2019
Viewed by 361 | PDF Full-text (509 KB) | HTML Full-text | XML Full-text | Supplementary Files
Abstract
Pollination services are vital for agriculture, food security and biodiversity. Although many insect species provide pollination services, honeybees are thought to be the major provider of this service to agriculture. However, the importance of wild bees in this respect should not be overlooked. Whilst regulatory risk assessment processes have long included an assessment for pollinators, using honeybees (Apis mellifera) as a protective surrogate, there are concerns that this approach may not be adequate, particularly because of global declines in pollinating insects. Consequently, risk assessments are now being expanded to include wild bee species such as bumblebees (Bombus spp.) and solitary bees (Osmia spp.). However, toxicity data for these species are scarce and are absent from the main pesticide reference resources. The aim of the study described here was to collate data relating to the acute toxicity of pesticides to wild bee species (both topical and dietary exposure) from published regulatory documents and peer reviewed literature, and to incorporate these data into one of the main online resources for pesticide risk assessment data, the Pesticide Properties Database, thus ensuring that the data are maintained and kept up to date. The outcome of this study is a dataset collated from 316 regulatory and peer reviewed articles that contains 178 records covering 120 different pesticides and their variants which includes 142 records for bumblebees and a further 115 records for other wild bee species. Full article

Data EISSN 2306-5729 Published by MDPI AG, Basel, Switzerland