Open Access Article
Using Semantic Web Technologies to Query and Manage Information within Federated Cyber-Infrastructures
Data 2017, 2(3), 21; doi:10.3390/data2030021
Abstract
A standardized descriptive ontology supports efficient querying and manipulation of data from heterogeneous sources across boundaries of distributed infrastructures, particularly in federated environments. In this article, we present the Open-Multinet (OMN) set of ontologies, which were designed specifically for this purpose as well as to support management of the life cycles of infrastructure resources. We present their initial application in Future Internet testbeds, their use for representing and requesting available resources, and our experimental performance evaluation of the ontologies in terms of querying and translation times. Our results highlight the value and applicability of Semantic Web technologies in managing resources of federated cyber-infrastructures.
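As a concrete illustration of the querying side described above, the following minimal Python sketch loads an RDF resource description and runs a SPARQL query over it with rdflib; the file name and the omn:Resource class used here are assumptions for demonstration, not a verbatim excerpt of the OMN ontologies.

```python
# Minimal sketch: querying an RDF resource description with SPARQL via rdflib.
# The file name and the class/property names are illustrative assumptions;
# consult the OMN ontologies for the actual vocabulary.
from rdflib import Graph

g = Graph()
g.parse("advertisement.ttl", format="turtle")  # hypothetical testbed advertisement

query = """
PREFIX omn: <http://open-multinet.info/ontology/omn#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label
WHERE {
    ?resource a omn:Resource ;
              rdfs:label ?label .
}
"""
for row in g.query(query):
    print(row.resource, row.label)
```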
Open Access Article
Open Source Fundamental Industry Classification
Data 2017, 2(2), 20; doi:10.3390/data2020020
Abstract
We provide complete source code for building a fundamental industry classification based on publicly available and freely downloadable data. We compare various fundamental industry classifications by running a horse race of short-horizon trading signals (alphas) utilizing open-source heterotic risk models (https://ssrn.com/abstract=2600798) built using such industry classifications. Our source code includes various stand-alone and portable modules, e.g., for downloading and parsing web data.
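To make the notion of a multi-level industry classification concrete, here is a minimal pandas sketch (not the authors' construction) that groups hypothetical tickers into sector and industry levels from SIC codes; the paper's own modules should be used for the actual classification.

```python
# Illustrative sketch only: a crude multi-level industry grouping from SIC codes.
# Tickers and SIC codes below are made up; the paper's source code implements
# the actual fundamental classification.
import pandas as pd

df = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "sic":    [2834, 2836, 3674, 3672],   # hypothetical SIC codes
})

df["sector"]   = df["sic"].astype(str).str[:1]   # coarsest level
df["industry"] = df["sic"].astype(str).str[:3]   # finer level

print(df.groupby(["sector", "industry"])["ticker"].apply(list))
```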
Open Access Data Descriptor
Four Datasets Derived from an Archive of Personal Homepages (1995–2009)
Data 2017, 2(2), 19; doi:10.3390/data2020019
Abstract
While data from social media are easily accessible, understanding how individuals expressed themselves on the Internet in its initial years of public availability (the mid-to-late 1990s) has proved difficult. In this data deposit, I describe how archival data from Geocities homepages were retrieved and processed to remove non-text data, then further refined to create separate datasets, each of which provides unique insights into modes of personal expression on the early Internet. The present paper describes four datasets, all of which were derived from a larger collection of personal websites: (1) a large corpus of raw text data from Geocities personal homepages; (2) a linguistic analysis of basic psychological properties of the same Geocities pages, using an open-source implementation of the Linguistic Inquiry and Word Count (LIWC); (3) a dataset of links between homepages (suitable for network analysis); and (4) a manifest dataset summarizing the size and last update date for each file in the dataset. Data from over 378,000 Geocities pages are included. In addition to providing a detailed description of how these datasets were created, I describe how they might be utilized in future research.
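As one possible use of dataset (3), the following sketch loads the homepage link table into a directed graph with networkx; the file and column names are assumptions about the layout, not the published schema.

```python
# Minimal sketch: loading the homepage-link dataset into a directed graph
# for network analysis.  File name and column names ("source", "target")
# are assumptions; adapt them to the actual dataset layout.
import pandas as pd
import networkx as nx

links = pd.read_csv("geocities_links.csv")          # hypothetical path
G = nx.from_pandas_edgelist(links, source="source",
                            target="target", create_using=nx.DiGraph)

# Example analysis: the ten most-linked-to homepages.
top = sorted(G.in_degree(), key=lambda kv: kv[1], reverse=True)[:10]
print(top)
```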
Open Access Data Descriptor
Towards Automatic Bird Detection: An Annotated and Segmented Acoustic Dataset of Seven Picidae Species
Data 2017, 2(2), 18; doi:10.3390/data2020018
Abstract
Analysing behavioural patterns of bird species in a certain region enables researchers to recognize forthcoming changes in environment, ecology, and population. Ornithologists spend many hours observing and recording birds in their natural habitat to compare different audio samples and extract valuable insights. This manual process is typically undertaken by highly experienced birders who identify every species and its associated type of sound. In recent years, public repositories hosting labelled acoustic samples from different bird species have emerged, resulting in appealing datasets that computer scientists can use to test the accuracy of their machine learning algorithms and to assist ornithologists in the time-consuming process of analyzing audio data. Current limitations in the performance of these algorithms come from the fact that the acoustic samples in these datasets combine fragments containing only environmental noise with fragments containing the bird sound (i.e., the computer confuses environmental sound with the bird sound). Therefore, the purpose of this paper is to release a dataset with a total duration of more than 4984 s that contains differentiated samples of (1) bird sounds and (2) environmental sounds. This data descriptor releases the processed audio samples—originally obtained from the Xeno-Canto repository—of the seven known Picidae species inhabiting the Iberian Peninsula, which are good indicators of habitat quality and have significant value from an environmental conservation point of view.
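A quick way to sanity-check the reported total duration is to sum the lengths of the segmented WAV files; the folder name below is a placeholder, not the dataset's actual structure.

```python
# Quick sketch: verifying the total duration of the segmented WAV files.
# The directory name is a placeholder; the dataset's actual layout may differ.
from pathlib import Path
from scipy.io import wavfile

total_seconds = 0.0
for wav_path in Path("picidae_segments").glob("*.wav"):   # hypothetical folder
    rate, samples = wavfile.read(wav_path)
    total_seconds += len(samples) / rate

print(f"Total duration: {total_seconds:.1f} s")
```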
Open Access Data Descriptor
Transcriptome Dataset of Soybean (Glycine max) Grown under Phosphorus-Deficient and -Sufficient Conditions
Data 2017, 2(2), 17; doi:10.3390/data2020017
Abstract
This data descriptor introduces a transcriptome dataset of the low-phosphorus-tolerant soybean (Glycine max) variety NN94-156 grown under phosphorus-deficient and -sufficient conditions. The data comprise transcriptome datasets (four libraries) acquired from roots and leaves of soybean plants challenged with low phosphorus, which allows further analysis of whether a systemic tolerance response to low-phosphorus stress occurred. We describe in detail how the plants were prepared and treated and how the data were generated and pre-processed. Further analyses of these data should help improve our understanding of the molecular mechanisms of low-phosphorus stress in soybean.
Open Access Data Descriptor
Long-Term Land Cover Data for the Lower Peninsula of Michigan, 2010–2050
Data 2017, 2(2), 16; doi:10.3390/data2020016
Abstract
Land cover data are often used to examine the impacts of landscape alterations on the environment from the local to the global scale. Although various agencies produce land cover data at various spatial scales, data are still limited at the regional scale over extended timescales. This is a critical data gap, since decision-makers often use future and long-term land cover maps to develop effective policies for sustainable environmental systems. As a result, land change science incorporates common data mining tools to create future land cover maps that extend over long timescales. This study applied one of the well-known land cover change models, the Land Transformation Model (LTM), to produce urbanization maps for the Lower Peninsula of Michigan in the United States from 2010 to 2050 at five-year intervals. Long-term urbanization data for the Lower Peninsula of Michigan can be used in various environmental studies, such as assessing the impact of future urbanization on climate change, water quality, food security, and biodiversity.
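As an example of how the projected maps might be consumed, the sketch below reads one hypothetical urbanization raster with rasterio and computes the urban fraction; the file name and the coding of urban cells as 1 are assumptions, not the dataset's documented conventions.

```python
# Illustrative sketch: summarising projected urban extent from one raster.
# File name and the "urban == 1" coding are assumptions about the data layout.
import numpy as np
import rasterio

with rasterio.open("ltm_urban_2050.tif") as src:   # hypothetical file name
    urban = src.read(1)
    nodata = src.nodata

valid = urban != nodata if nodata is not None else np.ones_like(urban, dtype=bool)
urban_fraction = np.count_nonzero((urban == 1) & valid) / np.count_nonzero(valid)
print(f"Projected urban fraction in 2050: {urban_fraction:.1%}")
```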
Open Access Article
Demonstration Study: A Protocol to Combine Online Tools and Databases for Identifying Potentially Repurposable Drugs
Data 2017, 2(2), 15; doi:10.3390/data2020015
Abstract
Traditional methods for the discovery and development of new drugs can be very time-consuming and expensive because they include several stages, such as compound identification and pre-clinical and clinical trials, before the drug is approved by the U.S. Food and Drug Administration (FDA). Therefore, drug repurposing, namely using currently FDA-approved drugs as therapeutics for diseases other than those for which they were originally prescribed, is emerging as a faster and more cost-effective alternative to current drug discovery methods. In this paper, we describe a three-step in silico protocol for analyzing transcriptomics data using online databases and bioinformatics tools to identify potentially repurposable drugs. The efficacy of this protocol was evaluated by comparing its predictions with the findings of two case studies of recently reported repurposed drugs: the HIV drug zidovudine for the treatment of dry age-related macular degeneration, and the antidepressant imipramine for small-cell lung carcinoma. The proposed protocol successfully identified the published findings, thus demonstrating the efficacy of this method. In addition, it also yielded several novel predictions that have not yet been published, including the finding that imipramine could potentially treat Severe Acute Respiratory Syndrome (SARS), a disease that currently does not have any treatment or vaccine. Since this in silico protocol is simple to use and does not require advanced computer skills, we believe any motivated participant with access to these databases and tools would be able to apply it to large datasets to identify other potentially repurposable drugs in the future.
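For readers unfamiliar with transcriptomics-based repurposing, the sketch below illustrates the general signature-comparison idea (a drug signature that anti-correlates with a disease signature is a reversal candidate); it is not the paper's three-step protocol, and the gene names and values are invented.

```python
# Illustrative sketch of the signature-comparison idea used in many
# transcriptomics-based repurposing screens.  NOT the paper's protocol;
# gene names and fold-change values are made up for demonstration.
import pandas as pd
from scipy.stats import spearmanr

disease = pd.Series({"GENE1": 2.1, "GENE2": -1.4, "GENE3": 0.8, "GENE4": -2.0})
drug    = pd.Series({"GENE1": -1.8, "GENE2": 1.1, "GENE3": -0.5, "GENE4": 1.7})

common = disease.index.intersection(drug.index)
rho, pval = spearmanr(disease[common], drug[common])
print(f"Spearman rho = {rho:.2f} (strongly negative suggests signature reversal)")
```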
Open Access Data Descriptor
CHASE-PL—Future Hydrology Data Set: Projections of Water Balance and Streamflow for the Vistula and Odra Basins, Poland
Data 2017, 2(2), 14; doi:10.3390/data2020014
Abstract
There is considerable concern that the water resources of the Central and Eastern European region may be adversely affected by climate change. Projections of future water balance and streamflow conditions can be obtained by forcing hydrological models with the output from climate models. In this study, we employed the SWAT hydrological model, driven with an ensemble of nine bias-corrected EURO-CORDEX climate simulations, to generate future hydrological projections for the Vistula and Odra basins in two future horizons (2024–2050 and 2074–2100) under two Representative Concentration Pathways (RCPs). The data set consists of three parts: (1) model inputs; (2) raw model outputs; (3) aggregated model outputs. The first allows users to reproduce the outputs or to create new ones. The second contains the time series of 10 variables simulated by SWAT: precipitation, snow melt, potential evapotranspiration, actual evapotranspiration, soil water content, percolation, surface runoff, baseflow, water yield, and streamflow. The third consists of the multi-model ensemble statistics of the relative changes in mean seasonal and annual variables, provided in a GIS format. The data set should be of interest to climate impact scientists, water managers, and water-sector policy makers. It should be noted, however, that the projections included in this data set are associated with high uncertainties, which are explained in this data descriptor paper.
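The aggregated part of the data set (multi-model ensemble statistics of relative changes) can be reproduced from the raw outputs along the following lines; the CSV layout assumed here (one column per climate model, rows per season) is illustrative rather than the data set's native format.

```python
# Minimal sketch: multi-model ensemble statistics of relative change for one
# variable.  File names and the one-column-per-model CSV layout are assumptions
# about how a user might arrange the raw SWAT outputs.
import pandas as pd

baseline = pd.read_csv("streamflow_baseline.csv", index_col="season")       # hypothetical
future   = pd.read_csv("streamflow_2074_2100_rcp85.csv", index_col="season")  # hypothetical

rel_change = (future - baseline) / baseline * 100.0     # percent change per model
summary = rel_change.agg(["mean", "std", "min", "max"], axis=1)  # stats across models
print(summary)
```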
Open Access Data Descriptor
Open Access Article Processing Charges (OA APC) Longitudinal Study 2016 Dataset
Data 2017, 2(2), 13; doi:10.3390/data2020013
Abstract
This article documents the Open Access Article Processing Charges (OA APC) Main 2016 dataset. The dataset was developed as part of a longitudinal study of the minority (about a third) of fully open access journals that use the APC business model. APC data for 2016, 2015, 2014, and 2013 are primarily obtained from publishers’ websites, a process that requires analytic skill, as many publishers offer a diverse range of pricing options, including multiple currencies and/or differential pricing by article type, length, or work involved, and/or discounts for author contributions to editing, for the society publisher, or based on perceived ability to pay. This version of the dataset draws heavily from the work of Walt Crawford and includes his entire 2011–2015 dataset; in particular, Crawford’s work has made it possible to confirm “no publication fee” status for a large number of journals. DOAJ metadata for 2016 and 2014 and a 2010 APC sample provided by Solomon and Björk are part of the dataset. The inclusion of DOAJ metadata and the article counts by Crawford and by Solomon and Björk provides a basis for studies of factors such as journal size, subject, or country of publication that might be worth testing for correlation with business model and/or APC size.
Open Access Data Descriptor
Ecological and Functional Traits in 99 Bird Species over a Large-Scale Gradient in Germany
Data 2017, 2(2), 12; doi:10.3390/data2020012
Abstract
A gap still exists in published data on the variation of morphological and ecological traits of common bird species over a large area. To diminish this knowledge gap, we report average values of 24 ecological and functional traits for 99 bird species from three sites of the Biodiversity Exploratories in Germany. We present our own data on morphological and ecological traits of 28 common bird species and provide additional measurements for further species from published studies. This is a unique data set from live birds, which has not been published before and, in the presented coverage, is available neither from museums nor from any other collection. Dataset: available as the supplementary file. Dataset license: CC-BY.
Open Access Erratum
Erratum: Morrison, H., et al. Open Access Article Processing Charges (OA APC) Longitudinal Study 2015 Preliminary Dataset
Data 2017, 2(1), 11; doi:10.3390/data2010011
Abstract
The authors wish to make the following corrections to their paper [...]
Open Access Data Descriptor
Herbarium of the Pontifical Catholic University of Paraná (HUCP), Curitiba, Southern Brazil
Data 2017, 2(1), 10; doi:10.3390/data2010010
Abstract
The main objective of this paper is to present the herbarium of the Pontifical Catholic University of Paraná (HUCP) and its collection. The history of the HUCP began in the mid-1970s with the foundation of the Biology Museum, which gathered both botanical and zoological specimens. In April 1979, the collections were separated and the HUCP was founded with preserved specimens of algae (green, red, and brown), fungi, and embryophytes. As of October 2016, the collection encompasses nearly 25,000 specimens from 4934 species, 1609 genera, and 297 families. Most of the specimens come from the state of Paraná, but there are also specimens from many other Brazilian states and other countries, mainly from South America (Chile, Argentina, Uruguay, Paraguay, and Colombia) but also from other parts of the world (Cuba, USA, Spain, Germany, China, and Australia). Our collection includes 42 fungi, 258 gymnosperms, 299 bryophytes, 2809 pteridophytes, 3158 algae, 17,832 angiosperms, and a single type specimen (Mimosa tucumensis Barneby ex Ribas, M. Morales & Santos-Silva—Fabaceae). We also have botanical education and education-for-sustainability programs for basic and high school students, as well as training for teachers.
Open Access Article
The Effectiveness of Geographical Data in Multi-Criteria Evaluation of Landscape Services †
Data 2017, 2(1), 9; doi:10.3390/data2010009
Abstract
The aim of the paper is to map and evaluate the state of the multifunctional landscape of the municipality of Naples (Italy) and its surroundings through a Spatial Decision Support System (SDSS) combining a geographic information system (GIS) with a multi-criteria method, the analytic hierarchy process (AHP). We conceive a knowledge-mapping-evaluation (KME) framework in order to investigate the landscape as a complex system. The proposed methodology focuses on data gathering and processing. Both authoritative and unofficial sources, e.g., volunteered geographical information (VGI), are therefore useful tools to enhance the information flow whenever quality assurance is performed. The maps of spatial criteria thus support problem structuring and prioritization by taking into account the availability of context-aware data. Finally, the identification of landscape services (LS) and ecosystem services (ES) can improve decision-making processes within a multi-stakeholder perspective involving the evaluation of trade-offs. The results are multi-criteria choropleth maps of the LS and ES showing the density of services, their spatial distribution, and the surrounding benefits.
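The AHP step mentioned above can be made concrete with a short worked example: criterion weights are obtained as the normalized principal eigenvector of a pairwise comparison matrix. The matrix below is invented for illustration and is not taken from the paper.

```python
# Worked sketch of the AHP step: deriving criterion weights from a pairwise
# comparison matrix via its principal eigenvector.  The 3x3 matrix is a
# made-up example, not the judgements used in the paper.
import numpy as np

# Saaty-style reciprocal comparison matrix for three hypothetical criteria.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

eigenvalues, eigenvectors = np.linalg.eig(A)
principal = np.argmax(eigenvalues.real)
weights = np.abs(eigenvectors[:, principal].real)
weights /= weights.sum()                      # normalise to sum to 1

# Consistency ratio (CR); random index RI = 0.58 for a 3x3 matrix.
ci = (eigenvalues.real[principal] - len(A)) / (len(A) - 1)
cr = ci / 0.58
print("weights:", weights.round(3), "CR:", round(cr, 3))
```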
Open Access Data Descriptor
Data on Healthy Food Accessibility in Amsterdam, The Netherlands
Data 2017, 2(1), 7; doi:10.3390/data2010007
Abstract
This data descriptor introduces data on healthy food supplied by supermarkets in the city of Amsterdam, The Netherlands. In addition to two neighborhood variables (i.e., share of autochthons and average housing values), the data comprise three street network-based accessibility measures derived from analyses using a geographic information system. Data are provided on a spatial micro-scale, utilizing grid cells with a spatial resolution of 100 m. We explain how the data were collected and pre-processed, and how alternative analyses can be set up. To illustrate the use of the data, an example is provided using the R programming language.
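The data descriptor itself ships an R example; an equivalent minimal sketch in Python/pandas is shown below, with the file and column names being assumptions about how the grid-cell data might be exported.

```python
# Minimal sketch (file and column names are assumptions): relating a
# street-network accessibility measure to average housing values at the
# 100 m grid-cell level.
import pandas as pd

cells = pd.read_csv("amsterdam_grid_100m.csv")   # hypothetical export of the grid data

corr = cells["accessibility_nearest_supermarket"].corr(cells["avg_housing_value"])
print(f"Correlation between accessibility and housing value: {corr:.2f}")
```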
Open Access Article
An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data
Data 2017, 2(1), 8; doi:10.3390/data2010008
Abstract
Many clinical research datasets have a large percentage of missing values that directly impacts their usefulness in yielding high-accuracy classifiers when used for training in supervised machine learning. While missing value imputation methods have been shown to work well with smaller percentages of missing values, their ability to impute sparse clinical research data can be problem-specific. We previously attempted to learn quantitative guidelines for ordering cardiac magnetic resonance imaging during the evaluation for pediatric cardiomyopathy, but missing data significantly reduced our usable sample size. In this work, we sought to determine whether increasing the usable sample size through imputation would allow us to learn better guidelines. We first review several machine learning methods for estimating missing data. Then, we apply four popular methods (mean imputation, decision tree, k-nearest neighbors, and self-organizing maps) to a clinical research dataset of pediatric patients undergoing evaluation for cardiomyopathy. Using Bayesian Rule Learning (BRL) to learn ruleset models, we compared the performance of imputation-augmented models versus unaugmented models. We found that all four imputation-augmented models performed similarly to unaugmented models. While imputation did not improve performance, it did provide evidence for the robustness of our learned models.
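As an illustration of one of the four compared techniques, the following sketch applies k-nearest-neighbour imputation with scikit-learn to a toy matrix; it stands in for, and is not derived from, the clinical dataset used in the paper.

```python
# Illustrative sketch of k-nearest-neighbour imputation with scikit-learn.
# The toy matrix is a stand-in for the clinical dataset, which is not
# reproduced here.
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, np.nan, 6.0],
    [7.0, 8.0, 9.0],
    [np.nan, 5.0, 4.0],
])

imputer = KNNImputer(n_neighbors=2)
X_imputed = imputer.fit_transform(X)
print(X_imputed)
```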
Open Access Technical Note
Determination of Concentration of the Aqueous Lithium–Bromide Solution in a Vapour Absorption Refrigeration System by Measurement of Electrical Conductivity and Temperature
Data 2017, 2(1), 6; doi:10.3390/data2010006
Abstract
Lithium–bromide/water (LiBr/water) pairs are widely used as a working medium in vapour absorption refrigeration systems, where the maximum expected temperature and LiBr mass concentration in solution are usually 95 °C and 65%, respectively. Unfortunately, published data on the electrical conductivity of aqueous lithium–bromide solution are scarce and contradictory. The objective of this paper is to develop an empirical equation for determining the concentration of the aqueous lithium–bromide solution during the operation of a vapour absorption refrigeration system when the electrical conductivity and temperature of the solution are known. The present study experimentally investigated the electrical conductivity of aqueous lithium–bromide solution at temperatures from 25 °C to 95 °C and concentrations from 45% to 65% by mass, using a submersion toroidal conductivity sensor connected to a conductivity meter. The results of the tests show this method to be an accurate and efficient way to determine the concentration of aqueous lithium–bromide solution in a vapour absorption refrigeration system.
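The kind of empirical correlation the paper develops can be sketched as a least-squares fit of concentration against conductivity and temperature; the functional form and the calibration points below are placeholders, not the paper's measurements or its final equation.

```python
# Hedged sketch: fitting concentration X as a low-order polynomial in
# conductivity k and temperature T by least squares.  Functional form and
# data points are placeholders, not the paper's measurements or equation.
import numpy as np

# Hypothetical calibration measurements: conductivity (mS/cm), temperature (°C), mass %.
k = np.array([210.0, 230.0, 250.0, 270.0, 290.0])
T = np.array([25.0, 45.0, 65.0, 80.0, 95.0])
X = np.array([45.0, 50.0, 55.0, 60.0, 65.0])

# Design matrix for X = a0 + a1*k + a2*T + a3*k*T
A = np.column_stack([np.ones_like(k), k, T, k * T])
coeffs, *_ = np.linalg.lstsq(A, X, rcond=None)

def concentration(k_meas, T_meas):
    """Predict LiBr mass concentration (%) from conductivity and temperature."""
    return coeffs @ np.array([1.0, k_meas, T_meas, k_meas * T_meas])

print(concentration(240.0, 55.0))
```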
Open Access Article
Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure
Data 2017, 2(1), 5; doi:10.3390/data2010005
Abstract
The comprehensibility of good predictive models learned from high-dimensional gene expression data is attractive because it can lead to biomarker discovery. Several good classifiers provide comparable predictive performance but differ in their abilities to summarize the observed data. We extend a Bayesian Rule Learning (BRL-GSS) algorithm, previously shown to be a significantly better predictor than other classical approaches in this domain. It searches a space of Bayesian networks using a decision tree representation of its parameters with global constraints, and infers a set of IF-THEN rules. The number of parameters, and therefore the number of rules, is combinatorial in the number of predictor variables in the model. We relax these global constraints to learn a more expressive local structure with BRL-LSS. BRL-LSS entails a more parsimonious set of rules because it does not have to generate all combinatorial rules. The search space of local structures is much richer than the space of global structures. We design BRL-LSS with the same worst-case time complexity as BRL-GSS while exploring a richer and more complex model space. We measure predictive performance using the area under the ROC curve (AUC) and accuracy. We measure model parsimony by noting the average number of rules and variables needed to describe the observed data. We evaluate the predictive and parsimony performance of BRL-GSS, BRL-LSS, and the state-of-the-art C4.5 decision tree algorithm using 10-fold cross-validation on ten microarray gene-expression diagnostic datasets. In these experiments, we observe that BRL-LSS is similar to BRL-GSS in terms of predictive performance, while generating a much more parsimonious set of rules to explain the same observed data. BRL-LSS also needs fewer variables than C4.5 to explain the data with similar predictive performance. We also conduct a feasibility study to demonstrate the general applicability of our BRL methods to newer RNA sequencing gene-expression data.
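The evaluation setup (10-fold cross-validation with AUC) can be reproduced generically as follows, using a scikit-learn decision tree as a stand-in for BRL/C4.5 and synthetic data in place of the microarray datasets.

```python
# Small sketch of the evaluation setup: 10-fold cross-validation with AUC.
# A scikit-learn decision tree stands in for BRL/C4.5, and synthetic data
# replaces the microarray gene-expression sets.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=500, n_informative=20,
                           random_state=0)          # synthetic "gene expression"
clf = DecisionTreeClassifier(max_depth=4, random_state=0)

aucs = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
print(f"Mean AUC over 10 folds: {aucs.mean():.3f} +/- {aucs.std():.3f}")
```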
Open Access Editorial
Acknowledgement to Reviewers of Data in 2016
Data 2017, 2(1), 4; doi:10.3390/data2010004
Abstract
The editors of Data would like to express their sincere gratitude to the following reviewers for assessing manuscripts in 2016. [...]
Open Access Data Descriptor
Scanned Image Data from 3D-Printed Specimens Using Fused Deposition Modeling
Data 2017, 2(1), 3; doi:10.3390/data2010003
Abstract
This dataset provides high-resolution 2D scans of 3D-printed test objects (dog-bone specimens) derived from EN ISO 527-2:2012. The specimens were scanned at resolutions from 600 dpi to 4800 dpi utilising a Konica Minolta bizhub 42 and a Canon LiDE 210 scanner. The specimens were created to research the influence of the infill-pattern orientation and the print orientation on the geometrical fidelity and the structural strength. The specimens were printed on a MakerBot Replicator 2X 3D printer using yellow (ABS 1.75 mm Yellow, REC, Moscow, Russia) and purple ABS plastic (ABS 1.75 mm Pink Lion&Fox, Hamburg, Germany). The dataset consists of at least one scan per specimen together with the measured dimensional characteristics; for this, software was created and is described within this work. Specimens from this dataset were scanned either on blank white paper or on white paper with blue millimetre markings. The printing experiment contains a number of failed prints; specimens that did not fulfil the expected geometry were scanned separately and are of lower quality due to the inability to scan objects with a non-flat surface. For a number of printed specimens, sensor data were acquired during the printing process. The dataset consists of 193 specimen scans in PNG format of 127 objects, each with unadjusted raw graphical data and a corresponding annotated, post-processed image. Annotated data include the detected object, its geometrical characteristics, and file information. Computer-extracted geometrical information is supplied for the images where automated geometrical feature extraction was possible.
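A minimal sketch of the automated dimension-extraction idea is shown below: threshold a greyscale scan, take the specimen's bounding box, and convert pixels to millimetres via the scan resolution; the file name and 600 dpi value are placeholders, and this is not the software described in the paper.

```python
# Minimal sketch of dimension extraction from one raw scan: threshold the
# greyscale image, take the bounding box of the dark specimen, and convert
# pixels to millimetres using the scan resolution.  File name and DPI are
# placeholders; this is not the paper's measurement software.
import numpy as np
from PIL import Image

DPI = 600                                   # resolution of this particular scan
img = np.asarray(Image.open("specimen_001.png").convert("L"))  # hypothetical file

mask = img < 128                            # specimen is darker than the white paper
rows = np.any(mask, axis=1)
cols = np.any(mask, axis=0)
height_px = rows.nonzero()[0].ptp() + 1
width_px = cols.nonzero()[0].ptp() + 1

mm_per_px = 25.4 / DPI
print(f"Bounding box: {width_px * mm_per_px:.2f} mm x {height_px * mm_per_px:.2f} mm")
```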
Open Access Article
How to Make Sense of Team Sport Data: From Acquisition to Data Modeling and Research Aspects
Data 2017, 2(1), 2; doi:10.3390/data2010002
Abstract
Automatic and interactive data analysis is instrumental in making use of increasing amounts of complex data. Owing to novel sensor modalities, the analysis of data generated in professional team sport leagues such as soccer, baseball, and basketball has recently attracted attention, with potentially high commercial and research interest. The analysis of team ball games can serve many goals, e.g., in coaching to understand the effects of strategies and tactics, or to derive insights for improving performance. It is also often decisive for trainers and analysts to understand why a certain movement of a player or group of players happened and what the respective influencing factors were. We consider team sport as group movement, including collaboration and competition of individuals following specific rule sets. Analyzing team sports is a challenging problem, as it involves the joint understanding of heterogeneous data perspectives, including high-dimensional, video, and movement data, as well as team behavior and the rules (constraints) of the particular team sport. We identify important components of team sport data, exemplified by the soccer case, and explain how to analyze team sport data in general. We identify challenges arising when facing these data sets and propose a multi-facet view and analysis including pattern detection, context-aware analysis, and visual explanation. We also present applicable methods and technologies covering the heterogeneous aspects of team sport data.