Open Access Data Descriptor
RAE: The Rainforest Automation Energy Dataset for Smart Grid Meter Data Analysis
Data 2018, 3(1), 8; doi:10.3390/data3010008
Abstract
Datasets are important for researchers to build models and test how well their machine learning algorithms perform. This paper presents the Rainforest Automation Energy (RAE) dataset to help smart grid researchers test their algorithms that make use of smart meter data. This initial release of RAE contains 1 Hz data (mains and sub-meters) from two residential houses. In addition to power data, environmental and sensor data from the house’s thermostat are included. Sub-meter data from one of the houses includes heat pump and rental suite captures, which are of interest to power utilities. We also show an energy breakdown of each house and show (by example) how RAE can be used to test non-intrusive load monitoring (NILM) algorithms.
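As a rough illustration of the energy breakdown mentioned in the abstract above, the following sketch converts 1 Hz sub-meter power samples into each load's share of the total sub-metered energy. The function name, the dictionary layout, and the sample readings are assumptions made for this example; they are not the RAE file format or values from the dataset.

```python
from typing import Dict, List


def energy_breakdown(submeters: Dict[str, List[float]], period_s: float = 1.0) -> Dict[str, float]:
    """Return each sub-meter's fraction of the total sub-metered energy."""
    # Energy in watt-hours: sum of power samples (W) times the sample period (s), over 3600 s/h.
    energy_wh = {name: sum(watts) * period_s / 3600.0 for name, watts in submeters.items()}
    total = sum(energy_wh.values())
    if total == 0:
        return {name: 0.0 for name in energy_wh}
    return {name: wh / total for name, wh in energy_wh.items()}


# Made-up readings for illustration only; these are not values from RAE.
example = {
    "heat_pump": [1500.0, 1500.0, 1500.0, 1500.0],
    "fridge": [100.0, 100.0, 100.0, 100.0],
    "rental_suite": [400.0, 400.0, 400.0, 400.0],
}
shares = energy_breakdown(example)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

A NILM algorithm would try to recover these per-load shares from the mains signal alone; the sub-metered shares then serve as the ground truth to score against.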

Open Access Article
Uttarakhand Medicinal Plants Database (UMPDB): A Platform for Exploring Genomic, Chemical, and Traditional Knowledge
Data 2018, 3(1), 7; doi:10.3390/data3010007
Abstract
Medicinal plants are the main natural pools for the primary health care system, ethno-medicine, and the traditional Indian systems of medicine. Uttarakhand, also known as the ‘Herbal State’, is a rich source of medicinal plants and traditional medicinal knowledge. A great deal of information about medicinal plants of Uttarakhand is scattered in different forms. Although many medicinal plant databases are available, there is currently no cohesive, manually curated database of medicinal plants widely distributed in Uttarakhand state. A comprehensive database has been developed, known as the Uttarakhand Medicinal Plants Database (UMPDB). UMPDB provides extensive information on botanical name, common name, taxonomy, genomic taxonomy id, habit, habitat, location in Uttarakhand, part used, medicinal use, genomic information (including number of nucleotides, proteins, ESTs), chemical information, and scientific literature. Annotated medicinal plants integrated in the current version of the database were collected from existing books, databases, and available literature. The current version of UMPDB contains 1127 records of medicinal plants, which belong to 153 plant families distributed across 13 districts of Uttarakhand. The primary goal of developing this database is to provide traditional, genomic, and chemical descriptions of the medicinal plants exclusively found in various regions of Uttarakhand. We anticipate that the information embedded in the database will help users to readily obtain desired information.

Open Access Editorial
Acknowledgement to Reviewers of Data in 2017
Data 2018, 3(1), 6; doi:10.3390/data3010006
Abstract
Peer review is an essential part of the publication process, ensuring that Data maintains high quality standards for its published papers [...]
Open Access Data Descriptor
Thirty Thousand 3D Models from Thingiverse
Data 2018, 3(1), 5; doi:10.3390/data3010005
Abstract
This dataset contains files and geometrical analysis of 3D model data, acquired from the Thingiverse online repository. More than thirty thousand stereolithography files (STL) were retrieved and analysed. The geometrical analysis of the respective models is presented along with model renderings in both GIF and PNG format, and pre-sliced machine instructions as GCode. This dataset is intended to be used as a basis for further research in Additive Manufacturing (AM), such as 3D printing time estimation, printability assessment or slicing algorithm development. All files retrieved are user-generated, with the respective user and associated licence presented in the overview. The dataset was acquired between 2016 and 2017.

Open Access Data Descriptor
Long-Term WiFi Fingerprinting Dataset for Research on Robust Indoor Positioning
Data 2018, 3(1), 3; doi:10.3390/data3010003
Abstract
WiFi fingerprinting, one of the most popular methods employed in indoor positioning, currently faces two major problems: lack of robustness to short- and long-term signal changes and difficult reproducibility of new methods presented in the relevant literature. This paper presents a WiFi RSS (Received Signal Strength) database created to foster and ease research that addresses these two problems. A trained professional took several consecutive fingerprints while standing at specific positions and facing specific directions. The consecutive fingerprints may enable the study of short-term signal variations. The data collection spanned 15 months, and, for each month, one type of training dataset and five types of test dataset were collected. The measurements of a dataset type (training or test) were taken at the same positions and directions every month, in order to enable the analysis of long-term signal variations. The database is provided with supporting materials and software, which give more information about the collection environment and ease the use of the database, respectively. The WiFi measurements and the supporting materials are available at the Zenodo repository under the open-source MIT license.
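For readers unfamiliar with fingerprinting, the following minimal sketch shows one common baseline, k-nearest-neighbour positioning over RSS vectors. It is not the software shipped with the database; the data layout, the access point names, the sample values, and the -100 dBm placeholder for unheard access points are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Fingerprint = Dict[str, float]   # access point id -> RSS in dBm
Position = Tuple[float, float]   # (x, y) in metres

MISSING_RSS = -100.0  # assumed placeholder for access points not heard in a scan


def rss_distance(a: Fingerprint, b: Fingerprint) -> float:
    """Euclidean distance between two fingerprints over the union of their access points."""
    aps = set(a) | set(b)
    return math.sqrt(sum((a.get(ap, MISSING_RSS) - b.get(ap, MISSING_RSS)) ** 2 for ap in aps))


def knn_position(query: Fingerprint, training: List[Tuple[Fingerprint, Position]], k: int = 3) -> Position:
    """Estimate a position as the centroid of the k training points closest in RSS space."""
    if not training or k < 1:
        raise ValueError("need at least one training fingerprint and k >= 1")
    nearest = sorted(training, key=lambda item: rss_distance(query, item[0]))[:k]
    x = sum(pos[0] for _, pos in nearest) / len(nearest)
    y = sum(pos[1] for _, pos in nearest) / len(nearest)
    return (x, y)


# Made-up training fingerprints for illustration only.
training = [
    ({"ap1": -40.0, "ap2": -70.0}, (0.0, 0.0)),
    ({"ap1": -60.0, "ap2": -55.0}, (5.0, 0.0)),
    ({"ap1": -80.0, "ap2": -45.0}, (10.0, 0.0)),
]
query = {"ap1": -42.0, "ap2": -69.0}
print(knn_position(query, training, k=1))
print(knn_position(query, training, k=2))
```

Running a baseline like this against training data from one month and test data from a later month is one way to expose the long-term signal drift the dataset is designed to study.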

Open Access Article
CoeViz: A Web-Based Integrative Platform for Interactive Visualization of Large Similarity and Distance Matrices
Data 2018, 3(1), 4; doi:10.3390/data3010004
Abstract
Similarity and distance matrices are general data structures that describe reciprocal relationships between the objects within a given dataset. Commonly used methods for representation of these matrices include heatmaps, hierarchical trees, dimensionality reduction, and various types of networks. However, despite a well-developed foundation for the visualization of such representations, the challenge of creating an interactive view that would allow for quick data navigation and interpretation remains largely unaddressed. This problem becomes especially evident for large matrices with hundreds or thousands of objects. In this work, we present a web-based platform for the interactive analysis of large (dis-)similarity matrices. It consists of four major interconnected and synchronized components: a zoomable heatmap, interactive hierarchical tree, scalable circular relationship diagram, and 3D multi-dimensional scaling (MDS) scatterplot. We demonstrate the use of the platform for the analysis of amino acid covariance data in proteins as part of our previously developed CoeViz tool. The web platform enables quick and focused analysis of protein features, such as structural domains and functional sites.

Open Access Data Descriptor
A Data Set of Human Body Movements for Physical Rehabilitation Exercises
Data 2018, 3(1), 2; doi:10.3390/data3010002
Abstract
The article presents the University of Idaho-Physical Rehabilitation Movement Data (UI-PRMD), a publicly available data set of movements related to common exercises performed by patients in physical rehabilitation programs. For the data collection, 10 healthy subjects performed 10 repetitions of different physical therapy movements, with a Vicon optical tracker and a Microsoft Kinect sensor used for motion capture. The data are in a format that includes positions and angles of full-body joints. The objective of the data set is to provide a basis for mathematical modeling of therapy movements, as well as for establishing performance metrics for evaluation of patient consistency in executing the prescribed rehabilitation exercises.

Open Access Data Descriptor
World Ocean Isopycnal Level Absolute Geostrophic Velocity (WOIL-V) Inverted from GDEM with the P-Vector Method
Data 2018, 3(1), 1; doi:10.3390/data3010001
Abstract
A three-dimensional dataset of world ocean climatological annual and monthly mean absolute geostrophic velocity at isopycnal levels (called WOIL-V) has been produced from the United States (U.S.) Navy’s Generalized Digital Environmental Model (GDEM) temperature and salinity fields (open access from the website http://data.nodc.noaa.gov/cgi-bin/iso?id=gov.noaa.nodc:9600094) using the P-vector method. The data have a horizontal resolution of 0.5° × 0.5° and 222 isopycnal levels. The 13 data files include annual and monthly mean values. WOIL-V is the only dataset of absolute geostrophic velocity at isopycnal levels compatible with the GDEM (T, S) fields, and provides background ocean currents for oceanographic and climatic studies, especially in ocean modeling with the isopycnal coordinate system.

Open Access Article
Investigating the Evolution of Linkage Dynamics among Equity Markets Using Network Models and Measures: The Case of Asian Equity Market Integration
Data 2017, 2(4), 41; doi:10.3390/data2040041
Abstract
The state of cross-market linkage structures and their stability over varying time periods play a key role in the performance of internationally diversified portfolios. There has been an increasing interest of global investors in emerging capital markets in the Asian region. In this setting, an investigation into the temporal dynamics of cross-market linkage structures becomes significant for the selection and optimal allocation of securities in an internationally diversified portfolio. To this end, the current study employs weighted network models along with network metrics to decipher the underlying cross-market linkage structures among Asian markets. The study analyses the daily return data of fourteen major Asian indices for a period of 14 years (2002–2016). The topological properties of the network are computed using centrality measures and measures of influence strength and are investigated over temporal scales. In particular, the overall influence strengths and India-specific influence strengths are computed and examined over a temporal scale. Threshold filtering is also performed to characterize the dynamics related to the linkage structure of these networks. The impacts of the 2008 financial crisis on the linkage structural patterns of these equity networks are also investigated. The key findings of this study include: a set of central and peripheral indices, the evolution of the linkage structures over the 2002–2016 period and the linkage dynamics during times of market stress. Notably, the set of indices possessing influence over the Asian region in general and the Indian market in particular is also identified. The findings of this study can be utilized in effective systemic risk management and for the selection of an optimally diversified portfolio, resilient to system-level shocks.

Open Access Data Descriptor
GasLib—A Library of Gas Network Instances
Data 2017, 2(4), 40; doi:10.3390/data2040040
Abstract
The development of mathematical simulation and optimization models and algorithms for solving gas transport problems is an active field of research. In order to test and compare these models and algorithms, gas network instances together with demand data are needed. The goal of GasLib is to provide a set of publicly available gas network instances that can be used by researchers in the field of gas transport. The advantages are that researchers save time by using these instances and that different models and algorithms can be compared on the same specified test sets. The library instances are encoded in an XML (extensible markup language) format. In this paper, we explain this format and present the instances that are available in the library.

Open Access Article
Congestion Quantification Using the National Performance Management Research Data Set
Data 2017, 2(4), 39; doi:10.3390/data2040039
Abstract
Monitoring of transportation system performance is a key element of any transportation operation and planning strategy. Estimation of dependable performance measures relies on analysis of large amounts of traffic data, which are often expensive and difficult to gather. National databases can assist in this regard, but challenges remain with respect to data management, accuracy, storage, and use for performance monitoring. In an effort to address such challenges, this paper showcases a process that utilizes the National Performance Management Research Data Set (NPMRDS) for generating performance measures for congestion monitoring applications in the Birmingham region. The capabilities of a relational database management system (RDBMS) are employed to manage the large amounts of NPMRDS data. Visual maps are developed using GIS software and used to illustrate congestion location, extent and severity. Travel time reliability indices are calculated and utilized to quantify congestion, and congestion intensity measures are developed and employed to rank and prioritize congested segments in the study area. The process for managing and using big traffic data described in the Birmingham case study is an example that can be replicated by small and mid-size Metropolitan Planning Organizations to generate performance-based measures and monitor congestion in their jurisdictions.
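The abstract above does not say which travel time reliability indices were computed. As an illustration only, the sketch below evaluates three widely used measures for a single road segment: the travel time index (mean over free-flow time), the planning time index (95th percentile over free-flow time), and the buffer index (the gap between the 95th percentile and the mean, relative to the mean). The function names, the nearest-rank percentile rule, and the sample travel times are assumptions, not the paper's procedure or NPMRDS values.

```python
import math
from typing import Dict, List


def percentile_nearest_rank(values: List[float], p: int) -> float:
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]


def reliability_indices(travel_times: List[float], free_flow_time: float) -> Dict[str, float]:
    """Compute three common reliability measures for one segment."""
    if not travel_times or free_flow_time <= 0:
        raise ValueError("need travel times and a positive free-flow time")
    mean_tt = sum(travel_times) / len(travel_times)
    p95 = percentile_nearest_rank(travel_times, 95)
    return {
        "travel_time_index": mean_tt / free_flow_time,
        "planning_time_index": p95 / free_flow_time,
        "buffer_index": (p95 - mean_tt) / mean_tt,
    }


# Made-up travel times in minutes for illustration only.
observed = [10.0] * 9 + [20.0]
indices = reliability_indices(observed, free_flow_time=8.0)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Segments can then be ranked by any of these values, which is the kind of prioritization step the abstract describes.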

Open Access Data Descriptor
Antibody Exchange: Information Extraction of Biological Antibody Donation and a Web-Portal to Find Donors and Seekers
Data 2017, 2(4), 38; doi:10.3390/data2040038
Abstract
Bio-molecular reagents, such as antibodies, that are required in experimental biology are expensive, and their effectiveness, among other things, is critical to the success of the experiment. Although such resources are sometimes donated by one investigator to another through personal communication between the two, there is no previous study to our knowledge on the extent of such donations, nor a central platform that directs resource seekers to donors. In this paper, we describe, to our knowledge, a first attempt at building a web-portal titled Antibody Exchange (or, more generally, ‘Bio-Resource Exchange’) that attempts to bridge this gap between resource seekers and donors in the domain of experimental biology. Users on this portal can request or donate antibodies, cell-lines, and DNA constructs. This resource could also serve as a crowd-sourced database of resources for experimental biology. Further, we also studied the extent of antibody donations by mining the acknowledgement sections of scientific articles. Specifically, we extracted the name of the donor, his/her affiliation, and the name of the antibody for every donation by parsing the acknowledgements sections of articles. To extract annotations at this level, we adopted two approaches: a rule-based algorithm and a bootstrapped pattern learning algorithm. The algorithms extracted donor names, affiliations, and antibody names with average accuracies of 57% and 62%, respectively. We also created a dataset of 50 expert-annotated acknowledgements sections that will serve as a gold standard dataset to evaluate extraction algorithms in the future.

Open Access Article
Regionalization of a Landscape-Based Hazard Index of Malaria Transmission: An Example of the State of Amapá, Brazil
Data 2017, 2(4), 37; doi:10.3390/data2040037
Abstract
Identifying and assessing the relative effects of the numerous determinants of malaria transmission, at different spatial scales and resolutions, is of primary importance in defining control strategies and reaching the goal of the elimination of malaria. In this context, based on a knowledge-based model, a normalized landscape-based hazard index (NLHI) was established at a local scale, using a 10 m spatial resolution forest vs. non-forest map, landscape metrics and a spatial moving window. Such an index evaluates the contribution of landscape to the probability of human-malaria vector encounters, and thus to malaria transmission risk. Since the knowledge-based model is tailored to the entire Amazon region, such an index might be generalized at large scales for establishing a regional view of the landscape contribution to malaria transmission. Thus, this study uses an open large-scale land use and land cover dataset (i.e., the 30 m TerraClass maps) and proposes an automatic data-processing chain for implementing NLHI at large scale. First, the impact of coarser spatial resolution (i.e., 30 m) on NLHI values was studied. Second, the data-processing chain was established using the R language for customizing the spatial moving window and computing the landscape metrics and NLHI at large scale. This paper presents the results in the State of Amapá, Brazil. It offers the possibility of monitoring a significant determinant of malaria transmission at regional scale.

Open Access Data Descriptor
Database of Himalayan Plants Based on Published Floras during a Century
Data 2017, 2(4), 36; doi:10.3390/data2040036
Abstract
The Himalaya is the largest mountain range in the world, spanning approximately ten degrees of latitude and elevations from 100 m asl to the highest mountain peak on Earth. The region varies in plant species richness, which is highest in the biodiversity hotspot of the Eastern Himalaya and declines towards the North-Western parts of the Himalaya. We examined all published floras (31 floras in 42 volumes spanning the years 1903–2014) from the Indian Himalayan region, Nepal, and Bhutan to compile a comprehensive checklist of all gymnosperms and angiosperms. A total of 10,503 species representing 240 families and 2322 genera are reported. We evaluated all the botanical names reported in the floras for their updated taxonomy and excluded >3000 synonyms. Additionally, we identified 1134 species reported in these floras that presently remain taxonomically unresolved and 160 species with missing information in the global plant database (The Plant List, 2013). This is the most comprehensive estimate of plant species diversity in the Himalaya.

Open Access Article
Earth Observation for Citizen Science Validation, or Citizen Science for Earth Observation Validation? The Role of Quality Assurance of Volunteered Observations
Data 2017, 2(4), 35; doi:10.3390/data2040035
Abstract
Environmental policy involving citizen science (CS) is of growing interest. In support of this open data stream of information, validating or assessing the quality of CS geo-located data for appropriate use in evidence-based policy making needs a flexible and easily adaptable data curation process that ensures transparency. Addressing these needs, this paper describes an approach for automatic quality assurance as proposed by the Citizen OBservatory WEB (COBWEB) FP7 project. This approach is based upon a workflow composition that combines different quality controls, each belonging to one of seven categories or “pillars”. Each pillar focuses on a specific dimension in the types of reasoning algorithms for CS data qualification. These pillars attribute values to a range of quality elements belonging to three complementary quality models. Additional data from various sources, such as Earth Observation (EO) data, are often included as part of the inputs of quality controls within the pillars. However, qualified CS data can also contribute to the validation of EO data. Therefore, the question of validation can be considered as “two sides of the same coin”. Based on an invasive species CS study concerning Fallopia japonica (Japanese knotweed), the paper discusses the flexibility and usefulness of qualifying CS data, either when using an EO data product for the validation within the quality assurance process, or when validating an EO data product that describes the risk of occurrence of the plant. Both validation paths are found to be improved by quality assurance of the CS data. Addressing the reliability of CS open data, the paper describes issues and limitations of quality assurance for validation that arise from the quality of secondary data used within the automatic workflow (e.g., error propagation), paving the way for improvements in the approach.

Open Access Data Descriptor
The #BTW17 Twitter Dataset–Recorded Tweets of the Federal Election Campaigns of 2017 for the 19th German Bundestag
Data 2017, 2(4), 34; doi:10.3390/data2040034
Abstract
The German Bundestag elections are the most important elections in Germany. This dataset comprises Twitter interactions related to German politicians of the most important political parties over several months in the (pre-)phase of the German federal election campaigns in 2017. The Twitter accounts of more than 360 politicians were followed for four months. The collected data comprise a sample of approximately 10 GB of Twitter raw data, and they cover more than 120,000 active Twitter users and more than 1,200,000 recorded tweets. Even without sophisticated data analysis techniques, it was possible to deduce a likely political party proximity for more than half of these accounts simply by looking at the re-tweet behavior. This might be of interest for innovative data-driven party campaign strategists in the future. Furthermore, it is observable that, in Germany, supporters and politicians of populist parties make use of Twitter much more intensively and aggressively than supporters of other parties. Established left-wing parties also seem to be more active on Twitter than established conservative parties. The dataset can be used to study how political parties, their followers and supporters make use of social media channels in political election campaigns and what kind of content is shared.
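The abstract above states that party proximity could be deduced from re-tweet behavior without sophisticated techniques, but it does not give the rule used. One simple reading, sketched below as an assumption, is to assign each account to the party whose politicians it re-tweets most, provided that party accounts for more than a chosen share of the account's re-tweets. The function name, the input layout, the threshold, and the placeholder account and party names are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def party_proximity(
    retweets: Iterable[Tuple[str, str]],
    politician_party: Dict[str, str],
    min_share: float = 0.5,
) -> Dict[str, str]:
    """Map each user to the party they re-tweet most, if its share exceeds min_share."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for user, retweeted_account in retweets:
        party = politician_party.get(retweeted_account)
        if party is not None:  # ignore re-tweets of accounts that are not tracked politicians
            counts[user][party] += 1
    proximity = {}
    for user, per_party in counts.items():
        party, n = per_party.most_common(1)[0]
        if n / sum(per_party.values()) > min_share:
            proximity[user] = party
    return proximity


# Placeholder accounts and parties for illustration only.
politician_party = {"pol_1": "party_a", "pol_2": "party_b"}
retweets = [
    ("user_1", "pol_1"), ("user_1", "pol_1"), ("user_1", "pol_2"),
    ("user_2", "pol_1"), ("user_2", "pol_2"),
    ("user_3", "someone_else"),
]
result = party_proximity(retweets, politician_party)
print(result)
```

Accounts with an even split or with no re-tweets of tracked politicians are left unassigned, which matches the abstract's observation that a likely proximity could be deduced for only part of the accounts.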

Open Access Article
Temporal Statistical Analysis of Degree Distributions in an Undirected Landline Phone Call Network Graph Series
Data 2017, 2(4), 33; doi:10.3390/data2040033
Abstract
This article aims to provide new results about the intraday degree sequence distribution considering phone call network graph evolution in time. More specifically, it tackles the following problem. Given a large amount of landline phone call data records, what is the best way to summarize the distinct number of calling partners per client per day? In order to answer this question, a series of undirected phone call network graphs is constructed based on data from a local telecommunication source in Albania. All network graphs of the series are simplified. Further, a longitudinal temporal study of the degree distributions is made on this network graph series. Power law and log-normal distribution fittings on the degree sequence are compared on each of the network graphs of the series. The maximum likelihood method is used to estimate the parameters of the distributions, and a Kolmogorov–Smirnov test associated with a p-value is used to define the plausible models. A direct distribution comparison is made through a Vuong test in the case that both distributions are plausible. Another goal was to describe the shape of the parameters’ distributions. A Shapiro–Wilk test is used to test the normality of the data, and measures of shape are used to define the distributions’ shape. The findings suggest that a log-normal distribution better models the intraday degree sequence data of the network graphs. It is not possible to say that the distributions of log-normal parameters are normal.
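The first two steps in the abstract above, building a simplified undirected graph from one day of call records and estimating log-normal parameters by maximum likelihood, can be sketched as follows. This is an illustration under stated assumptions: the record layout and sample calls are invented, and the estimator is the textbook continuous log-normal MLE (mean and standard deviation of the log degrees), which may differ from the discrete or truncated fit used in the paper.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def degree_sequence(calls: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct calling partners per client in a simplified undirected graph."""
    partners = defaultdict(set)
    for caller, callee in calls:
        if caller == callee:
            continue  # simplification: drop self-loops
        # Sets collapse repeated calls between the same pair into a single edge.
        partners[caller].add(callee)
        partners[callee].add(caller)
    return {client: len(p) for client, p in partners.items()}


def lognormal_mle(values: List[float]) -> Tuple[float, float]:
    """Continuous log-normal MLE: mean and (population) standard deviation of log-values."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma


# Made-up call records for one day, for illustration only.
calls = [("a", "b"), ("b", "a"), ("a", "c"), ("a", "a"), ("c", "d")]
degrees = degree_sequence(calls)
mu, sigma = lognormal_mle(list(degrees.values()))
print(degrees, mu, sigma)
```

Repeating this per day yields the series of parameter estimates whose distribution the study then examines for normality.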

Open Access Data Descriptor
Wi-Fi Crowdsourced Fingerprinting Dataset for Indoor Positioning
Data 2017, 2(4), 32; doi:10.3390/data2040032
Abstract
Benchmark open-source Wi-Fi fingerprinting datasets for indoor positioning studies are still hard to find in the current literature and existing public repositories. This is unlike other research fields, such as the image processing field, where benchmark test images such as the Lenna image or Face Recognition Technology (FERET) databases exist, or the machine learning field, where huge datasets are available, for example, at the University of California Irvine (UCI) Machine Learning Repository. It is the purpose of this paper to present a new openly available Wi-Fi fingerprint dataset, comprising 4648 fingerprints collected with 21 devices in a university building in Tampere, Finland, and to present some benchmark indoor positioning results using these data. The datasets and the benchmarking software are distributed under the open-source MIT license and can be found on the EU Zenodo repository.

Open Access Article
An Improved Power Law for Nonlinear Least-Squares Fitting?
Data 2017, 2(3), 31; doi:10.3390/data2030031
Abstract
Models based on a power law are prevalent in many areas of study. When regression analysis is performed on data sets modeled by a power law, the traditional model uses a lead coefficient. However, the proposed model replaces the lead coefficient with a scaling parameter and reduces uncertainties in best-fit parameters for data sets with exponents close to 3. This study extends previous work by testing each model for a range of parameters. Data sets with known values of scaling parameter and exponent were generated by adding normally distributed random errors with controlled mean and standard deviations to underlying power laws. These data sets were then analyzed for both forms of the power law. For the scaling parameter, the proposed model provided smaller errors in 96/180 cases and smaller uncertainties in 88/180 cases. In most remaining cases, the traditional model provided smaller errors or uncertainties. Examination of conditions indicates that the proposed law has potential in select cases, but due to ambiguity in the conditions which favor one model over the other, an approach similar to the one in this study is encouraged for determining which model will offer reduced errors and uncertainties in data sets where additional accuracy is desired.

Open Access Article
Estimating Cost Savings from Early Cancer Diagnosis
Data 2017, 2(3), 30; doi:10.3390/data2030030
Abstract
We estimate treatment cost-savings from early cancer diagnosis. For breast, lung, prostate and colorectal cancers and melanoma, which account for more than 50% of new incidences projected in 2017, we combine published cancer treatment cost estimates by stage with incidence rates by stage at diagnosis. We extrapolate to other cancer sites by using estimated national expenditures and incidence rates. A rough estimate for the U.S. national annual treatment cost-savings from early cancer diagnosis runs to 11 digits. Using this estimate and cost-neutrality, we also estimate a rough upper bound on the cost of a routine early cancer screening test.