1. Introduction
According to the International Agency for Research on Cancer, more than 19.2 million new cases of cancer were registered in 2020. Accounting for almost 10 million deaths, cancer is one of the world’s leading causes of death [1]. The survival rates are influenced by many factors, e.g., the type of cancer or the stage at which it was diagnosed. For prostate cancer, as an example, the 1-year survival rate is 97% (5-year survival rate 87%), while it is only 41% for lung cancer (5-year survival rate 17%; data for the United Kingdom [2]). Similarly, the probability of recurrence differs considerably: for childhood acute myeloid leukemia (AML), it is reported to vary between 9% and 29% [3] and, for melanoma, even between 5% and 67% (within 180 days), depending on stage [4].
The causes of cancer are diverse. In addition to genetic predisposition, environmental factors play a major role. Exposure to physical, chemical, or biological carcinogens favors the development of mutations, which can lead to the transformation of normal cells into tumor cells [5]. For many types of cancer, the precise mutational characterization of a tumor is essential, influencing diagnosis, prognosis, and therapy. In the case of myelodysplastic syndromes (MDS), the Revised International Prognostic Scoring System (IPSS-R), which considers, among other factors, the presence of cytogenetic abnormalities, is commonly used to stratify patients for high vs. low risk of AML transformation [6]. A recent study [7] provides evidence that the further subgrouping of low-risk MDS patients can be performed based on their mutational profile. For Burkitt lymphoma (BL), there is evidence that the presence of a double-hit event, that is, two variants affecting the gene TP53, is associated with relapse [8]. For acute lymphoblastic leukemia (ALL), therapy with a tyrosine kinase inhibitor, such as Imatinib, is assumed to be highly beneficial in the presence of a BCR:ABL translocation [9].
In addition to characterizing a tumor with respect to the presence or absence of certain variants at the time of initial diagnosis, monitoring the evolution of the mutational profile in the course of a disease is equally important. It allows for the early detection of newly developing, highly aggressive subclones that might be resistant to therapy and can potentially lead to relapse [8] (see Figure 1).
To determine a tumor’s mutational profile and its development over time, two main approaches can be distinguished: bulk and single-cell DNA sequencing (scDNA-seq). In bulk sequencing, small variants are determined by, e.g., targeted or whole-exome sequencing (WES). The cancer cell fraction (CCF) estimated for each variant represents its average abundance across all the analyzed cells. The information on which variants co-occur within each cell is not directly available from the data. Instead, deconvolution has to be performed to decipher the underlying clonal populations and, subsequently, reconstruct clonal evolution. High intra-tumor heterogeneity [10,11] poses a major challenge for this approach, and a unique solution cannot always be obtained. In contrast, single-cell sequencing allows, in theory, the mutation profile of every analyzed cell to be determined precisely. However, technical challenges, e.g., allelic drop-out or amplification bias leading to low sensitivity [12,13], still hamper valid variant calling and the subsequent reconstruction of clonal evolution using this technique.
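The ambiguity of deconvolution can be made concrete with a small example. The following Python sketch is illustrative only: the clone compositions, the variant names, and the helper `variant_ccfs` are hypothetical and not taken from the study. It computes the CCF of each variant as the summed fraction of all clones carrying it and shows that a linear and a branched clone composition can yield identical CCFs at a single time point.

```python
def variant_ccfs(clones):
    """Return the CCF (in percent) of every variant.

    clones: list of (fraction_percent, set_of_variants) tuples, where the
    fractions of all clones are assumed to sum to 100.
    """
    ccf = {}
    for fraction, variants in clones:
        for variant in variants:
            # A variant's CCF is the total fraction of cells that carry it.
            ccf[variant] = ccf.get(variant, 0) + fraction
    return ccf


# Linear evolution: A -> B -> C (C arises in a cell that already carries B).
linear = [(40, {"A"}), (30, {"A", "B"}), (30, {"A", "B", "C"})]

# Branched evolution: B and C arise independently from the founding clone A.
branched = [(10, {"A"}), (60, {"A", "B"}), (30, {"A", "C"})]

print(variant_ccfs(linear))
print(variant_ccfs(branched))
```

Both compositions give a CCF of 100% for A, 60% for B, and 30% for C, so the bulk measurements alone do not reveal which variants co-occur in the same cells.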
Apart from small variants, structural and copy number variants (CNVs) play a major role in clonal evolution. In addition to next-generation sequencing techniques, these variants are commonly detected using microarrays, fluorescence in situ hybridization (FISH), and/or karyotyping. While microarrays, such as SNP arrays, only allow for a rough estimation of CCFs on the bulk level, FISH and karyotyping provide data on the single-cell level. However, the resolution is considerably lower compared to scDNA-seq (100–200 interphase nuclei for FISH, 10–25 metaphases for karyotyping). Nevertheless, the valid reconstruction of clonal evolution is only possible if the data on both small and large variants are integrated and jointly evaluated [14,15].
As clonal evolution analysis strives to characterize cancer development over time, samples are usually collected at several time points over the course of the disease. Solid tumors additionally offer the option of collecting multi-regional samples, within one tumor and/or at different sites (e.g., primary tumor, lymph node, and metastasis).
A plethora of approaches that reconstruct clonal evolution fully automatically is available. In a previous study, we assessed the performance of tools for the analysis of bulk sequencing data, considering two sets of well-characterized real data from patients with MDS [16] and BL [8]. Our analysis indicated that the performance of the currently available approaches does not warrant their safe usage in research or clinical routine. The reasons for these observations, however, remained unclear. The data allowed us neither to identify the characteristics that were primarily responsible for the unreliable performance of the evaluated tools nor to develop countermeasures [17].
To the best of our knowledge, a systematic evaluation of the tools reconstructing clonal evolution, exploring their strengths and weaknesses in relation to, e.g., the number of clones or time points, has not been performed yet. To provide a means for systematic evaluation, we developed clevRsim, a simulation approach for clonal evolution in R. clevRsim has been designed to simulate single-nucleotide variants (SNVs) in bulk sequencing data as well as (overlapping) CNVs on the basis of a user-definable clonal evolution pattern. The simulated data can then be used as input for tools performing variant clustering and clonal evolution tree reconstruction. Considering different levels of difficulty, we simulated 88 data sets with 10 patients each. On these data, we performed a detailed systematic evaluation of nine tools for variant clustering and four tools for clonal evolution tree reconstruction that were previously identified by systematic search.
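As a rough illustration of what simulating an SNV in bulk sequencing data involves, the following Python sketch draws alternative read counts from a binomial model. It is not clevRsim's implementation (clevRsim is written in R); the function names, the assumption of a heterozygous SNV in a copy-neutral diploid region, and the binomial read model are simplifications made for this example.

```python
import random


def simulate_snv_reads(ccf, coverage, purity=1.0, rng=None):
    """Simulate (alt, ref) read counts for one SNV.

    Assumes a heterozygous SNV in a diploid region, so the expected
    variant allele frequency (VAF) is purity * ccf / 2.
    """
    rng = rng or random.Random()
    vaf = purity * ccf / 2.0
    # Each read independently carries the variant with probability `vaf`.
    alt = sum(rng.random() < vaf for _ in range(coverage))
    return alt, coverage - alt


def estimate_ccf(alt, ref, purity=1.0):
    """Invert the model above: CCF = 2 * observed VAF / purity."""
    return 2.0 * alt / ((alt + ref) * purity)


rng = random.Random(1)  # fixed seed so that the example is reproducible
for coverage in (50, 500, 5000):
    alt, ref = simulate_snv_reads(ccf=0.6, coverage=coverage, rng=rng)
    print(coverage, alt, ref, round(estimate_ccf(alt, ref), 3))
```

The estimated CCF scatters around the true value of 0.6, with the scatter shrinking as coverage increases.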
4. Discussion
In this work, we performed a systematic evaluation of tools for variant clustering and clonal evolution tree reconstruction. To our knowledge, a comparably thorough analysis has not been performed yet. With the help of simulated data sets generated with our novel approach clevRsim, we analyzed the influence of a varying number of time points, clones, SNVs, coverage, CNVs, and the underlying model of evolution.
Our results indicate that a high number of clones poses a major challenge for all variant clustering tools. This observation is in line with our previous results on real data: the correct clustering could not be automatically determined for any patient characterized by >3 clones [17]. Regarding clinical practice, this observation implies a considerable challenge. Several studies, e.g., [10], have reported a high level of intra-tumor heterogeneity in many cancers. Such heterogeneity is expected to hamper valid automatic variant clustering, leading to an underestimated number of clusters.
For a varying number of time points, diverse tool-dependent effects could be observed. A majority of tools showed poor performance in the presence of only one time point. With respect to clinical practice, this observation may be similarly challenging. At the beginning of a disease, only a limited amount of data is available. However, in order to use clonal evolution analysis to support treatment decisions, valid results are needed even in the presence of only one time point.
A clear negative influence on the tools’ performance could be observed for low coverage. This challenge may be overcome by increasingly performing deep targeted sequencing or, in case information on commonly mutated hotspot genes is lacking, high-coverage WES. However, it was also observed that a very high coverage of 2000x again leads to performance degradation. The detailed analysis of the results showed that, for all tools, variants of one cluster are often split among multiple clusters. It appears likely that the reason for this observation is the “over-interpretation” of small differences in the estimated CCFs. Unexpectedly, however, we could also identify some clusters being merged, in part despite differing CCFs. The reason for this observation remains unclear, as higher coverage is expected to lead to higher confidence in the estimated CCFs.
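The relation between coverage and the confidence in an estimated CCF can be approximated with the binomial standard error. The Python sketch below is a simplified illustration under stated assumptions (heterozygous SNV, diploid region, purity of 1, binomial read sampling); the function name `ccf_standard_error` is hypothetical, and none of the evaluated tools is claimed to use this calculation.

```python
import math


def ccf_standard_error(ccf, coverage):
    """Approximate standard error of a CCF estimate.

    With VAF = ccf / 2 and binomially distributed alternative reads,
    SE(VAF) = sqrt(VAF * (1 - VAF) / coverage) and SE(CCF) = 2 * SE(VAF).
    """
    vaf = ccf / 2.0
    return 2.0 * math.sqrt(vaf * (1.0 - vaf) / coverage)


for coverage in (50, 200, 2000):
    print(coverage, round(ccf_standard_error(0.6, coverage), 3))
```

Under these assumptions, the standard error for a CCF of 0.6 is roughly 0.13 at 50x and roughly 0.02 at 2000x. At very high coverage, sampling noise alone therefore becomes small enough that CCF differences of a few percentage points may appear distinct, which is one way in which small differences could come to be over-interpreted.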
A particularly negative impact on variant clustering could be observed for deletions or duplications overlapping SNVs. For all tools, dissimilarity between the proposed and the true clustering was considerably increased. Several issues underlie this observation: First, a majority of the tools only allow for defining the underlying copy number for every SNV and not the fraction of cells characterized by a certain copy number. Second, the scenario of overlap is not considered despite having a major influence on the genotype and requiring an adjusted calculation of the CCF (formula provided in [14]). Third, none of the considered tools is capable of jointly clustering CNVs and SNVs when they do not overlap.
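To illustrate why an overlapping CNV changes the CCF calculation, the following Python sketch applies a commonly used simplified relation between VAF, purity, copy number, and multiplicity. It is not necessarily the formula given in [14]; the function name and parameters are hypothetical, and the relation assumes that all tumor cells share the same copy number at the locus, which is exactly the restriction described in the first issue above.

```python
def ccf_from_vaf(vaf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Estimate the CCF of an SNV from its VAF.

    Assumed model:
        VAF = purity * CCF * multiplicity
              / (purity * tumor_cn + (1 - purity) * normal_cn)
    where `multiplicity` is the number of mutated copies per tumor cell
    and `tumor_cn` is the total copy number in (all) tumor cells.
    """
    total_copies = purity * tumor_cn + (1.0 - purity) * normal_cn
    return vaf * total_copies / (purity * multiplicity)


vaf = 0.25
# Copy-neutral diploid locus: CCF = 2 * VAF.
print(ccf_from_vaf(vaf, purity=1.0, tumor_cn=2))
# Duplication of the non-mutated allele (3 copies, 1 of them mutated).
print(ccf_from_vaf(vaf, purity=1.0, tumor_cn=3))
# Deletion of the non-mutated allele (1 copy, which is mutated).
print(ccf_from_vaf(vaf, purity=1.0, tumor_cn=1))
```

For the same VAF of 0.25, the estimated CCF is 0.5 without a CNV, 0.75 with the duplication, and 0.25 with the deletion. Ignoring an overlapping CNV can thus shift a variant's CCF considerably and move it into a different cluster.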
Reconstructing clonal evolution trees on the basis of correctly clustered variants is not free from flaws either. The presence of a high number of clones as well as branched independent evolution increased the dissimilarity between the correct and the reported trees. When introducing their tool, SCHISM, the authors [36] observed similar results: in the presence of many clones, SCHISM only succeeded in reporting the correct tree under ideal conditions (many time points, high coverage, and high purity of the samples). However, it should be mentioned that [36] only determined whether the correct tree was among all trees reported by SCHISM. For 55% of all the simulated patients in our study, the tool reported an ambiguous consensus tree.
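Why additional time points can resolve ambiguous trees may be illustrated with a constraint commonly used in tree reconstruction, often called the sum rule: at every time point, the CCFs of a cluster's children must not exceed the CCF of the cluster itself. The Python sketch below enumerates all trees satisfying this rule by brute force. It is not SCHISM's algorithm; the cluster names, the CCF values, and the helper functions are hypothetical.

```python
from itertools import product


def reaches_root(node, parent, root):
    """Check that following parent links from `node` ends at `root`."""
    seen = set()
    while node != root:
        if node in seen:  # cycle or self-parent: not a valid tree
            return False
        seen.add(node)
        node = parent[node]
    return True


def consistent_trees(ccf, root):
    """Enumerate parent assignments that satisfy the sum rule.

    ccf: dict mapping cluster -> list of CCFs (percent), one per time point.
    Returns a list of dicts mapping each non-root cluster to its parent.
    """
    clusters = list(ccf)
    others = [c for c in clusters if c != root]
    n_time_points = len(ccf[root])
    trees = []
    for parents in product(clusters, repeat=len(others)):
        parent = dict(zip(others, parents))
        if not all(reaches_root(c, parent, root) for c in others):
            continue
        # Sum rule: children's CCFs must fit into the parent's CCF.
        if all(
            sum(ccf[c][t] for c in others if parent[c] == node) <= ccf[node][t]
            for node in clusters
            for t in range(n_time_points)
        ):
            trees.append(parent)
    return trees


one_time_point = {"A": [100], "B": [60], "C": [30]}
two_time_points = {"A": [100, 100], "B": [60, 80], "C": [30, 40]}

print(consistent_trees(one_time_point, root="A"))
print(consistent_trees(two_time_points, root="A"))
```

With one time point, both the branched tree (B and C descending from A) and the linear tree (C descending from B) satisfy the rule. The second time point excludes the branched tree, because the CCFs of B and C then sum to 120%, leaving only the linear tree.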
Furthermore, all the results we observed on tree reconstruction in our study were generated in an optimized setting, with correctly clustered variants as input. Commonly, however, the output of variant clustering tools is provided as input, which, as our results showed, does not necessarily match the correct clustering. Thus, on the way from a list of variants to a properly reconstructed clonal evolution, errors are expected to multiply.
Our simulation tool clevRsim is expected to be a valuable resource for developing new, optimized algorithms. Randomly simulating phylogeny and variants, our approach is able to generate an unlimited number of valid clonal evolution patterns. As future work, we plan to extend clevRsim to consider user-defined input on clinical parameters, which is expected to modulate the birth and death rates of simulated cell populations. For example, an applied therapy is likely to lead to a general decrease in CCFs and the eradication of some cell populations, while relapse is probably preceded by several new (branching) clones. Furthermore, the simulation of driver vs. passenger mutations and their effect on CCF development will be explored. As a result, the future updated version of clevRsim will allow for generating more specific clonal evolution patterns automatically, without the manual fine-tuning that is needed at present.
One may ask why we focused our evaluation of bioinformatics approaches for reconstructing clonal evolution on bulk sequencing, a relatively old technique compared to scDNA-seq. Despite bearing the potential to revolutionize the field of clonal evolution, scDNA-seq is currently still characterized by major technical and practical challenges. Allelic drop-outs and amplification bias cause low sensitivity for detecting variants. Sequencing costs are high, which hampers its use in large study cohorts or even clinical routine. Additionally, the lack of suitable material precludes the retrospective analysis of samples. As a consequence, we expect that the reconstruction of clonal evolution on the basis of bulk sequencing data will continue to play a major role within the next decade.