Detection of Microsatellite Instability: State of the Art and Future Applications in Circulating Tumour DNA (ctDNA)

Simple Summary
Microsatellite instability (MSI) is a molecular fingerprint for defects in the mismatch repair system (dMMR) and is associated with a higher risk of cancer. MSI/dMMR tumours are characterized by the accumulation of mutations throughout the genome, particularly in microsatellite (MS) DNA repeat sequences. MSI stands as a major biomarker for familial cancer risk assessment, cancer prognosis, and therapeutic choices. Standard-of-care classification of MSI/dMMR tumours is most frequently achieved using immunohistochemistry or PCR-based assays targeting a set of five MS regions. However, novel molecular methods based on tumour tissue or plasma samples have been developed and could become part of future MSI testing. Here, we provide insights into these emerging approaches and discuss their advantages and limitations.

Abstract
Microsatellite instability (MSI) is a molecular scar resulting from a defective mismatch repair system (dMMR) and is associated with various malignancies. MSI tumours are characterized by the accumulation of mutations throughout the genome, particularly clustered in highly repetitive microsatellite (MS) regions. MSI/dMMR status is routinely assessed in solid tumours for the initial screening of Lynch syndrome, the evaluation of cancer prognosis, and treatment decision-making. Currently, pentaplex PCR-based methods and MMR immunohistochemistry on tumour tissue samples are the standard diagnostic methods for MSI/dMMR. Other tissue methods such as next-generation sequencing or real-time PCR-based systems have emerged and represent viable alternatives to standard MSI testing in specific settings. The evolution of the standard molecular techniques has offered the opportunity to extend MSI determination to liquid biopsy based on the analysis of cell-free DNA (cfDNA) in plasma. This review aims to synthesize the standard and emerging techniques used on tumour tissue samples for MSI/dMMR determination. We also provide insights into the MSI molecular techniques compatible with liquid biopsy and the potential clinical consequences for patients with solid cancers.


Introduction
The mismatch repair (MMR) machinery is an evolutionarily conserved system responsible for the preservation of DNA homeostasis in cells [1]. The MMR system is composed of the hMutS heterodimers (MSH2/MSH6 and MSH2/MSH3 complexes) that ensure the specific recognition of mispaired nucleotides and small insertion-deletions generated during the replication or recombination processes [2] or resulting from DNA damage [3]. These complexes initiate the repair and recruit the hMutL heterodimers (hMLH1/hPMS2, hMLH1/hPMS1 and hMLH1/hMLH3) to catalyze the mispair excision and error-free resynthesis using the remaining DNA strand as a template for the DNA polymerase [4]. Genetic and epigenetic inactivations of MMR genes cause MMR defects (dMMR) and

MSI Standard Reference Testing
Immunohistochemistry (IHC) and PCR-based assays performed on tumour tissue samples (biopsy or surgical resection samples) are the gold standard for the determination of MSI/dMMR status (Table 1). IHC shows high sensitivity and specificity in the most frequent LS-associated cancers when exploring the expression of either the four main MMR proteins (MLH1, MSH2, MSH6, PMS2) or only the MSH6/PMS2 proteins [34,35]. Based on these results, dMMR, as indicated by IHC, leads to reflex testing for LS [36]. Yet, MSI-PCR approaches based on the PCR amplification of MS regions followed by capillary electrophoresis (PCR-CE) have proved to be a reliable alternative to the historical IHC-based testing. Notably, MSI-PCR allows the retrieval of cases with preanalytical issues or indeterminate IHC results, as well as IHC false-negative results due to rare non-truncating missense mutations in the MMR genes associated with intact antigenicity [37]. The clinical interest of MSI-PCR has particularly increased with the development of pentaplex PCR panels integrating five mononucleotide and quasi-monomorphic MS regions (including BAT-25 and BAT-26), which improved the assay sensitivity and obviated the need for analyzing paired normal tissue for MS length comparison [38][39][40]. The Promega® MSI analysis system is one of the most investigated commercial PCR assays; it uses five mononucleotide markers (BAT-25, BAT-26, MONO-27, NR-21 and NR-24) for MSI typing and two pentanucleotide regions (PentaC and PentaD) to detect sample mix-up or contamination [41]. Currently, IHC and MSI-PCR methods are nearly equally proficient in identifying MSI/dMMR colorectal, endometrial, and probably gastric cancers [42][43][44][45][46]. The concordance between the two techniques appears less certain in other cancers and may differ depending on the tumour tissue of origin [47,48].
In practice, these techniques are complementary since IHC identifies the cause of dMMR while PCR explores its consequences on MS length at the nucleic acid level. Because the theranostic implications of MSI are substantial and neither approach is able to detect all MSI/dMMR cases, it has been proposed that both assays should be applied simultaneously or sequentially in order to avoid MSI misinterpretation [36,49,50].

Other MSI Approaches on Tumour Tissue Samples
Other molecular approaches based on the analysis of tumour tissue samples have recently emerged with the aim of improving sensitivity and specificity compared to conventional MSI/dMMR testing. They may represent valuable alternatives to conventional MSI testing, now or in the future.

Histopathology-Based Approaches
In patients with unresectable or metastatic cancers, biopsy or surgical resection specimens are often difficult to obtain, and cytology of body fluids can be the only sample available for diagnosis. In this context, Jacobi et al. evaluated the feasibility of determining MSI status in cytologic material from patients with colorectal or endometrial cancers [52]. IHC staining was performed on cell block sections prepared from cytologic specimens, and results were concordant in 85% of cases (45/53) with IHC/MSI-PCR results from matched surgical samples. Inconclusive and false-negative cell block results arose in 11% and 4% of cases, respectively, resulting from low tumour cell content, staining in cells indefinite for tumour, staining heterogeneity in tumour cells, or lack of internal control staining. When no surgical specimen is available, cytologic samples could thus represent a promising source of material for dMMR testing. Particular attention is, however, needed for the interpretation of the results.
Moving into the era of universal MSI/dMMR testing, there is a growing need for faster, easier-to-perform, and more affordable approaches than the conventional methods. Artificial-intelligence methods have recently been proposed to directly predict MSI status from routine haematoxylin and eosin-stained slides, which are available for almost all patients with cancer [53][54][55][56][57]. Each tumour image is segmented into thousands of tiles, to which deep learning models assign an MSI score based on different features. For each slide, MSI classification is inferred from the score of the majority of the tiles [54]. Studies support that machine learning algorithms are useful to predict MSI status; however, they require large cohorts for training and validation and are only relevant for cohorts whose patient and sample characteristics are similar to those of the training datasets [53]. The utility of such an approach for forecasting ICI efficacy is still to be demonstrated.
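The slide-level inference step described above can be illustrated with a minimal sketch. It assumes that a trained model has already produced one MSI score between 0 and 1 per tile; the function name classify_slide and the 0.5 tile threshold are hypothetical choices made for illustration and are not taken from the cited studies.

```python
from typing import Sequence


def classify_slide(tile_scores: Sequence[float], tile_threshold: float = 0.5) -> str:
    """Call a slide 'MSI' when a strict majority of its tiles look MSI.

    tile_scores: per-tile MSI scores (assumed to lie in [0, 1]).
    tile_threshold: hypothetical cut-off above which a tile is counted as MSI.
    """
    if not tile_scores:
        raise ValueError("at least one tile score is required")
    # Count tiles whose score reaches the tile-level threshold.
    msi_tiles = sum(1 for score in tile_scores if score >= tile_threshold)
    # A tie is resolved conservatively as MSS.
    return "MSI" if msi_tiles > len(tile_scores) / 2 else "MSS"
```

In practice, published models may aggregate tile scores differently (for example by averaging), so this majority vote is only one possible reading of the procedure.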

PCR-Based Assays
Emerging PCR-based assays differ from the standard reference PCR method by the nature and number of targeted MS markers and/or the read-out strategy of PCR products (Table 2). For example, some groups integrated the analysis of novel long mononucleotide repeats (LMR) besides the traditional pentaplex panel [58,59]. Bacher et al. identified a large number of mononucleotide repeats with increased repeat length (40-60 bp) compared to the MS markers traditionally used for MSI testing [59]. They showed that the MS mutation rate increases exponentially with the number of repeat units. Thus, the analysis of LMR markers enhanced the detection sensitivity of the MSI-PCR assay. The commercial LMR MSI analysis system (Promega®, Madison, WI, USA), which includes four conventional mononucleotide MS markers (BAT-25, BAT-26, MONO-27, and NR-21) and four LMR markers (BAT-52, BAT-56, BAT-59 and BAT-60), notably reached a higher agreement with IHC in colorectal samples compared to the Promega® MSI analysis system [58]. The utility of such a panel still needs to be confirmed for non-colorectal specimens.
Others replaced the capillary electrophoresis (CE) step with either denaturing high-performance liquid chromatography (DHPLC) or high-resolution melting (HRM) to detect PCR products from MS markers [60][61][62]. DHPLC has the advantage of enabling high-throughput determination of MSI and of being free from confounding stutter peaks, a frequent artefact observed with PCR-CE and PCR-HRM that results from DNA slippage during PCR amplification [60,63,64]. The limit of detection for MSI using DHPLC was found to be as low as 1 mutated allele among 100 non-mutated alleles [60].
Others proposed the replacement of the conventional pentaplex panel by a single MS marker that has been found highly informative for MSI in CRC. In fact, the high mutation rate observed in MSI tumours seems to preferentially affect large microsatellite regions, such as the T17 intron repeat of the chaperone Heat Shock Protein 110 (HSP110) gene, upstream of the exon 9 splice acceptor site [65]. Somatic deletions in HSP110 (T17) were reported in almost all MSI CRC and were associated with the expression of a mutant truncated HSP110 protein conferring better prognosis and sensitivity to chemotherapy [66,67]. In this context, the search for mutations in the HSP110 (T17) marker has been proposed as an alternative to conventional MSI testing assays since it provides better sensitivity and equal specificity in CRC [68,69] and requires the analysis of a single quasi-monomorphic marker (Table 2). Allelic alterations in HSP110 (T17) were historically detected using PCR followed by fragment analysis [68,70], but a novel enrichment-based strategy using E-ice-COLD-PCR seems to better detect MSI, with a 20-200-fold gain in sensitivity compared to conventional PCR [71]. The analysis of the HSP110 (T17) region may represent a promising molecular tool for MSI stratification; however, its utility in non-CRC samples is currently unknown and needs to be investigated before its use in clinical practice [72,73].
The Idylla® MSI assay (Biocartis NV, Mechelen, Belgium) is a fully automated PCR-based system that comprises the analysis of seven novel monomorphic homopolymer regions (ACVR2A, BTBD7, DIDO1, MRE11, RYR3, SEC31A, SULF2) in a single-use cartridge with all reagents on board (Table 2) [74]. In colorectal, endometrial, and gastric cancers, the Idylla® system provides a high concordance (>96%) with the standard reference methods and a lower failure rate [75][76][77][78]. Less is known about its performance in other cancers, and further studies are needed to confirm its interest in such cases [79]. The Idylla® system requires minimal hands-on time (<5 min), does not require the analysis of matched normal tissues for MSI interpretation, and gives results in only 2.5 h [78]. Moreover, this easy-to-perform system provides an automated interpretation of the MSI status, which makes it well suited to in-house testing.
We recently evaluated a droplet digital PCR (ddPCR)-based assay (Bio-Rad, Hercules, CA, USA) using the same five MS markers as those previously described for the Promega® MSI analysis system (BAT-25, BAT-26, MONO-27, NR-21, NR-24) (Table 2) [78]. This proof-of-concept study demonstrated a high concordance with conventional methods in endometrial (100% agreement, 15/15) and colorectal cancer samples (100% agreement, 15/15). Another group developed a drop-off droplet digital PCR approach based on the analysis of the BAT-26, ACVR2A, and DEFB105A/B markers in three distinct assays [80]. The drop-off ddPCR strategy has the advantage of screening all variants in a hotspot region. It uses two TaqMan probes targeting the same amplicon: the VIC-labelled reference probe hybridizes to a non-mutated region while the FAM-labelled drop-off probe is complementary to the wild-type (WT) sequence in a frequently mutated region. In the presence of wild-type alleles, a double signal (VIC+/FAM+) is obtained. In the presence of even a single nucleotide mutation at this site, the FAM signal is lost while the VIC signal is maintained (FAM−/VIC+). The limit of detection of the drop-off MSI-ddPCR was shown to be as low as 0.1% mutant allele frequency. This approach appears reliable in ascertaining the MSI phenotype in CRC samples (100% overall agreement) while being less informative in non-colorectal cancer samples (93% overall agreement). Both MSI-ddPCR approaches appear as potentially fast and affordable large-scale tools to screen MSI in multiple samples in one assay. Nevertheless, the analysis of the ddPCR raw data is currently not standardized and hence requires skilled molecular biologists for MSI interpretation.
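The drop-off read-out logic can be summarized in a short sketch. It assumes that each droplet is described by a pair of VIC and FAM fluorescence amplitudes and that fixed amplitude cut-offs separate positive from negative droplets; the function names, the cut-off parameters, and the "fam_only" class are hypothetical and do not reproduce the analysis pipeline of the cited study. The droplet ratio is only an approximation of the mutant allele fraction, valid when few droplets contain more than one template molecule.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify_droplet(vic: float, fam: float, vic_cut: float, fam_cut: float) -> str:
    """Assign a droplet to a cluster from its VIC and FAM amplitudes."""
    vic_pos = vic >= vic_cut
    fam_pos = fam >= fam_cut
    if vic_pos and fam_pos:
        return "wild_type"   # VIC+/FAM+: both probes hybridized
    if vic_pos:
        return "mutant"      # VIC+/FAM-: drop-off probe lost
    if fam_pos:
        return "fam_only"    # unexpected cluster, ignored below
    return "empty"           # no template in the droplet


def mutant_allele_fraction(
    droplets: Iterable[Tuple[float, float]], vic_cut: float, fam_cut: float
) -> float:
    """Approximate mutant fraction as mutant / (mutant + wild-type) droplets."""
    counts = Counter(classify_droplet(v, f, vic_cut, fam_cut) for v, f in droplets)
    informative = counts["wild_type"] + counts["mutant"]
    if informative == 0:
        raise ValueError("no VIC-positive droplets to analyse")
    return counts["mutant"] / informative
```

Because the raw-data analysis is not standardized, the cut-offs would have to be set and validated for each assay and instrument.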

NGS-Based Approaches and Computational Tools for MSI Diagnosis
Next-generation sequencing (NGS) is currently a widely used technology that facilitates personalized cancer therapy by searching for numerous actionable alterations in a single assay. Recently, NGS has been adapted for the focused purpose of MSI testing. Some groups proposed NGS panels that integrate the specific detection of MSI [81][82][83]. As an example, the MSIPlus amplicon-based approach was designed for the detection of hotspot mutations in the KRAS, NRAS, and BRAF genes and of instability in 17 microsatellite loci (including BAT-25, BAT-26, NR-21, NR-24, MONO-27, and HSP110 (T17)) in colorectal cancer samples [81].
The analysis of long mononucleotide repeats (>15 bp in length) has long been preferred for MSI determination considering their high level of instability in tumours [84]. However, such repeats are more prone to PCR and sequencing errors. The selection of short repeats with lengths ranging from 7 to 12 bp has recently been shown to be an alternative for MSI classification based on the allelic distribution of mutant reads [85,86]. Moreover, shorter microsatellites were shown to be more monomorphic than longer ones, making matched normal tissues unnecessary for the analysis. Gallon et al. optimized the technique by using a single-molecule molecular inversion probe (smMIP)-based sequencing approach to detect low-frequency mutant sequences. A panel composed of only six short repeat markers was sufficient to attain 100% accuracy for MSI typing compared to conventional MSI-PCR [85].
The MSI status can also be inferred from whole-genome, whole-exome, gene-targeted, or RNA sequencing data that were not originally generated for MSI diagnosis [62]. During sequencing, numerous MS loci are incidentally captured along with regions of interest and can be easily identified using web-based tools such as MISA (MicroSAtellite identification tool), GMATo (Genome-wide Microsatellite Analyzing Tool), or PolyMorphPredict [26,87,88,89]. Different computational tools have been developed for MSI diagnosis by exploiting existing NGS data (Table 3). Some computational approaches (such as the MSI-Seq Index, the MSIseq or the MSIPred classifier) diagnose MSI by examining the mutation load in all sequences and/or the insertion-deletion burden in microsatellite regions [90][91][92]. Others (such as mSINGS, MSIsensor or MANTIS) compare the distribution of the allele lengths at MS regions between tumour and normal samples for MSI interpretation [93][94][95].
The advantage of NGS methods over PCR-based assays is that they provide the MSI status at the same time as other relevant cancer-related alterations. This one-step approach is of particular interest for cancer types with low MSI frequency for which MSI-specific testing is not systematically performed. The widespread deployment of NGS notably highlighted the fact that MSI is much more prevalent than previously thought [96]. Moreover, NGS approaches can interrogate many more informative MS loci than MSI-PCR assays. Considering that there are significant differences in MSI patterns across cancer types (specific MS loci unstable in a defined tumour location) [96,97], NGS could thus represent a more sensitive option when applied to cancers other than the well-studied colorectal, endometrial, and gastric tumours [98]. The analysis of multiple MS loci by NGS could also represent an option to retrieve samples for which historical techniques gave non-contributive or doubtful results. The use of NGS for MSI typing is, however, limited by its cost, high technical complexity, and long turnaround time, and is likely to be restricted to cases that require comprehensive genomic profiling including both the MSI diagnosis and the genotyping of other genes of interest.
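The second family of tools, which compares allele-length distributions between tumour and normal samples, can be illustrated with a simplified sketch. It does not reproduce the statistics implemented in mSINGS, MSIsensor or MANTIS; it assumes that the repeat length observed in each read has already been extracted for every locus, and it uses a hypothetical per-locus cut-off on the total variation distance between the two normalized distributions.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def length_distribution(lengths: Sequence[int]) -> Dict[int, float]:
    """Normalized histogram of repeat lengths observed at one locus."""
    if not lengths:
        raise ValueError("no reads at this locus")
    counts = Counter(lengths)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def locus_distance(tumour: Sequence[int], normal: Sequence[int]) -> float:
    """Total variation distance: 0 for identical, 1 for disjoint distributions."""
    t = length_distribution(tumour)
    n = length_distribution(normal)
    return 0.5 * sum(abs(t.get(k, 0.0) - n.get(k, 0.0)) for k in set(t) | set(n))


def msi_score(
    loci: Dict[str, Tuple[Sequence[int], Sequence[int]]], locus_cutoff: float = 0.2
) -> float:
    """Fraction of loci whose tumour/normal distance exceeds the cut-off."""
    if not loci:
        raise ValueError("no loci provided")
    unstable = sum(
        1 for tumour, normal in loci.values()
        if locus_distance(tumour, normal) > locus_cutoff
    )
    return unstable / len(loci)
```

A sample would then be called MSI when the score exceeds a validated threshold; both the per-locus cut-off and the sample-level threshold are assumptions that would need calibration on reference samples.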

A Step Forward towards MSI Testing in Liquid Biopsy?
Tumour biopsy represents the reference source of material for MSI determination; however, its clinical use has several limitations. First, tissue material is sometimes inaccessible owing to tumour location, and obtaining it requires an invasive procedure associated with potential surgical complications [104]. Second, it only gives a snapshot of the tumour diversity and may not fully capture the complexity of the disease [105]. In some rare cases of sporadic MSI/dMMR tumours, intra- and inter-tumour heterogeneity may arise as a consequence of a late emergence of MMR defects during tumour development [106]. The characterization of MSI/dMMR status based on a single tissue biopsy could thus lead to MSI misclassification in such cases.
Liquid biopsy (LB), which refers to the analysis of circulating tumour DNA (ctDNA) shed into body fluids (plasma, urine, saliva, etc.) by tumour cells, has emerged in the last few years as a promising surrogate for tumour biopsy. LB is presented as a minimally invasive and easily repeatable tool, which overcomes the issue of spatial and temporal heterogeneity and allows the longitudinal monitoring of the disease through iterative sampling [105]. LB has already been applied to detect various cancer-related alterations, including single nucleotide variations, insertion-deletion events, copy number variations, gene fusions or DNA methylation profiles [107]. Numerous studies have revealed the clinical potential of LB in establishing tumour molecular diagnostics, monitoring the treatment response, assessing minimal residual disease, detecting early tumour relapse, and tracking secondary resistance mechanisms [105]. LB notably revolutionized the management of patients with non-small cell lung cancers following its approval by the European Medicines Agency (EMA) and the FDA for molecular profiling when tumour sampling was not possible or provided non-contributive results [108][109][110]. To date, less is known about its utility in determining MSI status.

Liquid Biopsy Technologies for MSI Detection
Considering the current trend toward tumour molecular diagnostics based on ctDNA analysis and the multiple implications of MSI for cancer management, there is a growing need to develop novel approaches for MSI diagnosis on blood samples (bMSI). One of the major technical challenges for bMSI determination is that it requires highly sensitive methods, owing to the highly fragmented nature of ctDNA and the small fraction of ctDNA among total cfDNA in body fluids (as low as 0.01% in early stages of cancers) [111]. In the last few years, significant progress has been made in improving the resolution of existing tissue-based approaches and adapting them for liquid biopsy purposes.
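A back-of-the-envelope calculation helps to show why such low ctDNA fractions are demanding. The sketch below assumes that one haploid human genome weighs about 3.3 pg, that every tumour-derived copy carries the altered microsatellite allele, and that sampling of molecules follows a Poisson distribution; the 10 ng input used in the example is a hypothetical value chosen for illustration.

```python
import math

# Approximate mass of one haploid human genome, in picograms (assumption).
HAPLOID_GENOME_PG = 3.3


def genome_equivalents(cfdna_ng: float) -> float:
    """Number of haploid genome copies contained in a given cfDNA mass."""
    return cfdna_ng * 1000.0 / HAPLOID_GENOME_PG


def expected_mutant_copies(cfdna_ng: float, tumour_fraction: float) -> float:
    """Expected number of tumour-derived copies of a locus in the input."""
    return genome_equivalents(cfdna_ng) * tumour_fraction


def prob_at_least_one_mutant(cfdna_ng: float, tumour_fraction: float) -> float:
    """Poisson probability that the input holds at least one mutant copy."""
    return 1.0 - math.exp(-expected_mutant_copies(cfdna_ng, tumour_fraction))
```

Under these assumptions, 10 ng of cfDNA at a 0.01% tumour fraction contains on average about 0.3 mutant copies of a given locus, so the chance that even one mutant molecule is present in the tube is roughly one in four, whatever the sensitivity of the downstream assay.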

Methods for Enrichment of MSI Sequences in cfDNA
In cfDNA, altered microsatellite sequences can be missed due to their high contamination with germline DNA molecules and the presence of stutter bands at homopolymer regions when CE or HRM is employed for end-point detection. The enrichment of unstable microsatellites was proposed as an option prior to PCR-based amplification to improve the detection of low-frequency mutant alleles. The NaMe-PrO (nuclease-assisted minor-allele enrichment with probe-overlap) approach consists of overlapping oligonucleotide probes and double-strand-specific nucleases, which specifically eliminate long unaltered homopolymer regions while sparing those harbouring indels. NaMe-PrO combined with CE or HRM attained a limit of detection of 0.01% mutant microsatellite allele frequencies [112,113]. However, this strategy is limited by the number of MS that could be targeted.
In order to efficiently capture the low fraction of ctDNA present in plasma samples, Yu et al. developed an inter-Alu-PCR-NGS approach that combines a fast and easy-to-perform PCR assay with an NGS-based broad molecular profiling [114]. Alu sequences are short interspersed nuclear elements that account for almost 11% of the human genome [115]. Alu elements harbour a 3′ poly(A) tail forming a microsatellite-like sequence of variable length. Alu primers target regions away from the Alu sequence and amplify the homopolymer sequences between two neighbouring Alu elements, enabling the simultaneous enrichment of multiple MS loci. As the primers include NGS adapter sequences, library construction can be performed directly after inter-Alu-PCR. A custom MSI-tracer algorithm finally compares the homopolymer read length between cfDNA from cancer patients and healthy volunteers. Using as little as 0.1-1 ng of cfDNA, inter-Alu-PCR-NGS was able to distinguish plasma DNA from patients with and without microsatellite instability.

ddPCR Assays as a Viable Option for bMSI Determination
Silveira et al. evaluated the analytical performance of the previously described 3-marker (BAT-26, ACVR2A, DEFB105A/B) drop-off ddPCR approach on blood samples from patients with advanced or metastatic CRC and endometrial cancers [80]. Using tissue MSI status as the gold standard, the MSI-ddPCR method attained 100% sensitivity and specificity. Moreover, it provided absolute quantification of the MSI sequences, making this approach compatible with longitudinal ctDNA monitoring.
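Absolute quantification in ddPCR generally relies on Poisson statistics applied to the fraction of positive droplets. The sketch below illustrates that general principle rather than the specific analysis of the cited study; the function names are hypothetical and the default droplet volume of 0.85 nL is an assumed nominal value that depends on the instrument.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Mean number of target copies per droplet, corrected for droplets
    that received more than one copy (Poisson correction)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def copies_per_microlitre(
    positive: int, total: int, droplet_volume_nl: float = 0.85
) -> float:
    """Target concentration in the reaction, in copies per microlitre."""
    # 1 nL = 1e-3 microlitre.
    return copies_per_droplet(positive, total) / (droplet_volume_nl * 1e-3)
```

Applied to the mutant (VIC+/FAM−) droplets of serial plasma samples, such a concentration can be followed over time, which is what makes the read-out suitable for longitudinal monitoring.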

Improvement of Tissue-Based NGS Approaches and Computational Algorithms for bMSI Determination
The widespread use of ctDNA-based NGS approaches is limited by prevailing biological and technological hurdles. The low abundance of ctDNA in body fluids, along with technical artefacts generated during library preparation, amplification, or sequencing, lessens the analytical sensitivity of these methods [116]. In the last few years, advances in NGS technology have been made in order to optimize the detection of low-frequency ctDNA. Such technical improvements were notably employed in recent integrated cfDNA-based pan-cancer NGS approaches designed for bMSI ascertainment [117][118][119] (Table 4). The Guardant360® CDx (Guardant Health, Redwood City, CA, USA) and FoundationOne® Liquid CDx (Foundation Medicine, Cambridge, MA, USA) are commercial FDA-approved blood-based companion diagnostics that have been adapted for specific bMSI determination. The Guardant360® CDx, FoundationOne® Liquid CDx, OncoLBx and Georgiadis approaches all employ hybrid-capture enrichment of target regions and rely on molecular barcoding to help filter false-positive events arising from technical PCR errors. They also integrate in silico error correction approaches in order to reduce the background noise and accurately recover true insertion-deletion events in MS regions. Willis et al. showed that, owing to the fragmented nature of ctDNA, some MS loci traditionally used for tissue-based MSI testing are not suitable for ctDNA-based NGS approaches because of low coverage or background noise in these regions [118]. In the Guardant360® panel, they selected a limited number of highly informative MS loci based on their coverage and noise profiles in order to improve molecular capture and mapping efficiency. Along with bMSI status, the Georgiadis and FoundationOne® Liquid CDx methods offer the possibility to determine blood-based tumour mutation burden (bTMB), a complementary biomarker that helps inform ICI treatment. For all panels described, a minimum of 5-30 ng cfDNA input was required.
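The principle of barcode-based error filtering can be sketched as follows. The example assumes that each read has already been reduced to a pair made of its molecular barcode and the homopolymer length it reports at one MS locus; the function name and the family-size and agreement thresholds are hypothetical and do not correspond to the parameters of any of the commercial assays mentioned above.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def consensus_lengths(
    reads: Iterable[Tuple[str, int]],
    min_family_size: int = 3,
    min_agreement: float = 0.8,
) -> Dict[str, int]:
    """Collapse reads sharing a barcode into one consensus repeat length.

    Reads carrying the same barcode derive from the same original molecule,
    so a length seen in only a minority of them is treated as a PCR or
    sequencing error and discarded.
    """
    families: Dict[str, List[int]] = defaultdict(list)
    for barcode, repeat_length in reads:
        families[barcode].append(repeat_length)

    consensus: Dict[str, int] = {}
    for barcode, lengths in families.items():
        if len(lengths) < min_family_size:
            continue  # too few reads to build a reliable consensus
        length, support = Counter(lengths).most_common(1)[0]
        if support / len(lengths) >= min_agreement:
            consensus[barcode] = length
    return consensus
```

Counting consensus molecules rather than raw reads removes most stutter-like errors introduced after barcoding, at the cost of discarding molecules that were sequenced too few times.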
Of note, some groups implemented specific bMSI computational algorithms for NGS data in the context of liquid biopsy analysis, enabling the detection of very low fractions of ctDNA (0.05-0.5%) (Table 5).

Perspectives of the Applications of Liquid Biopsy in MSI Testing
A good overall agreement has been observed between conventional MSI tissue-based testing and newly developed ctDNA-based approaches [80,117,118]. This suggests that ctDNA-based MSI diagnosis could be performed as part of routine clinical practice to identify patients with a better prognosis who are likely to benefit from ICI, when tissue specimens are unavailable or scarce [117,118,122]. Owing to its minimally invasive nature, liquid biopsy can be serially repeated in order to ensure real-time disease monitoring based on ctDNA kinetics. Changes in bMSI levels during ICI treatment correlated well with those of other ctDNA markers and reliably reflected tumour response to treatment [80]. In a limited subset of patients under ICI treatment, the residual bMSI allele burden was found to be inversely correlated with overall and progression-free survival and allowed an earlier prediction of tumour response compared to conventional radiographic imaging [117]. Longitudinal ctDNA analysis also allowed the detection of somatic MSI acquisition, which can appear during cancer evolution in patients initially diagnosed with MSS tumours [127]. To our knowledge, only a few cases have been shown to acquire an MSI phenotype during the disease course [127,128]; however, this phenomenon may have been underestimated given that most tumours are screened for MSI only at the time of diagnosis in routine practice. The value of such an acquired MSI phenotype for guiding treatment decisions remains unclear and needs to be demonstrated by further studies. A strategy based on the analysis of serial plasma samples, however, dramatically increases the cost of MSI testing. In this context, cost-effectiveness analyses should be performed prior to its implementation in clinical practice.

Conclusions
Determination of MSI status in cancers is of particular clinical importance considering its diagnostic, prognostic, and therapeutic significance. IHC and pentaplex PCR are the standard reference methods for MSI/dMMR determination on tumour tissue specimens. However, other approaches (such as custom NGS approaches or computational algorithms for NGS data, real-time PCR or ddPCR assays using custom MS panels) have emerged and progressively entered into clinical practice. Given the multiple methods currently available, the approach to be used should be chosen considering the cancer type, the preanalytical conditions, the lab resources and technical expertise, and the availability of paired normal tissues.
In the last few years, liquid biopsy has led to a major paradigm shift in oncology, providing a viable surrogate to tissue biopsy for molecular investigations. Like other ctDNA markers, bMSI can be detected in body fluids and helps to predict treatment efficacy and follow disease evolution over time. The use of bMSI-based strategies was initially hampered by a lack of sensitivity; however, recent technological advances in the field have shown potential in reducing background noise and enhancing detection efficiency. Further translational studies are needed to confirm the clinical utility of bMSI-based approaches and delineate their potential applications in routine practice.

Conflicts of Interest:
The authors declare no conflict of interest.