Evolutionary genomic comparisons of chelonid herpesvirus 5 (ChHV5) from fibropapillomatosis-afflicted green ( Chelonia mydas ) , olive ridley ( Lepidochelys olivacea ), and Kemp’s Ridley ( Lepidochelys kempii ) sea turtles.

Simple summary: Our research aims to unravel uncertainties relating to the genetic and viral causes of the debilitating sea turtle disease, fibropapillomatosis, which affects all seven species of sea turtle. This disease is likely caused by an alphaherpesvirus (ChHV5) and an environmental trigger (e.g. pollution). Fibropapillomatosis is characterised by multiple benign tumours which grow on the skin, eyes and internal organs, and is becoming a threat to sea turtle conservation globally. ChHV5 research is crucial to better provide effective management and conservation of turtles from this disease. This study aimed to compare ChHV5 genomes between geographic regions and sea turtle species, and observe how this virus has evolved and changed. ChHV5 genomes harboured differences within and between geographic regions (88-2793 single nucleotide polymorphisms [SNPs] per sequenced genome). Multiple ChHV5 genes were also found to be under varying selective pressures. Phylogenomic and phylogenetic analyses revealed grouping of the virus mostly by geography rather than by species and found differences in ChHV5 genomes between tumours from the same individual. This study leads to way into a phylogenomic approach to ChHV5 research. This study provides the most comprehensive picture to-date of whole-genome inter-species ChHV5 diversity, and provides important baseline ChHV5 genomic data for future comparisons. Abstract : The spreading global sea turtle fibropapillomatosis (FP) epizootic is threatening some of Earth’s ancient reptiles, adding to the plethora of threats faced by these keystone species. Understanding this neoplastic disease, and its likely aetiological pathogen, chelonid alphaherpesvirus 5 (ChHV5), is crucial to understand how the disease impacts sea turtle populations and species and the future trajectory of disease incidence. We generated 20 ChHV5 genomes, from three sea turtle species, to better understand the viral variant diversity and gene evolution of this oncogenic virus. We revealed previously underappreciated genetic diversity within this virus (with an average of 2,035 single nucleotide polymorphisms [SNPs], 1.54% of the ChHV5 genome) and identified genes under the strongest evolutionary pressure. Furthermore, we investigated the phylogeny of ChHV5 at both genome and gene level, confirming the propensity of the virus to be interspecific with related variants able to infect multiple sea turtle species. Finally, we revealed unexpected intra-host diversity, with up to 0.15% of the viral genome varying between ChHV5 genomes isolated from different tumours concurrently arising within the same individual. These findings offer important insights into ChHV5 biology and provide genomic resources for this oncogenic virus.


Introduction
Fibropapillomatosis (FP) is a debilitating neoplastic disease which has been reported in all seven species of sea turtle [1], species of which range from vulnerable to critically endangered [2]. The disease has a global spread but with prevalence in specific populations varying considerably [3][4][5][6][7][8]. First described in the scientific literature in the 1930's [9], FP is globally distributed but prevalence also varies considerably among species [3][4][5][6][7][8]. This disease is most prevalent in green turtles (Chelonia mydas) which also tend to be the most severely afflicted; however, FP has been documented to a lesser extent in all other species [5,7,10,11].
Fibropapillomatosis manifests as multiple tumours that primarily arise from the soft tissues of sea turtles including: cutaneous, ocular and visceral tumours, which can vary in size and distribution [12,13]. These tumours can be severely debilitating; impairing vision, locomotion, feeding, predator evasion and other natural behaviours, and preventing affected turtles from providing their valuable ecosystem services and keystone species functions [1,11,12,14,15]. This disease also afflicts turtles at crucial life-stages, juvenile turtles develop FP following recruitment from the oceanic zone into their neritic foraging areas [16].
Molecular studies have consistently detected ChHV5 presence in turtles with FP, however, FP may also be found in turtles where ChHV5 is not detected [26,39], and ChHV5 is detectable in turtles without FP [40,41]. Recently, a strain of papillomavirus, PV1, was detected in 47% of FP tumours analysed, despite earlier conventional PCR-based approaches and whole genome sequencing failing to detect PV1 [25,26]. This suggests that other oncogenic viruses may contribute to the development of FP. Transmission of ChHV5 likely occurs through direct contact, shedding of virus from viral laden tumours into the environment and through vectors such as marine leeches [25,42,43]. No transmission data currently exists regarding PV1.
Multiple researchers have linked the occurrence of FP with various forms of anthropogenic habitat degradation [44][45][46][47] leading to the current hypothesis that the disease is caused by viral infection in conjunction with environmental cofactors. Such previous data indicate a latent state of this virus which may recrudesce in times of immunological stress, enabling ChHV5 loads to pass an oncogenic threshold [5,7,20,24,25], however, specific co-factors and their role in tumourigenesis have yet to be identified. FP has occurred in isolated regions globally within a relatively short timeframe, with differing geographic variants making it unlikely that recent virulence mutations in the virus independently evolved to drive these outbreaks [14]. "It is far more likely that changes in the environment or ecological factors that affect virus transmission or disease expression explain the recent upsurge in disease prevalence almost simultaneously" and that "these disease outbreaks are likely induced by environmental factors rather than the virus transmitting to new populations or undergoing mutational adaptation" [14].
Phylogenetics has been used to investigate ChHV5 transmission dynamics [14,16], to study the evolution of ChHV5, and phylogenetics has identified a number of regional variants [4,5,14,16,35,43,[47][48][49][50]. These studies showed that the global distribution of CHV5 in sea turtle populations predates the awareness of an FP epizootic in the 1980s and 1990s, suggesting that co-factors contributed to disease emergence [14]. While several ChHV5 variants have been identified, no viral variant has ever been associated with disease severity or outcome. Studies have found that at a local scale, sympatric species of marine turtle can share variants of ChHV5 indicating a strong geographic influence on viral phylogeny [4,14,23,47].
Nevertheless, further classification of variants based on the entire ChHV5 genome, may enhance our understanding of ChHV5 evolution, spread of the virus, and detection and interpretation of emerging mutations [39]. Furthermore, ChHV5 genomic studies may help explain slight differences in disease manifestation in turtles from different regions (e.g. high prevalence of oral tumours in Hawaiian green sea turtles).
To date, global phylogeography of ChHV5 has been explored somewhat. Herbst et al. (2004) [4] identified two major global clades of ChHV5, each with Atlantic and Pacific strains. Further, Patrício et al. (2012) [47] proposed four major clades: eastern Pacific, mid-west Pacific, western Atlantic/eastern Caribbean and Atlantic. Greenblatt et al. (2005) [51] also identified a ChHV5 variant from Puerto Rico, which at the time, did not cluster with any other known ChHV5 variants, but has since been clustered with Gulf of Guinea variants [47]. On a more local scale, distinct variants have been identified in some locations. Florida has four known variants of ChHV5 (known as variants A, B, C and D [variant D only from C. caretta]); as well as Hawaiian variants and more recently, Australian variants identified in Queensland [5,14].
The individual genes frequently used for ChHV5 phylogenetics include, UL18, UL30, glycoprotein B (gB) and F-sial [8,12,47]. Conventional PCR coupled with Sanger sequencing of individual gene fragments has been the predominant technique to date for ChHV5 phylogenetic analysis.
Relying on short individual gene fragments has yielded significant results, but is somewhat restrictive and can lead to a limited picture of the true genetic and phylogenetic diversity amongst ChHV5 variants globally [39].
The first study to construct a large multi-gene sequence of ChHV5 was Herbst et al. (2004) [4], who configured a partial genome 43,843bp in length (genes UL9-30). Currently, the most complete ChHV5 reference genome constructed is 132,233bp long, primarily only lacking repeat regions [52]. Morrison et al. (2018) [39] established large multi-gene sequences to compare ChHV5 gene diversity from eight tumour samples, using short-read Next Generation Sequencing (NGS) of long-read PCR products of 72,828bp in length (roughly 55% of ChHV5's current known genome size) aligned to the ChHV5 reference genome [52]. Morrision et al. (2018) [39] also used a smaller subset of genes (Amplicons IV, V, UL30 and gB) of 6,280bp in length for phylogenetic analysis. Increasingly, Next Generation Sequencing (NGS) approaches are being utilised more widely to study ChHV5 [6,7,20,24,25,39], as this powerful analysis tool can provide comprehensive genomic data of study organisms.
Only one whole-genome phylogenomic study of a chelonian herpesvirus has been conducted to date by Origgi et al. (2015) [53], who used NGS methods to construct and observe the phylogeny of testudinid herpesvirus 3 (TeHV3), a close relative to ChHV5.
To advance our understanding of genome-level ChHV5 diversity across sea turtle species within the eastern USA, we applied NGS-based approaches to 20 novel FP tumour samples collected, from three species of sea turtle. Using these whole-genome data we conducted ChHV5 phylogenomics and investigated ChHV5's genomic diversity and evolution.  Table 1). Samples were obtained as part of separate studies to investigate host and viral dynamics of ChHV5 in tumour samples, for full sampling details please see the respective papers [7,23,25,54], Samples (from both internal and external tumours) were stored in RNA-later (Qiagen) at -80C, or dry at -80C, until extraction. Samples were stored between <1 day and 27 months prior to DNA isolation.
2.2 DNA isolation, library preparation and sequencing from tissue and eDNA samples Sequencing of samples was conducted as part of separate studies to investigate host and viral dynamics of ChHV5 in tumour samples, 13 green sea turtle samples [7,25], 6 Kemp's ridley samples [23] and an Olive ridley sample [54].

Quality control and read trimming
All bioinformatic processing was conducted on the Galaxy platform

Nucleotide and gene diversity analysis
The consensus sequences were used to generate nucleotide diversity data and identify positive and reduced selection processes of each ChHV5 gene (Supplemental Table 3). Each gene was isolated from each of the 21 genome sequences (including reference, which was used for comparison) using extractseq (version 5.0.0) on Galaxy, inputting gene regions and opting to extract each region to a new sequence.
Next a purpose written script was created (deposited in Github: https://github.com/klyetsko/Whitney-SeaTurtle-FP), which first, replaced the header for each gene (to the name of the origin sample as well as gene position in genome) then each gene was separated from each consensus sequence into a new file, resulting in 104 files (one file for each of the known 104 ChHV5 genes) with each file containing the sequences for that gene from all 21 samples (one reference sequence and 20 consensus sequences).
The resulting gene text files were then input into DnaSP (version 5) for ChHV5 gene-by-gene analysis to the reference. To note, for the next step, the reference genome sequences were the first sequence in each file so DnaSP programme can use that for comparison. Each gene file was opened in DnaSP, selecting "DNA divergence between populations" and "polymorphism/divergence data" to obtain the relevant nucleotide diversity statistics including number of polymorphic sites, nucleotide diversity and Tajima's D statistic, which can infer selection pressures under the right demography, for each ChHV5 gene (Supplemental Table 3).

Phylogenetic/phylogenomic analysis
Consensus sequences were input into MEGA X, for phylogenetic analysis.
For phylogenetics/phylogenomics of generated consensus sequences, whole genomes and relevant genes (ChHV5 UL30, at 2019 and 483bp) were isolated using Range Extractor (https://www.bioinformatics.org/sms2/range_extract_dna.html) and were compared with known available gene sequences (of the same length and position) from NCBI database. For phylogenomic analyses, a new alignment build was created, consensus sequences or isolated genes were inserted, and the entire alignment was then exported in Mega format. Phylogenetic/genomic trees were constructed using the Maximum Likelihood method (Tamura-nei model).

Patient "Yucca" whole genome phylogenomics
Samples of seven tumours (three kidney tumours and four external tumours

Results
Twenty novel FP tumour samples from 13 sea turtles were utilised for this study (   Interestingly, the two genes with the highest Tajima's D values were both tegument proteins (F-UL37 and F-UL36), suggesting that diversity in tegument proteins is maintained and beneficial to ChHV5 survival and propagation ( Table 2).
The ChHV5 gene under the strongest positive selection was F-UL10 (Glycoprotein M) ( Table 2), suggesting that the sequence of this gene is too critical to ChHV5 function to allow large amounts of sequence diversity to evolve. F-UL41, otherwise known as virion host shutoff (vhs) protein, also of interest, is known to play a role in evading host innate immunity in other organisms and is highly conserved between alphaherpesviruses [58][59][60].  Fig. 1). However, the specific Tajima's D value and direction (positive or negative) of each gene varied widely between the two studies (Supplemental Fig. 1 (Fig. 1C). While more individual genes (55%) had a negative Tajima's D value (Fig. 1B), the genome-wide comparison includes non-coding genomic regions which may be responsible for the skew towards positive values (Fig. 1C).

Phylogenomics reveals clustering of ChHV5 by geographic trends
In order to make whole-genome phylogenomic comparisons all 20 sequenced samples had genome consensus sequences generated for phylogenomic/genetic analysis against the reference ChHV5 genome. Due to limited available ChHV5 genomic data, aside from our 20 samples [7,23,25,54], the only publicly available ChHV5 genome currently available for comparison is the reference genome [52] which originates from a Hawaiian green turtle (C. mydas).
Therefore, all consensus sequences were compared to this sole reference genome ( Fig. 2A). To analyse phylogenetic relationships between our 20 consensus genome samples and ChHV5 from other geographic regions individual gene fragment approaches were used (Fig. 3A,B). At the whole genome level, all 13 novel green turtle-derived ChHV5 genomes (eastern US) generated as part of this study cluster together, but form a discrete grouping which is distinct from the green turtle-derived ChHV5 reference genome (Hawai'i) ( Fig. 2A). It is interesting there is not any segregation in clustering between ChHV5 genomes from South Carolina and Florida, suggesting a similar transmission source, despite FP only recently being reported in the Carolinas [22]. The next cluster is represented by ChHV5 from two Kemp's ridley individuals, also from the east coast of the USA. They cluster closely with but remain distinct from the eastern US green sea turtle cluster. Similarly, the ChHV5 genome from the olive ridley exists in its own group, and an intermediary between the two larger clusters. The olive ridley individual stranded in Texas (Gulf of Mexico), and so may host a slightly different variant to the ChHV5 genome from the USA east coast. Additionally, the olive ridley individual was partially decomposed upon stranding [54] so the true extent of ChHV5 sequence diversity may not have been recovered due to sample degradation. Samples generally clustered according to geographic location of stranding, with the exception being the four Kemp's ridley samples sequenced with viral enrichment. However, these four samples may group with the reference genome due to limited reads covering many coding genes (see below), despite a high overall genome coverage.

Phylogenetics highlights close relatedness of novel sequences and ChHV5
Florida variants A-C.
As there is only one publicly available comparative whole genome sample (the reference ChHV5 genome), it was not possible to generate a more global ChHV5 genome phylogeny, with broader geographic variants. Such analyses will only become possible as whole genome sequencing becomes more widely applied to ChHV5. Therefore, we next used single gene (UL30; DNA polymerase) phylogenetics (for which more geographically diverse ChHV5 gene-level sequencing data exists) to explore the geographical relationship between ChHV5 sequenced from our samples and those of previous studies. As even within UL30 studies there is a disparity between the number of available sequences depending on fragment length and gene position used, we opted to analyse two sets of UL30 data, one using a longer 2,019 bp gene fragment, but with fewer available sequences, and a second using a shorter 483 bp UL30 fragment, for which more sequences from a wider geographic range are available. Only samples generated from this study with over 50 UL30 aligning reads (Supplemental Table 4 The 2,019 bp UL30 gene fragment revealed three major clades with some smaller sub-groupings. Interestingly, in the largest clade are all 13 samples derived from the novel green turtle samples included in this study (Fig. 3A). All of the novel green turtle ChHV5 UL30 sequences, and the two included novel Kemp's ridley sequences clustered closely with the previously reported Florida ChHV5 variants A-C (which are almost identical [4], although remain distinct. Across the 6,801 bp used to define these variants, there is only 9 bp differences between variants A-C, but 383 bp differences in variant D, while there are 145 bp differences between Florida variant A and the Hawaiian variant sequence [4]. This further confirms that the variants between these two species are very similar [23] (Fig. 3A). The two Hawaiian ChHV5 UL30 sequences (one reference and one variant) form their own distinct clade with the two differing only slightly (0.0005 substitutions per site).
These phylogenetic results concur with previous findings [4,23,54], highlighting that similar variants of ChHV5 can be present in sympatric species.
Analysis of the shorter ChHV5 UL30 partial gene fragment (~483 bp), from generated sequences, concurs with the previous, larger gene fragment analysis, with all novel ChHV5 samples clustering closely with Florida variants A-C (Fig. 3B).
This grouping also includes a ChHV5 sequence originating from the Caribbean  (Fig. 3B), whereas before it was separate and somewhat intermediary between the Hawai'i and Florida variants (Fig. 3A).
This finding also confirms previous analysis of the UL30 short gene fragment of this sample (Frandsen et al. 2021 Fig. 4B) tumour samples clustered with Florida variants A-C at the UL30 gene level (Fig. 3 A,B), there were some differences in the ChHV5 sequences between these Yucca tumour samples. Across the full genome Yucca's tumour ChHV5 derived genomes varied to the reference genome with a nucleotide diversity range of 1.96% to 2.05% (Fig. 4A). When the two most divergent Yucca ChHV5 genomes (samples yuRKTW and yuRKTM) were compared directly with each other, the inter-tumour variance of ChHV5 nucleotide diversity within this individual was 0.15%. There were 198 base-pair differences between the full viral genomes obtained from tumour yuRKTW and tumour yuRKTM. Interestingly, the two samples with the greatest nucleotide variation were both kidney FP tumours.
These two samples (yuRKTW and yuRKTM) are from separate tumours, both of which were present in Yucca's right kidney. yuRKTG was a third tumour from the right kidney, which clustered most closely with yuRKTM (Fig. 4A). 4. Discussion

Nucleotide diversity of ChHV5
The severity of wildlife diseases is becoming increasingly exacerbated by anthropogenic activity, a trend projected to worsen over coming decades [61][62][63][64].
The increasing geographic spread of the sea turtle FP global epizootic threatening sea turtle conservation, is likely driven by rising human-related detrimental changes in the marine environment. Better understanding of FP's likely aetiological agent, ChHV5, is crucial to determine the contribution of viral evolutionary versus environmental factors in driving this spread, and for mitigation and epidemiological modelling efforts. We report here that the number of SNPs in each of our green turtle tumour derived consensus ChHV5 genomes ranged from 2,392 to 2,793 when the eastern USA ChHV5 genomes were compared to the Hawaiian ChHV5 reference genome. A total of 1,001 fixed nucleotide differences in ChHV5 partial genome sequences were obtained from Pacific (Hawaiian) and Atlantic green turtle populations by a previous study [39]. We also identified between 88 and 2,284 SNPs  [20,[23][24][25]. Together, these results suggest that significant ChHV5 genetic variation can occur, and that current variant calling based on short stretches of ChHV5 nucleotides, likely underrepresents the true extent of ChHV5 variants globally. In the genomes assessed here alone, the maximum sequence divergence was 2.1% of the ChHV5 genome, between the Hawaiian ChHV5 reference genome and that of a ChHV5 genome obtained from a green turtle that stranded in Ormond Beach, northeast Florida, USA.
Thus far, no ChHV5 variants have been tightly correlated to FP disease severity [11,16,39,65,66]. While ChHV5 variants may not determine disease severity, with environmental co-factors, and host immune systems potentially playing a larger role [7,11,25], it is also possible that previous studies utilising small cohorts of genes to distinguish variants may have missed some more subtle nucleotide diversity between variants. There is the potential that low level variation (SNPs) within individual genes may correlate with disease severity, a feature not amenable to investigation when only using small fragment sizes for variant calling. For example, slight nucleotide divergence within the human Sars-CoV-2 virus have been linked to increased ease of transmission, enhanced immune evasion, more severe illness and vaccine escape [67]. Next generation sequencing approaches are more amenable to the detection of such variation, and should be more widely applied to ChHV5. While unlikely to rival Sars-CoV-2 variant monitoring programmes, the continued cost reduction of NGS approaches [68][69][70] means that they are likely to become more widespread for non-zoonotic wildlife disease also [62,71]. As revealed here and in Morrison et al. (2018) [39] ChHV5 genomes harbour many SNPs, the potential functional changes of these SNPs remain unknown.
The multi-species ChHV5 whole genomes generated for this study represent an important resource for assessing geographic sequence diversity of  Table 2) [72]. Another gene of interest under balancing selection was F-UL41 (0.56, Table 2), a host shutoff protein, which is key to evading the innate immune system.
Variance in this gene is likely beneficial to ChHV5, perhaps enabling it to evade innate immunity across diverse populations and species. In contrast, the gene with the lowest Tajima's D (positive selection, represents excessive low-frequency SNPs) was UL10 (-1.94, Table 2), otherwise known as glycoprotein M (gM) [73].
This gene is highly conserved between alphaherpesviruses and it participates in multiple phases of the viral life cycle. The majority of ChHV5 genes are practically silenced in FP, however F-UL10, F-UL36 and F-UL41 were among the small subset of ChHV5 genes with active expression in FP tumours, as detected by RNA-seq in a previous study [7]. F-UL10 and F-UL36 expression in particular was detected in multiple FP tissue types (lung, kidney and external tumours), whereas F-UL41 was only detected in external FP tumours [7].
The ChHV5 genes identified here with a higher frequency of maintained allele diversity (balancing selection) (Fig. 1B) may serve as good candidates for future ChHV5 phylogenetic analyses given the high variability between these genes in different ChHV5 genomes.

ChHV5 phylogenomics
To avoid the bias and reduced sensitivity that can occur with single-gene phylogenetics, it is recommended that more studies from diverse geographic locations begin to adopt whole viral genome approaches. As has been highlighted by the ongoing human Covid-19 pandemic, viral genome sequencing can be an efficient and rapid means of assessing global viral diversity and of identifying variants of concern, even in an actively evolving situation [67].
All generated samples were closely related to, but distinct from Florida variants A-C, with Kemp's ridley clustering closest to these variants, followed by all green turtle-derived ChHV5 samples (Fig. 3A), further highlighting ChHV5 interspecies transmission, despite this being considered abnormal for herpesviruses [75]. Such interspecific transmission suggests that host or environmental exposure differences likely drive FP prevalence rates observed between species, rather than solely as a function of species-specific ChHV5 variants.
Differences occurred in phylogeny of some samples between the two versions of the UL30 phylogenetic trees (Fig. 3A,B) ChHV5 samples all clustered together in the same clade (Fig. 3B). This is likely due to the diversity present in the larger fragments being lost, and highlights a potential issues in determining true phylogeny using small segments of DNA rather than whole genomes. Another clear example of changes in phylogeny placement (loss of resolution), based on the selected sequence, can be seen from the olive ridley sample. This sample was distinct from Florida variants A-C at the whole genome and large UL30 gene fragment level, but groups with the Florida variants A-C using the smaller UL30 gene fragment [54].

Within-host viral diversity
We investigated whether a single individual turtle could host multiple variants of ChHV5 simultaneously. Of patient Yucca's seven sequenced FP tumour samples, there was notable variance in the ChHV5 consensus genomes generated.
These ChHV5 genomes differed from the Hawaiian reference genome by between 1.96% to 2.05% (Fig. 4A), with a maximum Yucca inter-tumour ChHV5 diversity of 198 bp (0.15%). This suggests that either differing variants can infect the same individual, or perhaps more likely that nucleotide changes within the ChHV5 genome can arise in viral genomes within a single host individual. Such within-host diversity can be a crucial mechanism in the development of new viral variants [76,77]. All seven consensus ChHV5 genomes generated from Yucca's samples clustered closely with the other green turtle samples from this study (at both the phylogenomic and UL30 phylogenetic level), however Yucca's samples did not for a uniform clade and were interspersed with ChHV5 genomes (and UL30 gene fragments) from other individuals. This suggests that for future studies sequencing ChHV5 from only a single tumour on an individual may be insufficient for finegrained ChHV5 diversity analysis, and may miss intra-individual diversity.

Conclusions
This study greatly increases the ChHV5 genome-level data available for diversity, evolutionary, and phylogenomic comparisons of this sea turtle FP epizootic-associated virus. It also provides evidence across all known ChHV5 protein coding genes (for three sea turtle species), that different genes are under highly variable selective pressures. The study also highlights the underappreciated genetic diversity present across ChHV5 genomes. Finally, this study reveals previously unknown genetic diversity in ChHV5 genomes among different tumours arising concurrently within the same individual.
Supplemental materials: The following are available online at www.mdpi.com/xxx/s1, Figure S1: Tajima's D statistic for each ChHV5 gene across studies, Table S1: Read, alignment and ChHV5 genome coverage information for all twenty samples used in this study, Table S2: ChHV5 consensus genomes, Table S3: Nucleotide diversity statistics for all ChHV5 genes from the pooled 20 novel ChHV5 genomes versus the ChHV5 reference genome, Table S4:  Data availability statement: All raw data and code produced during this study is included in the supplemental data and/or uploaded to Github (code deposited in

Conflict of interest:
The authors declare no conflict of interest.
Supplemental Material (Appendix A)