Systemic Lupus Erythematosus Patients with DNASE1L3·Deficiency Have a Distinctive and Specific Genic Circular DNA Profile in Plasma

Cell-free (cf) extrachromosomal circular DNA (eccDNA) has a potential clinical application as a biomarker. Systemic lupus erythematosus (SLE) is a systemic autoimmune disease with a complex immunological pathogenesis, associated with autoantibody synthesis. A previous study found that SLE patients with deoxyribonuclease 1-like 3 (DNASE1L3) deficiency exhibit changes in the frequency of short and long eccDNA in plasma compared to controls. Here, using the DifCir method for differential analysis of short-read sequenced purified eccDNA data based on the split-read signal of the eccDNA on circulomics data, we show that SLE patients with DNASE1L3 deficiency have a distinctive profile of eccDNA excised by gene regions compared to controls. Moreover, this profile is specific; cf-eccDNA from the top 93 genes is detected in all SLE with DNASE1L3 deficiency samples, and none in the control plasma. The top protein coding gene producing eccDNA-carrying gene fragments is the transcription factor BARX2, which is involved in skeletal muscle morphogenesis and connective tissue development. The top gene ontology terms are ‘positive regulation of torc1 signaling’ and ‘chondrocyte development’. The top Harmonizome terms are ‘lymphopenia’, ‘metabolic syndrome x’, ‘asthma’, ‘cardiovascular system disease‘, ‘leukemia’, and ‘immune system disease’. Here, we show that gene associations of cf-eccDNA can serve as a biomarker in the autoimmune rheumatic diseases.


Introduction
Systemic lupus erythematosus (SLE) is a highly prevalent human autoimmune rheumatic disease characterized by abnormal T cells, overactive B cells, and the production of large amounts of various pathogenic autoantibodies. SLE causes progressive glomerulonephritis, arthritis, and an erythematous rash. Anti-DNA autoantibodies, and specifically anti-double stranded deoxyribonucleic acid (dsDNA) antibodies, are pathogenic in SLE, while cell-free chromatin-associated long DNA fragments are antigens for anti-DNA antibodies. Although it is still under discussion how and who triggers the production of DNA antibodies in SLE, there is growing evidence that failure in the clearance of cell-free DNA (cfDNA) by deoxyribonucleases (DNASES), and in particular by deoxyribonuclease 1-like 3 (DNASE1L3), can lead to the production of anti-DNA antibodies and SLE [1]. CfDNA is released into blood due to cell death, including necrosis, apoptosis, neutrophil extracellular net formation (NETosis), and pyroptosis, as well as by live cells; then, DNA fragmentation factor B (DFFB) cleaves dsDNA at internucleosome linker regions in chromatin into higher molecular weight DNA fragments, and then into oligonucleosomal fragments; and finally, cfDNA fragments are cleared by DNASES, DNASE1 (for <150 bp fragments), and DNASE1L3 (for larger and chromatin associated DNA) in healthy individuals, while in SLE, cfDNA is accumulated. The defective clearance of long fragments of cfDNA in SLE is largely attributed to impaired deoxyribonuclease 1-like 3 (DNASE1L3) [1].
Cell-free (cf) extrachromosomal circular DNA (cf-eccDNA) is a DNA molecule gaining increasing research interest as a biomarker with clinical applications, thanks to improvements in the eccDNA purification methods and the increased generation of circulomics data. Sin et al. [2] studied the effect of DNASE1L3 on the abundance and size distribution of the cf-eccDNA in SLE patients with DNASE1L3 mutations, and concluded that DNASE1L deficiency in humans leads to lengthening of the eccDNA in plasma. Multiple circulomics approaches have focused on discovering markers based on global features of the eccDNA, such as difference in their length and number between control and treatment conditions. Such global differences lack high sensitivity to distinguish between sample groups, and this problem is even more exacerbated in the case of using plasma samples. To improve the sensitivity of the circulomics analysis, we developed a computational method for differential analysis of eccDNA (DifCir) based on quantifying the number of produced per gene eccDNA (PpGCs) [3]. In a case when neither the transcriptomics methods nor the circulomics methods, based on measuring the abundance and frequency distribution of lengths of the eccDNA, were able to detect differences between sedentary and active skeletal muscle tissue of aged males, DifCir was able to detect very distinctive genic eccDNA profiles.
Here, using the DifCir method on the circulomics data of Sin et al. [2], we show that SLE patients with DNASE1L3 loss-of-function mutation have a distinctive profile of cf-eccDNA excised by gene regions compared to healthy controls; thus, we propose that not only abundance and size distributions, but also specific genic cf-eccDNA profiles can be a biomarker in SLE activity. The experimental setup and the computational workflow we designed to find cf-eccDNA biomarkers for SLE with DNASE1L3 loss-of-function are shown in Figure 1. including necrosis, apoptosis, neutrophil extracellular net formation (NETosis), and pyroptosis, as well as by live cells; then, DNA fragmentation factor B (DFFB) cleaves dsDNA at internucleosome linker regions in chromatin into higher molecular weight DNA fragments, and then into oligonucleosomal fragments; and finally, cfDNA fragments are cleared by DNASES, DNASE1 (for <150 bp fragments), and DNASE1L3 (for larger and chromatin associated DNA) in healthy individuals, while in SLE, cfDNA is accumulated. The defective clearance of long fragments of cfDNA in SLE is largely attributed to impaired deoxyribonuclease 1-like 3 (DNASE1L3) [1]. Cell-free (cf) extrachromosomal circular DNA (cf-eccDNA) is a DNA molecule gaining increasing research interest as a biomarker with clinical applications, thanks to improvements in the eccDNA purification methods and the increased generation of circulomics data. Sin et al. [2] studied the effect of DNASE1L3 on the abundance and size distribution of the cf-eccDNA in SLE patients with DNASE1L3 mutations, and concluded that DNASE1L deficiency in humans leads to lengthening of the eccDNA in plasma. Multiple circulomics approaches have focused on discovering markers based on global features of the eccDNA, such as difference in their length and number between control and treatment conditions. Such global differences lack high sensitivity to distinguish between sample groups, and this problem is even more exacerbated in the case of using plasma samples. To improve the sensitivity of the circulomics analysis, we developed a computational method for differential analysis of eccDNA (DifCir) based on quantifying the number of produced per gene eccDNA (PpGCs) [3]. In a case when neither the transcriptomics methods nor the circulomics methods, based on measuring the abundance and frequency distribution of lengths of the eccDNA, were able to detect differences between sedentary and active skeletal muscle tissue of aged males, DifCir was able to detect very distinctive genic eccDNA profiles.
Here, using the DifCir method on the circulomics data of Sin et al. [2], we show that SLE patients with DNASE1L3 loss-of-function mutation have a distinctive profile of cf-eccDNA excised by gene regions compared to healthy controls; thus, we propose that not only abundance and size distributions, but also specific genic cf-eccDNA profiles can be a biomarker in SLE activity. The experimental setup and the computational workflow we designed to find cf-eccDNA biomarkers for SLE with DNASE1L3 loss-of-function are shown in Figure 1.  [2]. Isolation and purification of circular DNA from plasma of SLE and HC [2] and subsequent assembly, annotation, quantification of eccDNA species, quantification of produced per gene circles (PpGCs), and calculation of differentially PpGCs (DPpGCs). Functional enrichment analysis of the DPpGCs performed with Harmonizome and GSEA.  [2]. Isolation and purification of circular DNA from plasma of SLE and HC [2] and subsequent assembly, annotation, quantification of eccDNA species, quantification of produced per gene circles (PpGCs), and calculation of differentially PpGCs (DPpGCs). Functional enrichment analysis of the DPpGCs performed with Harmonizome and GSEA.

Patient Characteristics
Four healthy human participants were recruited from the Department of Chemical Pathology, the Chinese University of Hong Kong (CUHK), with written informed consent. Three human participants with DNASE1L3 mutations were recruited from the Istituto Giannina Gaslini, also with written informed consent. Four blood samples in total were obtained from this patient cohort. Table 1 shows a summary of the patient characteristics based on the description provided in Sin et al. [2].

Data Preprocessing
We used Trimmomatic-0.38 [4] to trim the adaptors from the 2 × 150 bp paired-end read sequenced data from Sin et al. [2].

EccDNA Mapping and Quantification of the Produced per Gene eccDNA (PpGCs)
We mapped the eccDNA using Circle_finder [5], using as arguments the hg38 built of the genome and minNonOverlap between two split reads equal to 10. The mapped reads were ranging from 55% to 100% of the total reads among the 10 samples (Table 2). We filtered out mitochondrial eccDNA and eccDNA with lengths greater than L max = 10 Kb. Next, we merged clusters of circles within a distance smaller than D min = 10 b and added up the split reads of the merged circles. We filtered out the circles with less than JT min = 2 split reads. We annotated the circles using bedtools intersect [6] on the resulting eccDNA bed files with a GencodeHuman38 bed file with the gene coordinates. Then, we unified the circles based on genes, i.e., the split reads of all the eccDNA that carry the same gene or fragment of this gene were added up to obtain unscaled produced per gene eccDNA (PpGC i ) for each gene i. Next, we scaled by gene length, multiplying each PpGC i by a scale factor L Max /L i , where L Max is the length of the longest gene found in the dataset, and L i is the length of the gene i. Finally, we performed data equalization by the log 2 (PpGC + 1) transform of the quantified PpGCs to obtain the final PpGCs. We calculated the average values for each group, HC and SLE. We selected the DPpGCs whose absolute value of difference of mean values between the two groups was less than a selection threshold θ DPpGC = 5-fold change (FC) in log 2 scale. We selected the statistically significant DPpGCs using the Student's t-test, with a significance threshold α DPpGC = 0.01.

Abundance of Unique cf-eccDNA Is Higher in SLE Than in HC
We first checked for the differences in the number and length frequency distributions of the eccDNA in the SLE and HC groups. The cross-chromosomal load of unique eccDNA, up to a size of 10,000 bp, ranges from 109 to 1095 unique eccDNAs (µ ± sem = 552 ± 155) in HC, and 1401 to 47,738 eccDNAs (µ ± sem = 17,212 ± 10,416) in SLE ( Figure 2A) (where µ is the mean and sem is the standard error of the mean). The Wilcoxon rank sum test indicates that the number of eccDNAs in HC is statistically significantly smaller than the number of eccDNAs in SLE with DNASE1L3 deficiency, at a 1% significance level with a p-value of 0.00476, i.e., showing a 0.0321-fold decrease in the eccDNA under the analyzed conditions. unified the circles based on genes, i.e., the split reads of all the eccDNA that carry the same gene or fragment of this gene were added up to obtain unscaled produced per gene eccDNA (PpGCi) for each gene i. Next, we scaled by gene length, multiplying each PpGCi by a scale factor LMax/Li, where LMax is the length of the longest gene found in the dataset, and Li is the length of the gene i. Finally, we performed data equalization by the log2(PpGC + 1) transform of the quantified PpGCs to obtain the final PpGCs. We calculated the average values for each group, HC and SLE. We selected the DPpGCs whose absolute value of difference of mean values between the two groups was less than a selection threshold θDPpGC = 5-fold change (FC) in log2 scale. We selected the statistically significant DPpGCs using the Student's t-test, with a significance threshold αDPpGC = 0.01.

Abundance of Unique cf-eccDNA Is Higher in SLE Than in HC
We first checked for the differences in the number and length frequency distributions of the eccDNA in the SLE and HC groups. The cross-chromosomal load of unique eccDNA, up to a size of 10,000 bp, ranges from 109 to 1095 unique eccDNAs (µ ± sem = 552 ± 155) in HC, and 1401 to 47,738 eccDNAs (µ ± sem = 17,212 ± 10,416) in SLE ( Figure  2A) (where µ is the mean and sem is the standard error of the mean). The Wilcoxon rank sum test indicates that the number of eccDNAs in HC is statistically significantly smaller than the number of eccDNAs in SLE with DNASE1L3 deficiency, at a 1% significance level with a p-value of 0.00476, i.e., showing a 0.0321-fold decrease in the eccDNA under the analyzed conditions.

Cf-eccDNA Length Distribution Is Distinct in SLE Compared to HC
The comparison of the smoothed length distribution profiles of SLE and HC shows lower peak SLE eccDNA clusters compared to the HC for lengths of approx. 200 bp and 350 bp, and a flip to higher peak SLE eccDNA clusters for lengths greater than 650 bp ( Figure 2B). These results agree with the eccDNA size distribution analysis in Sin et al. [2]. The left, center, and right vertexes mark the µσ, µ, and µ + σ of the number of eccDNA length distribution in each of the of the groups of samples, respectively. The lower, center, and upper vertexes mark the µ − σ, µ, and µ + σ of the eccDNA length distribution, respectively. µ and σ are the mean and the standard deviation of the distributions, respectively. Each sample position in the length versus number eccDNA space is represented by a filled blue or red circle for the HC and SLE samples, respectively, with the exception of the SLE sample with 47,738 eccDNA that is located outside the map. Cumulative distribution of the lengths of the eccDNA in the range from (D) 0 to 10 3 bp. (E) 0 to 10 4 bp. The HC and SLE samples are depicted in blue and red, respectively. In the (B,D,E) panels, the shadowed blue and red regions cover the standard deviation of the enrichment across the HC and SLE groups, respectively.

Cf-eccDNA Length Distribution Is Distinct in SLE Compared to HC
The comparison of the smoothed length distribution profiles of SLE and HC shows lower peak SLE eccDNA clusters compared to the HC for lengths of approx. 200 bp and 350 bp, and a flip to higher peak SLE eccDNA clusters for lengths greater than 650 bp ( Figure 2B). These results agree with the eccDNA size distribution analysis in Sin et al. [2]. Interestingly, lupus has been found to have a stronger mononucleosome length footprint than healthy controls, and even all of the studied cancer types in cfDNA [7]. Chan et al. [8] observed an increased frequency of long DNA molecules of above 250 bp and an increased amount of short plasma DNA molecules below 120 bp in lupus patients with DNASE1L3 disease-associated variants compared to healthy controls, but lower frequencies in the length range in-between. The rhomboid plots of the distributions of the lengths of the eccDNA versus their numbers of eccDNA show how the samples are separated between HC and SLE due to the higher numbers of eccDNA in SLE ( Figure 2C). The cumulative eccDNA length curve shows the faster increase of longer eccDNA in SLE compared to HC after a length of 400 bp ( Figure 2D). The majority of eccDNA has a length of less than 10,000 bp ( Figure 2E).  Figure 3A). We observed a similar distribution of the number of PpGCs ( Figure 3B) to that of the total number of unique cf-eccDNAs in all of the samples (Figure 2A). Additionally, the number of genic cf-eccDNAs (PpGC) constitutes a large proportion of the number of cf-eccDNAs for each sample. The pairwise scatter plot of SLE versus HC ( Figure 3C) shows the existence of a difference in genic eccDNA between the two groups, and the volcano plot ( Figure 3D) shows that this difference is statistically significant (significance level α DPpGC = 0.01). We identified 267 statistically significant up-DPpGCs in the SLE group compared to the HC group, and no up-DPpGCs in the HC compared to SLE (Figures 3E and 4). Noteworthy, the top 93 up-DPpGCs in the SLE are specific for SLE and are concurrently present in all SLE samples, but not in any of the HC samples ( Figure 4). We performed a whole-gene length analysis and found that the eccDNA excised from the genes up-producing eccDNA in SLE are carrying only gene fragments and no complete genes. The loci of the top-ranked up-DPpGCs, together with the read coverage, are shown in the track plots in Figure 5 for the top four up-DPpGCs in SLE; the 93 specific up-DPpGCs present in all SLE samples are shown in Figure S1.

Differential Analysis Based on Split Reads Identifies Distinctive Genic cf-eccDNA Profile of SLE
First, we checked the proportion of produced per gene circles (PpGCs) in the HC and SLE samples to confirm the possibility of performing differential analysis based on PpGCs (Figure 3) to detect differentially PpGCs (DPpGCs) between HC and SLE. Even though the cf-eccDNAs are less abundant in the HC samples, their dispersion is approximately equally distributed for each sample ( Figure 3A). We observed a similar distribution of the number of PpGCs ( Figure 3B) to that of the total number of unique cf-eccDNAs in all of the samples (Figure 2A). Additionally, the number of genic cf-eccDNAs (PpGC) constitutes a large proportion of the number of cf-eccDNAs for each sample. The pairwise scatter plot of SLE versus HC ( Figure 3C) shows the existence of a difference in genic eccDNA between the two groups, and the volcano plot ( Figure 3D) shows that this difference is statistically significant (significance level αDPpGC = 0.01). We identified 267 statistically significant up-DPpGCs in the SLE group compared to the HC group, and no up-DPpGCs in the HC compared to SLE (Figures 3E and 4). Noteworthy, the top 93 up-DPpGCs in the SLE are specific for SLE and are concurrently present in all SLE samples, but not in any of the HC samples ( Figure 4). We performed a whole-gene length analysis and found that the eccDNA excised from the genes up-producing eccDNA in SLE are carrying only gene fragments and no complete genes. The loci of the top-ranked up-DPpGCs, together with the read coverage, are shown in the track plots in Figure 5 for the top four up-DPpGCs in SLE; the 93 specific up-DPpGCs present in all SLE samples are shown in Figure S1.

Gene Set Enrichment Analysis of up-DPpGCs in SLE Discloses Lupus Related Genes
We performed gene ontology (GO) enrichment analysis using the Gene Set Enrichment Analysis (GSEA) sets [9] for systematic functional association of the genes related to the up-DPpGCs. The top enriched ontology term of the up-DPpGCs in SLE is 'positive regulation of torc1 signaling' due to the detection of the up-DPpGCs in the genes CLEC16A, GPR137C, KLHL22, and WAC ( Figure 6). The nutrient-responsive mammalian target of rapamycin TOR complex 1 (mTORC1) is a master regulator of cell growth, an important bridge between nutrition and metabolism, and impacts diverse processes involved in the promotion of autoimmunity in SLE [10,11]. It participates in the proliferation and differentiation of immune cells, affects the secretion of inflammatory cytokines, and alters cell autophagy levels. Upstream factors that drive increased mTOR activity in SLE are increased reactive oxygen intermediates (ROI) and nitric oxide, decreased levels of the reduced form of glutathione (GSH), and elevated mitochondrial potential. The mTOR pathway is involved in metabolic processes that predispose cells to die by necrosis and release oxidized DNA, which can have proinflammatory consequences [10]. Both upstream factors and mTORC1/2 are molecular targets mTOR inhibitors in SLE patients [11]. C-type lectin domain containing 16 A, CLEC16A, is an immunity-associated gene that modulates thymic epithelial cell autophagy and alters T cell selection, with its silencing in mice protecting against autoimmunity [12]; its variants are associated with susceptibility to SLE [13], and it has reduced expression in peripheral leukocytes in SLE [14]. G proteincoupled receptor 137C, GPR137C, is an enzyme involved in the biosynthesis of gangliosides, functioning as antigens or receptors by recognizing lectins at the cell surface, and by modulating the charge density at the membrane surface. KLHL22 plays a conserved role to mediate the activation of mTORC1 and downstream events in mammals and nematodes, and has been found to promote tumorigenesis and aging through activating amino acid-Cells 2023, 12, 1061 8 of 17 dependent mTORC1 signalling [15]. WAC promotes mTORC1 activity by acting as an adaptor for the proper assembly of the TTT-Pontin/Reptin complex, as studied by David-Morrison et al. [16].    As SLE is an autoimmune disorder of the connective tissue, the other top GSEA term is 'chondrocyte development', represented by the genes CHST11, COL11A, EXT1, and TGFBR2 ( Figure 6). CHST11, which attaches sulfates to the 4-position of unsulfated chondroitin, the major component of cartilage, was found to be highly expressed in cutaneous lupus erythematosus [17]. Collagen alpha-1(XI) chain COL11A polymorphisms have been associated with SLE [18]. Exostosin 1, EXT1, is known to regulate chondrogenesis by modulation of WNT signaling [19]. EXT1 and EXT2 are the primary antigens for a subset of autoimmune diseases, including lupus [20]. They are primary antibodies for some types of membranous glomerulonephritis (MN) [21], with EXT1 positivity of kidney biopsies associated with better kidney function [22]. Transforming growth factor β receptor 2, TGFBR2, together with TGFBR1, binds TGFβ and regulates the transcription of genes related to immunosuppression [23].
Numerous other significant enriched GSEA terms are related to the skeletal system, e.g., 'bone growth', 'embryonic skeletal system development', 'cranial skeletal system development', 'brachydactyly', 'bowing of the legs', and terms related to the connective tissues are 'connective tissue development' and 'cartilage development'. A member gene of some of the enriched terms related to cartilage and skeletal system morphogenesis is our top SLE up-DPpGCs gene BARX2. A SNP related to lupus nephritis (LN) has been found close to BARX2 [24].
Another top and specific SLE up-DPpGCs gene that is part of the skeletal-and cartilagerelated enriched GSEA terms is the transcription factor PAX5, a master regulator of B cell development and leukemogenesis [25]. Abnormalities in B cells play pivotal roles in the pathogenesis of SLE and LN, and dysregulation of B cell transcription factors, cytokines, and B cell-T cell interaction can result in aberrant B cell maturation and autoantibody production, with these immunological abnormalities leading to perturbations in circulating and infiltrating B cells in SLE and LN patients [26].
An intriguing GO biological process term among the top GSEA enrichment terms is 'regulation of signal transduction by p53 class mediator', represented by the genes BDKRB2, C16orf72, HIPK2, and SPRED2, where the bradykinin receptor B2, BDKRB2, is a G-protein coupled receptor involved in the regulation of cardiovascular and renal functions, as well as inflammation, and is a target for the tumor suppressor p53-mediated transcriptional activation [27]. BdkrB2 inactivation results in the upregulation of checkpoint kinase 1 (Chk1) levels and potentiating phosphorylation of p53 on Ser23 by Chk1, an essential step in the pathway leading to renal dysgenesis [28]. Another member of this ontology is an unannotated gene, C16ORF72, renamed by Benslimane et al. [29] as TAPR1 (telomere attrition and p53 response 1), who found that TAPR1 buffers against the deleterious consequences of telomere erosion or DNA damage by constraining p53. Activation of p53mediated gene transcription is a critical cellular response to DNA damage, and involves a post-translational phosphorylation-acetylation cascade of p53. Puca et al. [30] uncovered a role for HIPK2 in activating p53 apoptotic transcription through regulating the balance between p53 acetylation and deacetylation, by stimulating on one hand corecruitment of p300 and p53Lys382 on apoptotic promoters, and on the other hand by inhibiting Sirt1 deacetylase activity. HIPK2 is elevated in rheumatic arthritis tissues, and inhibition of its expression inhibits chondrocyte apoptosis [31]. Additionally, among the enriched terms is 'positive regulation of intracellular signal transduction'.
The list of enriched GSEA terms includes 'dna methylation or demethylation', represented by the FTO, MORC1, PIWIL4, and TET3 genes ( Figure 6). Multiple genetic variants associated with SLE have been found near and in TET3, pointing to the role of DNA demethylation in SLE [32]. Animal studies suggest that Tet2-and Tet3-mediated chromatin modification participated in the repression of CD86 on self-reactive B cells, a mechanism that may contribute to autoimmunity prevention. Tet2-and Tet3-deficient B cells led to hyperactivation of B cells, autoantibody production, and lupus-like disease in affected mice [33].

Overlap between SLE up-DPpGCs and GWAS Catalogue
We checked for the intersection of the list of the SLE with DNASE1L3 deficiency up-DPpGCs and the gene variants in the GWAS catalog [41] of terms related with SLE and glomerulonephritis. For 'childhood onset systemic lupus erythematosus' (2 genes in GWAS) we found one overlap for XKR6 [42]. For 'systemic lupus erythematosus' (449 genes in GWAS) we found 9 in our set of up-DPpGCs, namely, CLEC16A, IL18RAP, LMNTD1, PINX1, PVT1, SLC1A7, SPRED2, TET3, and a different one from the childhood onset risk locus in XKR6 [43]. Additionally, for 'response to cyclophosphamide in systemic lupus erythematosus with lupus nephritis' (12 genes in GWAS), we found two overlap genes, BARX2 and HIPK2. For 'interferon alpha levels in systemic lupus erythematosus' (5 genes in GWAS), there is one overlap gene, ANKRD44, found to function as a negative regulator in the type I IFN signaling pathway, with decreased expression of ANKRD44 constituting one of the mechanisms for the dysregulated IFN response in SLE [44]. The production of interferon-α (IFN-α) is a key feature of the innate immune system in SLE, with IFN-α levels increased in SLE patients during disease flares. For 'acute glomerulonephritis' (1 gene in GWAS) we found OSBPL10 in our up-DPpGCs in SLE. We did not expand the check to autoimmune risk loci, though some of the up-DPpGCs are associated with such sites, e.g., IKZF1 [43]. Another SLE up-DPpGC is UNC13A, a common risk locus for both amyotrophic lateral sclerosis and immune disorders [45].

Whole Genes in cf-eccDNA in SLE and HC
We found that the up-DPpGCs in SLE versus HC carry only gene fragments. However, we checked whether any cf-eccDNA in SLE and HC carries complete genes. We found that the cf-eccDNA from some samples, mainly from SLE, carry whole genes ( Figure 7A). We checked the identity of these wholes genes for the recurrent cases of genes in at least two samples and we found that these correspond to short genes, mainly from the immune system, the T cell receptor α joining 48 (TRAJ48), and the five members of the immunoglobulin light κ chain family (IGKJ) ( Figure 7B).

Discussion
We have identified a set of genes releasing specifically cf-eccDNA in SLE patients with DNASE1L3 deficiency caused by mutations. We showed that the differential analysis of the genic eccDNA, based on the split read signal of the eccDNA, confidently

Discussion
We have identified a set of genes releasing specifically cf-eccDNA in SLE patients with DNASE1L3 deficiency caused by mutations. We showed that the differential analysis of the genic eccDNA, based on the split read signal of the eccDNA, confidently distinguishes SLE from HC, with functional associations of the identified eccDNA-producing genic sites. A large proportion of the cf-eccDNA in SLE and HC is generated by genic regions (Figures 2A and 3B), and the genic information in the eccDNA in SLE and studies of the nucleases on cf-eccDNA has not been exploited so far [2]. The eccDNA profile of SLE with DNASE1L3 deficiency uncovered here is extremely specific compared to other genic eccDNA profiles identified so far in the skeletal muscle of sedentary and physically active aged men [3]. This is remarkable in the sense that here we analyzed circulomics data from plasma, and the eccDNA originates from different sites of the body. Additionally, enrichment of eccDNA is more difficult for plasma samples compared to tissue samples. One can argue that the clear eccDNA profile here in SLE is due to the DNASE1L3 deficiency and the better clearance of the eccDNA in the HC, though we have shown ( Figure 3A) that the HC samples contain enough diverse genic eccDNA. We argue the idea of random release of eccDNA into the circulation with the demonstrated possibility of disease-related genic sites undergoing repair processes, and concordantly releasing eccDNA. Among the specific top up-DPpGCs in SLE with DNASE1L3 deficiency is the lymphocyte antigen 86, LY86, that is upregulated in prenephritic kidneys, and fluctuates with remission and progression towards relapse [46]. Interestingly, two of the patients (three SLE samples) had glomerulonephritis, but one, patient three, still had normal renal function. The up-DPpGCs carry gene fragments and, interestingly, circ-IQGAP2 has been found to be a noninvasive biomarker of primary Sjögren's syndrome, another rheumatic immune disorder [47]. IQ-GAP2, coding a protein required for the glomerular filtration barrier [48], is also specific and present in all samples of SLE up-DPpGC. These are just a few examples of how the SLE up-DPpGCs might not only reflect the actual pathology, but might also predict future disease progression. To confirm that, more samples and follow-up in time would be required.

Conclusions
Cell-free extrachromosomal circular DNA (cf-eccDNA) is more stable than cell-free linear DNA that, analyzed in conjunction with our computational tool for detection of genes with differential abundance of eccDNA (DifCir) from circulomics data, shows that cf-eccDNA is a promising and possibly early biomarker in the autoimmune rheumatic diseases.