The Effects of Selenium Supplementation in the Treatment of Autoimmune Thyroiditis: An Overview of Systematic Reviews

Objective: The available evidence on selenium supplementation in the treatment of autoimmune thyroiditis (AIT) was inconclusive. This research serves to assess the effects of selenium supplementation in the treatment of AIT. Methods: Online databases including PubMed, Web of Science, Embase, and the Cochrane Library were searched from inception to 10 June 2022. The AMSTAR-2 tool was used to assess the methodological quality of included studies. The information on the randomized controlled trials of the included studies was extracted and synthesized. The GRADE system was used to assess the certainty of evidence. Results: A total of 6 systematic reviews with 75 RCTs were included. Only one study was rated as high quality. The meta-analysis showed that in the levothyroxine (LT4)-treated population, thyroid peroxidase antibody (TPO-Ab) levels decreased significantly in the selenium group at 3 months (SMD = −0.53, 95% CI: [−0.89, −0.17], p < 0.05, very low certainty) and 6 months (SMD = −1.95, 95% CI: [−3.17, −0.74], p < 0.05, very low certainty) and that thyroglobulin antibody (Tg-Ab) levels were not decreased. In the non-LT4-treated population, TPO-Ab levels decreased significantly in the selenium group at 3 and 6 months and did not decrease at 12 months. Tg-Ab levels decreased significantly in the selenium group at 3 and 6 months and did not decrease at 12 months. The adverse effects reported in the selenium group were not significantly different from those in the control group, and the certainty of evidence was low. Conclusion: Although selenium supplementation might reduce TPO-Ab levels at 3 and 6 months and Tg-Ab levels at 3 and 6 months in the non-LT4-treated population, this was based on a low certainty of evidence.


Introduction
Autoimmune thyroiditis (AIT) is a chronic autoimmune disease in which human thyroid tissue serves as an antigen. AIT involves the production of autoantibodies such as thyroid peroxidase antibody (TPO-Ab) and thyroglobulin antibody (Tg-Ab), which may trigger cellular and antibody-mediated immune processes that lead to the destruction of thyroid cells [1]. Clinical manifestations include goiter, pharyngeal discomfort, neck compression, and dysphagia [2]. AIT includes Hashimoto's thyroiditis (HT), Graves' disease (GD), and other diseases. Lymphocyte infiltration in HT can gradually destroy follicular cells and lead to hypothyroidism [3]. AIT affects about 5% of the general population and its incidence rate in women is about 4-10 times that in men; the incidence rate increases [...]
[...] selenite, selenium yeast, and other forms, with unlimited dosage. (3) The main outcomes included TPO-Ab levels and/or Tg-Ab levels. Exclusion criteria included the following: (1) duplicate reports; (2) studies with insufficient data; (3) thyroid disease during pregnancy; (4) thyroid-associated ophthalmopathy; (5) studies not in English or Chinese.

Search Methods
Online databases including PubMed, Web of Science, Embase, and the Cochrane Library were searched from inception to 10 June 2022, and the search was repeated before the final data analysis. MeSH terms and keywords were combined in the search strategy. The main search strategy was as follows (taking the PubMed search strategy as an example): ("thyroiditis, autoimmune" [MeSH] OR "graves disease" [MeSH] OR "autoimmune thyroiditis" OR "AIT" OR "ATD" OR "hashimoto thyroiditis" OR "HT" OR "hashimoto disease" OR "painless thyroiditis" OR "graves disease" OR "GD" OR "lymphocytic thyroiditis" OR "hyperthyroidism") AND ("selenium" [MeSH] OR "selenium compounds" [MeSH] OR "organoselenium compounds" [MeSH] OR "selen*" OR "Se" OR "ebselen") AND ("systematic reviews as Topic" [MeSH] OR "Meta-Analysis as Topic" [MeSH] OR "meta analys*" OR "systematic revie*" OR "metaanalys*").

Study Selection and Data Extraction
Two reviewers independently evaluated whether the articles met the inclusion criteria. Any disagreements were discussed with a third reviewer and resolved by consensus. This process was documented in the PRISMA flowchart [17].
For the included studies, the two reviewers used the pre-designed extraction table to independently extract the data from each study and crosscheck them. The data extraction content included information such as the first author, year of publication, sample size, research object, intervention measures, main report outcomes, and main results. At the same time, our study extracted the data from RCTs included in the SRs, including the first author, year of publication, sample size, intervention measures, TPO-Ab and Tg-Ab levels of baseline and treatment endpoint, adverse effects, age, gender, etc.

Assessment of Methodological Quality
We used A MeaSurement Tool to Assess systematic Reviews 2 (AMSTAR-2) [18] to evaluate the methodological quality of the SRs. AMSTAR-2 includes 16 evaluation items, of which seven are critical items; "Y" represented conformity, "N" represented non-conformity, "PY" represented partial conformity, and "NP" represented not applicable because no meta-analysis was conducted. The quality assessment was completed with an online tool (https://amstar.ca/Amstar_Checklist.php, accessed on 2 September 2022), and the overall quality of each study was generated automatically after the assessment was completed. Each study was rated as high, moderate, low, or critically low quality. Two reviewers independently evaluated the studies, and any discrepancies were discussed with a third reviewer and resolved by consensus.
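The overall rating described above can be illustrated with a short sketch. It follows the rating scheme published with AMSTAR-2 (more than one critical flaw gives "critically low", exactly one gives "low", more than one non-critical weakness gives "moderate", otherwise "high"). The function name, the example review, and the choice to count only "N" answers as flaws are assumptions made for illustration; the online tool used in this study may treat "PY" answers differently.

```python
# Critical AMSTAR-2 items as listed in the AMSTAR-2 guidance.
CRITICAL_ITEMS = {2, 4, 7, 9, 11, 13, 15}


def amstar2_overall(answers):
    """Return the overall confidence rating for one systematic review.

    `answers` maps an item number (1-16) to "Y", "PY", "N", or "NP".
    Assumption: only "N" counts as a flaw; "PY" and "NP" do not.
    """
    flaws = [item for item, answer in answers.items() if answer == "N"]
    critical = sum(1 for item in flaws if item in CRITICAL_ITEMS)
    non_critical = len(flaws) - critical
    if critical > 1:
        return "critically low"
    if critical == 1:
        return "low"
    if non_critical > 1:
        return "moderate"
    return "high"


# Hypothetical review: no protocol (item 2), no list of excluded
# studies (item 7), and no funding sources reported (item 10).
example = {item: "Y" for item in range(1, 17)}
example.update({2: "N", 7: "N", 10: "N"})
print(amstar2_overall(example))
```

Under this rule a single missing critical item is enough to cap a review at "low", which is why few reviews reach a high rating.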

Certainty Assessment
We used the grades of recommendation, assessment, development, and evaluation (GRADE) system to assess the certainty of evidence and constructed a summary of findings table. The evaluation content included five factors: risk of bias, inconsistency, indirectness, imprecision, and publication bias [19]. Among them, the Cochrane Collaboration risk of bias tool (CCRBT) was used for risk of bias assessment.

Data Synthesis and Analysis
We performed a descriptive analysis of the included SRs. The reference management software EndNote 20 was used to import, screen, and manage records and remove duplicates, and Excel 2021 was used to design data extraction tables and carry out the descriptive statistics. For the RCTs included in the SRs, STATA 17 was used for data synthesis. The standardized mean difference (SMD) was used for continuous variables because the trials used various measurement scales to measure the same outcomes, and the relative risk (RR) was used for binary variables. Heterogeneity was assessed using the Q-test and the I² statistic. p < 0.05 and I² > 50% were taken to indicate significant heterogeneity, in which case a random-effects model was used; otherwise, a fixed-effect model was selected. Subgroup analysis was conducted based on the type of intervention and trial duration. Leave-one-out sensitivity analysis was conducted to assess the impact of each study on the pooled effect size [20] and thereby test the robustness of the meta-analysis. Egger's test was used to evaluate publication bias.
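The model-selection rule above can be sketched in a few lines of Python. This is an illustration only, not the STATA 17 analysis used in the study. It assumes Cohen's d as the SMD and the DerSimonian-Laird estimator for the between-study variance, neither of which is specified in the text, and all function names and input numbers are hypothetical.

```python
import math


def smd_with_variance(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (group 1 minus group 2) and its approximate variance."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    var = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return d, var


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2
    # Start from Q(1, h) or Q(1/2, h), then apply
    # Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1).
    a, q = (1.0, math.exp(-h)) if df % 2 == 0 else (0.5, math.erfc(math.sqrt(h)))
    while a < df / 2 - 1e-9:
        q += h**a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q


def pool(effects, variances):
    """Inverse-variance pooling; model chosen by the Q-test and I^2 rule."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    p_het = chi2_sf(q, df)
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    model, tau2 = "fixed", 0.0
    if p_het < 0.05 and i2 > 50:
        model = "random"
        c = sum(w) - sum(wi**2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)  # DerSimonian-Laird estimate (assumed)
        w = [1 / (v + tau2) for v in variances]
    est = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return {"model": model, "estimate": est,
            "ci": (est - 1.96 * se, est + 1.96 * se),
            "Q": q, "p_het": p_het, "I2": i2, "tau2": tau2}


# Hypothetical SMDs and variances for three trials.
heterogeneous = pool([-0.2, -0.9, -1.6], [0.04, 0.05, 0.06])
homogeneous = pool([-0.5, -0.5, -0.5], [0.04, 0.05, 0.06])
print(heterogeneous["model"], homogeneous["model"])
```

Because the rule requires both p < 0.05 and I² > 50%, a data set with a high I² but a non-significant Q-test (common with few trials) would still be pooled with the fixed-effect model in this sketch.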

Study Selection
A total of 104 relevant records were identified, and 71 records remained after removing duplicates. A total of 56 records were excluded by reading the title and abstract. The remaining 15 records were further screened by reading the full text, of which six were ineligible for study design, two were ineligible for outcomes, and one was a repeated record. Finally, six studies were included in the overview of reviews [2,13,15,21–23]. The flow diagram of the literature screening is displayed in Figure 1.

Study Characteristics
The included studies were published between 2010 and 2021 and included RCTs published until November 2020. Eight outcomes were reported: TPO-Ab, Tg-Ab, thyroid-stimulating hormone (TSH), free triiodothyronine (FT3), free thyroxine (FT4), mood/wellbeing, immunomodulatory effects, and adverse effects. Five of the six studies conducted a meta-analysis. The CCRBT was used in four studies, the Jadad scale was used in one study, and no quality assessment tool was used in one study. The essential characteristics of the SRs are shown in Table 1.

Table 1. Essential characteristics of the SRs (rows as recovered; cells separated by semicolons).
[...]; Jadad scale; Selenium supplementation was related to a significant decrease in TPO-Ab levels at 6 and 12 months; at the same time, the Tg-Ab levels could decrease at 12 months. After selenium supplementation, patients had an increased probability of improved mood without obvious adverse events.
Toulis 2010 [22]; 6(339); HT; Selenium alone or in combination with LT4; Placebo alone or in combination with LT4; 1 6 7 8; None; Selenium supplementation was linked to a significant reduction in TPO-Ab levels at 3 months, as well as an improvement in mood and/or general wellbeing.
[...]; CCRBT; In the LT4-treated population, TPO-Ab and Tg-Ab levels in the selenium group decreased after 3, 6, and 12 months, and 12 months, respectively; in the non-LT4-treated population, TPO-Ab levels in the selenium group decreased after 3 and 6 months and Tg-Ab levels decreased after 3 months. Based on the current evidence, there was insufficient justification for the new use of selenium supplementation in the treatment of AIT.

Assessment of Methodological Quality
The AMSTAR-2 results showed one study of high quality, one study of low quality, and four studies of critically low quality. The compliance status of each item is shown in Figure 2. The main critical flaws were the absence of a protocol specified before the start of the review and the lack of a list of excluded studies with the reasons for their exclusion. The main non-critical weakness was that the sources of funding for the studies included in the review were not reported. The AMSTAR-2 results for each SR are shown in Table S1.

Characteristics and Risk of Bias of RCTs
A total of 75 RCTs were included in the six SRs. After excluding duplicate and ineligible trials, 23 RCTs were included for data synthesis. The detailed characteristics are documented in Table S2. Three trials had two intervention groups and two control groups and were divided into trials A and B based on whether the patients received LT4 treatment. One trial reported outcome data separately for groups with restored and non-restored thyroid function, which were combined into one [20]. The 23 RCTs included a total of 2292 patients, 89.7% of whom were women.
For random sequence generation, 19 trials reported the use of randomization, of which 13 did not report the specific randomization method and were therefore assessed as unclear risk of bias; four trials did not report the use of randomization and were assessed as high risk of bias. Only one trial described allocation concealment and was assessed as low risk of bias. A total of 12 trials did not specify the blinding of participants and personnel and were assessed as unclear risk of bias. Because most outcomes were objective antibody levels, which were unlikely to be affected by a lack of blinding, 19 trials were assessed as low risk of bias in the blinding of outcome assessment. Three trials were assessed as unclear risk of bias for incomplete outcome data due to insufficient information. For selective reporting, one trial was assessed as high risk of bias due to differences between the design and the results. For eight trials, there was not enough information to assess whether any other bias affected the results, so they were assessed as unclear risk of bias (Figure 3). Figure S1 shows the risk of bias for each RCT.

Change in TPO-Ab Levels
[...] (Figure 4b). The certainty in the evidence was low at 3 and 6 months and very low at 12 months (Table 2). The sensitivity analysis showed that the association between selenium supplementation and TPO-Ab levels at 12 months in the non-LT4-treated group was fragile. When Nacamulli 2010 [28] was excluded, TPO-Ab levels decreased significantly and I² reduced to 0%. This was because the outcome data for this study were presented as the median and 95% CI, and the calculated mean and SD might not be reliable.
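The leave-one-out check reported above can be sketched as follows. This is a minimal illustration, assuming DerSimonian-Laird random-effects pooling; the trial labels and effect sizes are hypothetical and are not the data of this study. The last trial is deliberately set apart from the others to mimic a single study driving the heterogeneity.

```python
import math


def dl_pool(effects, variances):
    """DerSimonian-Laird pooling; returns (estimate, ci_low, ci_high, I2)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return est, est - 1.96 * se, est + 1.96 * se, i2


def leave_one_out(labels, effects, variances):
    """Re-pool the data once per trial, each time omitting that trial."""
    results = {}
    for i, label in enumerate(labels):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results[label] = dl_pool(rest_effects, rest_variances)
    return results


# Hypothetical SMDs: three concordant trials and one outlier.
labels = ["Trial A", "Trial B", "Trial C", "Trial D"]
effects = [-0.60, -0.50, -0.55, 0.90]
variances = [0.04, 0.05, 0.04, 0.05]

full = dl_pool(effects, variances)
loo = leave_one_out(labels, effects, variances)
for label, (est, low, high, i2) in loo.items():
    print(f"omit {label}: SMD {est:.2f} [{low:.2f}, {high:.2f}], I2 {i2:.0f}%")
```

A result is usually called fragile when omitting a single trial changes whether the confidence interval includes zero, as happens for the outlier in this toy data set.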

Change in Tg-Ab Levels
Eight trials, including 577 patients, evaluated changes in Tg-Ab levels in the LT4-treated population. The meta-analysis results showed that Tg-Ab levels in the selenium group did not decrease significantly at 3 and 6 months (p > 0.05) (Figure 5a). The certainty in the evidence was low at 3 months and very low at 6 months (Table 2). Two trials randomized and stratified the patients according to baseline TPO-Ab levels, so their baseline Tg-Ab levels were not comparable and they were excluded from the meta-analysis [31,33]. The association between selenium supplementation and Tg-Ab levels at 3 months in the LT4-treated population was also fragile. When Zhang 2013 [36] was excluded, the Tg-Ab levels were significantly reduced and I² fell to 39%, which might be due to the lack of specificity of Tg-Ab in HT.

Adverse Effects
Eight trials, including 669 patients, evaluated the adverse effects. The most common adverse effect was gastric discomfort (10 in the selenium group and one in the control group); other adverse effects included hair loss (one in the selenium group and one in the control group), headache (one in the selenium group), skin rash (one in the selenium group), and hyperthyroidism (two in the control group). No serious adverse effects were observed. There was no statistically significant difference in the risk of adverse effects between the selenium and control groups (RR = 2.39, 95% CI: [0.93, 6.11]; p > 0.05) (Figure 6). The certainty in the evidence was low (Table 2). Three RCTs reported glucose or HbA1c [26,27,34], and the results showed no significant differences in glucose or HbA1c concentrations between the selenium and control groups.
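For readers less familiar with how a relative risk and its confidence interval are read, the sketch below computes both for a single two-arm trial using the usual log-scale standard error. The function name and the event counts are hypothetical; the per-arm sample sizes of the included trials are not given in the text, so the pooled RR of 2.39 is not reproduced here.

```python
import math


def relative_risk(events_t, n_t, events_c, n_c, z=1.96):
    """Relative risk with a log-scale 95% CI for one two-arm trial.

    Assumes at least one event in each arm (no continuity correction).
    """
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high


# Hypothetical trial: 6 of 100 treated and 3 of 100 control patients
# report an adverse effect.
rr, low, high = relative_risk(6, 100, 3, 100)
print(f"RR = {rr:.2f}, 95% CI [{low:.2f}, {high:.2f}]")

# Consistency check on the reported interval: on the log scale the
# point estimate sits midway between the limits, so it should be close
# to the geometric mean of 0.93 and 6.11.
print(round(math.sqrt(0.93 * 6.11), 2))
```

An interval that includes 1, such as the reported [0.93, 6.11], is compatible with no difference between groups, but its width also shows that a clinically relevant increase in risk cannot be ruled out.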

Figure 6. Difference in adverse effects between the selenium group and the control group. The black blocks represent the effect sizes of individual studies, the red dashed line represents the combined effect size, and the blue diamond represents the 95% CI of the combined effect size [25–27,30,31,35,41].

Publication Bias
Egger's test indicated no publication bias in the TPO-Ab levels at 3 months in the LT4-treated population (p = 0.755). The publication bias of the other results could not be evaluated because their meta-analyses did not include more than 10 trials.
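Egger's test regresses each trial's standardized effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests small-study asymmetry. The sketch below computes the intercept and its t statistic with ordinary least squares. The function name and the eleven data points are hypothetical, and the p-value, which would come from a t distribution with k − 2 degrees of freedom, is left out to keep the example free of third-party libraries.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, standard error, t statistic, degrees of freedom)."""
    y = [e / s for e, s in zip(effects, std_errors)]  # standardized effects
    x = [1 / s for s in std_errors]                   # precisions
    k = len(y)
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = residual_ss / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + mean_x**2 / sxx))
    return intercept, se_intercept, intercept / se_intercept, k - 2


# Hypothetical SMDs and standard errors for 11 trials.
effects = [-0.50, -0.42, -0.61, -0.55, -0.38, -0.70,
           -0.47, -0.58, -0.52, -0.44, -0.63]
std_errors = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22,
              0.25, 0.28, 0.30, 0.33, 0.35]

intercept, se, t_stat, df = egger_intercept(effects, std_errors)
print(f"intercept = {intercept:.3f}, t = {t_stat:.2f}, df = {df}")
```

With few trials the intercept is estimated imprecisely, which is consistent with applying the test only to meta-analyses of more than 10 trials.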

Summary of Findings
The GRADE system was used to grade the certainty of evidence for each outcome and a summary of findings table was constructed (Table 2). The outcomes of TPO-Ab and Tg-Ab levels were both downgraded for indirectness, as they are surrogate markers for clinical efficacy or disease progression [47].

Discussion
Our study included SRs before 10 June 2022 and the RCTs included in SRs were published until November 2020. Of the six included studies, two were published in the last 5 years, which showed that AIT was receiving increasing attention and there was more evidence for selenium supplementation to treat AIT. However, different evidence led to different conclusions. Therefore, this study conducted an overview of the reviews on selenium supplementation for treating AIT to evaluate the quality of the existing studies and further summarize the current evidence, providing a reference for clinical practice.
According to the findings of AMSTAR-2, the methodological quality of the SRs included in this study was not good, with only one study of high quality, one study of low quality, and four studies of critically low quality. As for critical items, in Item 2, only two studies specified the research protocol before the SRs began. The registration of the SR before the research can not only reduce the risk of bias and improve the report quality but also save research resources [48]. Other SR authors could determine whether the research was repeated by searching the registration platform. In Item 4, five studies partially met the requirements of the comprehensive literature search strategy but did not fully meet the requirements due to the lack of supplemented retrieval by reviewing the reference list from the studies found, the lack of searching relevant gray literature, the lack of a complete search strategy, and other reasons. Other researchers might not be able to reproduce the search results, reducing the reliability of the results. In Item 7, five studies failed to provide a complete list of excluded studies, which might cause omission when screening the literature. For non-critical items, five studies in Item 10 did not provide the funding source for the included studies, which might ignore some risks of bias. The above were the main items that affected the methodological quality included in this study; future research should focus on the quality of methodology [49].
In addition, we extracted and synthesized the data from the RCTs included in the SRs. The CCRBT showed that the design of the current RCTs of selenium therapy in AIT still had room for improvement. A total of 19 trials reported the use of randomization, but 13 did not specify the randomization method. Only one trial described allocation concealment. This also led to downgrading for risk of bias in the GRADE system. In the future, RCTs can refer to the CCRBT to improve the quality of research.
According to previous experience, receiving LT4 treatment might affect the outcome, so this study divided the patients into two groups according to whether they received LT4 treatment for data synthesis [21]. The results showed that in the LT4-treated population, the TPO-Ab levels in the selenium group decreased at 3 and 6 months but the Tg-Ab levels did not decrease. In the non-LT4-treated population, the TPO-Ab levels in the selenium group decreased at 3 and 6 months and did not decrease at 12 months. The Tg-Ab levels decreased at 3 and 6 months and did not decrease at 12 months. The decreases in TPO-Ab levels in the LT4-treated population and the non-LT4-treated population were consistent with the meta-analysis results of Qiu et al. 2020 [23]. The insignificant decrease in Tg-Ab levels might be due to the lack of specificity of Tg-Ab in HT [36]. It is worth noting that two associations with selenium supplementation were not robust: TPO-Ab levels at 12 months in the non-LT4-treated population and Tg-Ab levels at 3 months in the LT4-treated population.
The selenium treatment and control groups showed no significant differences in adverse effects, and no serious adverse effects were observed. This showed that selenium supplementation was a safe and effective treatment for AIT. This result was consistent with the results of Fan et al. 2014 [15] but inconsistent with the results of Qiu et al. 2020 [23]. These two studies were based on two and five trials, respectively, whereas our study was based on more RCTs, so the results might be more reliable. Most trials used a selenium supplementation dose of 200 µg/day, and 200 µg of selenomethionine was equivalent to 80 µg of selenium [24]. According to relevant studies, the recommended intake of selenium is 55 µg/day and the tolerable upper limit is 400 µg/day [50]. The selenium dose used in the trials was therefore reasonable. Some studies have shown that high levels of selenium intake are associated with the development of diabetes [51]. In the studies we included, no such trend was observed. In addition, the selenium supplementation selected in the current RCTs was mainly in the form of selenium salts (sodium selenite), amino acids (selenomethionine), and selenium yeast. A new generation of selenium supplements, including zerovalent selenium nanoparticles and selenized polysaccharides, has the advantages of low toxicity, high bioavailability, and controlled release [52]. They could be considered for use in future research.
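The equivalence between 200 µg of selenomethionine and about 80 µg of selenium follows from the mass fraction of selenium in the molecule. The sketch below works this out from the molecular formula C5H11NO2Se and standard atomic masses; the function names are illustrative, and the calculation assumes pure selenomethionine.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Se": 78.971}

# Selenomethionine, C5H11NO2Se.
SELENOMETHIONINE = {"C": 5, "H": 11, "N": 1, "O": 2, "Se": 1}


def selenium_fraction(formula):
    """Mass fraction of selenium in a compound given as element counts."""
    molar_mass = sum(ATOMIC_MASS[element] * count for element, count in formula.items())
    return ATOMIC_MASS["Se"] * formula["Se"] / molar_mass


def elemental_selenium_ug(compound_ug, formula=SELENOMETHIONINE):
    """Micrograms of elemental selenium in a given mass of the compound."""
    return compound_ug * selenium_fraction(formula)


dose = elemental_selenium_ug(200)
print(round(dose, 1))  # 80.5
```

The result lies between the recommended intake of 55 µg/day and the tolerable upper limit of 400 µg/day quoted above.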
The certainty of evidence in the GRADE system showed that the outcomes for TPO-Ab and Tg-Ab levels were both graded as low or very low; only the difference in Tg-Ab levels at 3 months in the non-LT4-treated population was of moderate certainty. Adverse effects were graded as low certainty. This indicated that there might be some differences between the current results and the real situation and that the current results need to be treated with caution. Further research needs to be carried out in the future.
This study had some limitations. Limited by language barriers, we were only able to search English-language databases. Owing to the nature of an overview of reviews, even recently published SRs were unlikely to include the latest literature, so our study omitted RCTs published after November 2020. In addition, the meta-analysis of the RCTs showed significant heterogeneity. Firstly, the study included various types of AIT rather than focusing on one specific type, resulting in some clinical heterogeneity. Secondly, the different selenium preparations and doses made it difficult to compare studies. Thirdly, for some trials the mean and SD were estimated from the median and IQR or from the median and 95% CI, which may not be reliable and could affect the results.
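To make the third limitation concrete, the sketch below shows one widely used approximation for recovering a mean and SD from a median and interquartile range: the mean is taken as the average of the three quartiles and the SD as the IQR divided by 1.35, which is exact only for large, normally distributed samples. This is an assumption for illustration; the text does not state which conversion was applied, and the median-with-95%-CI case is not covered here. The antibody values are hypothetical.

```python
def mean_sd_from_median_iqr(q1, median, q3):
    """Approximate mean and SD from the quartiles.

    Assumes a roughly normal distribution and a large sample; for skewed
    data such as antibody titres the estimates can be poor.
    """
    mean = (q1 + median + q3) / 3
    sd = (q3 - q1) / 1.35
    return mean, sd


# Symmetric hypothetical data: the estimated mean equals the median.
symmetric = mean_sd_from_median_iqr(10, 20, 30)

# Right-skewed hypothetical antibody levels: the estimated mean is pulled
# well above the median, which is why such conversions add uncertainty.
skewed = mean_sd_from_median_iqr(100, 150, 400)
print(symmetric, skewed)
```

When the estimated mean and the reported median differ substantially, as in the skewed example, the converted values are a likely source of heterogeneity in the pooled SMD.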

Conclusions
Although selenium supplementation could reduce the TPO-Ab levels at 3 and 6 months and the Tg-Ab levels at 3 and 6 months in the non-LT4-treated population, the routine use of selenium supplementation in patients with AIT is not recommended due to the low certainty of evidence. In current clinical practice, selenium supplementation that goes beyond what the evidence supports should be corrected. Since low selenium status is closely related to many diseases, it is reasonable to supplement selenium only in patients with selenium deficiency. In the future, RCTs with rigorous design, long-term follow-up, and the new generation of selenium supplements are expected to offer high-quality evidence to inform clinical decision making.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/nu15143194/s1, Table S1: The results of the AMSTAR-2 methodological quality assessment for each SR. Table S2: The essential characteristics of the RCTs included in the meta-analysis. Figure S1: Risk of bias summary.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.