3.1. The Data and Our Earlier Results
The GSE190747 data contain (1) 16 female COVID-19-convalescent octogenarians, with gene expression values collected at days 0, 1, and 7; and (2) 14 COVID-19-naïve individuals (aged 26 to 72), with gene expression values collected at days 0, 10, 11, 34, 35, 40, 41, and 48.
Here, we briefly report our earlier results [26], which established the existence of genomic signature patterns and COVID-19 subtypes, as well as the mathematical and biological equivalence of the disease and the signature patterns. That work used max-linear competing logistic regression models to establish component classifiers CF-i and the combined max classifier CFmax. Table 1 and Table 2 appeared in our earlier work [26].
For the GSE157103 data, we also found that a combination of CDC6 and ZNF282 can lead to 97.62% accuracy (98% sensitivity, 96.15% specificity), with the following classifier: 1.7615 + 6.8226 × CDC6 − 1.1556 × ZNF282. A combination of CDC6, ZNF282, and CEP72 (centrosomal protein 72) can lead to 98.41% accuracy (99% sensitivity, 96.15% specificity), with the following additional classifier: −1.9944 + 7.4106 × CDC6 − 7.1401 × CEP72.
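The two classifiers quoted above can be evaluated directly. The following Python sketch is illustrative only. It assumes, in line with the max-linear competing logistic regression form mentioned earlier, that the combined score is the maximum of the component scores, and that a logistic transform of that score gives a risk probability. The function names and the logistic step are assumptions, and any input must be in the expression units used when the coefficients were fitted. The cohort sizes in the consistency check (100 COVID-19 and 26 non-COVID-19 samples) are also an assumption, since this section does not state them.

```python
import math


def cf1(cdc6, znf282):
    """First component classifier quoted in the text (CDC6, ZNF282)."""
    return 1.7615 + 6.8226 * cdc6 - 1.1556 * znf282


def cf2(cdc6, cep72):
    """Additional component classifier quoted in the text (CDC6, CEP72)."""
    return -1.9944 + 7.4106 * cdc6 - 7.1401 * cep72


def cf_max(cdc6, znf282, cep72):
    """Combined score: the larger of the two component scores (assumed form)."""
    return max(cf1(cdc6, znf282), cf2(cdc6, cep72))


def risk_probability(score):
    """Logistic transform of a score (assumption); numerically stable form."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def accuracy_pct(true_pos, true_neg, n_pos, n_neg):
    """Overall accuracy in percent from confusion-matrix counts."""
    return round(100.0 * (true_pos + true_neg) / (n_pos + n_neg), 2)


# Consistency check under the ASSUMED cohort sizes (100 positive, 26 negative):
# 98% sensitivity -> 98 of 100, 96.15% specificity -> 25 of 26.
two_gene_accuracy = accuracy_pct(98, 25, 100, 26)
three_gene_accuracy = accuracy_pct(99, 25, 100, 26)
```

Under those assumed cohort sizes, the counts reproduce the reported 97.62% and 98.41% accuracies, which suggests the quoted sensitivity, specificity, and accuracy figures are mutually consistent.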
As the pandemic is now dominated by Omicron variants, especially BA.5, linking the genes identified earlier for other variants to Omicron variants will provide better genomic knowledge of COVID-19. We found that the genes identified from GSE157103 and GSE152418 again led to 100% accuracy in GSE201530, a cohort study of the SARS-CoV-2 Omicron variant BA.1 [37]. The new Table 3 reports the outcomes.
The data from GSE201530 contain four groups: unvaccinated/no prior infection, vaccinated/no prior infection, unvaccinated/prior infection, and vaccinated/prior infection. Comparing Table 3 with Table 1 and Table 2, we can see different patterns in the fitted coefficients associated with the chosen genes. This is not surprising, as the COVID-19 patients in the two previous cohorts (GSE157103 and GSE152418) were first-time infections and had not been vaccinated, i.e., GSE157103 and GSE152418 involve different group comparisons from GSE201530. An essential feature of Table 1 and Table 2 is that the signs and strengths of the fitted coefficients are interpretable, i.e., they tell how changes in the expression levels of the biomarker genes affect the risk of COVID-19 infection, and they indicate the genes' functional effects. Nevertheless, the genes identified in our earlier work still lead to 100% accuracy in Table 3, which shows that these genes contain information related to SARS-CoV-2 variants, including Omicron BA.1, and likely BA.5 (once the data are available to check).
As discussed in our earlier work [27], the genes in Table 4 and Table 5 (with their transcriptional response and functional effects on SARS-CoV-2) differ significantly from the genes in Table 1, Table 2 and Table 3 (with their functional signature patterns to COVID-19 antibodies). This can be interpreted as the former being the point of a phenomenon and the latter being the essence of the disease. Such findings can help explore the causal and pathological clues linking SARS-CoV-2 and COVID-19 and support the fight against the disease with more targeted vaccines, antiviral drugs, and therapies. Putting Table 1, Table 2, Table 3, Table 4 and Table 5 together serves as a starting point for our new comparative vaccine efficacy study in the subsequent sections.
Given the perfect performance of the genes ABCB6 (ATP Binding Cassette Subfamily B Member 6 (Langereis Blood Group)), KIAA1614, MND1, SMG1 (nonsense-mediated mRNA decay associated PI3K related kinase), and RIPK3 (Receptor Interacting Serine/Threonine Kinase 3) in Table 1, Table 2 and Table 3 using blood-sample data, and the nearly perfect performance of the genes CDC6, ZNF282, and CEP72, these genes can be used as reliable biomarkers for COVID-19 (blood samples). On the other hand, ATP6V1B2 (ATPase H+ Transporting V1 Subunit B2) and IFI27 (Interferon Alpha Inducible Protein 27) have central roles in SARS-CoV-2 heterogeneous populations, as discussed in our earlier work [27] and further confirmed in the new Table 5 using NP/OP swab PCR samples. Therefore, considering the functions of these genes discussed in our earlier work [26, 27], we focus on the genes MND1, SMG1, CDC6 (cell division cycle 6), ZNF282, CEP72, ATP6V1B2, and IFI27 in this study, using the BNT162b2 vaccine efficacy data [32].
3.2. The Clinical Evidence Directly Observed Using a Graphical Approach and Results
In this section, we directly plot the gene expression responses to the BNT162b2 vaccine. Using the genes identified in our earlier work and in the previous section as COVID-19 biomarkers, we plot the responses of MND1, SMG1, CDC6, ZNF282, CEP72, ATP6V1B2, and IFI27 in Figure 1, Figure 2, Figure 3, Figure 4, Figure 5, Figure 6, and Figure 7, respectively.
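As a rough illustration of the kind of summary such plots display, the following Python sketch computes the mean expression of one gene per group and sampling day. It is not the procedure used to produce the figures: the function name, the record layout, and the numeric values are hypothetical, and only the group labels and sampling days follow the data description in Section 3.1.

```python
from collections import defaultdict
from statistics import mean


def mean_trajectories(records):
    """Average expression per (group, day) for a single gene.

    records: iterable of (group, day, expression) tuples (hypothetical layout).
    Returns {group: {day: mean_expression}} with days in ascending order.
    """
    buckets = defaultdict(list)
    for group, day, value in records:
        buckets[(group, day)].append(value)

    by_group = defaultdict(dict)
    for (group, day), values in buckets.items():
        by_group[group][day] = mean(values)

    return {g: dict(sorted(days.items())) for g, days in by_group.items()}


# Made-up expression values, for illustration only.
example_records = [
    ("convalescent", 0, 1.0),
    ("convalescent", 0, 3.0),
    ("convalescent", 7, 4.0),
    ("naive", 10, 2.0),
    ("naive", 0, 2.5),
]
example_summary = mean_trajectories(example_records)
```

The resulting per-group dictionaries can be passed to any plotting library to draw one trajectory per group, with the day on the horizontal axis and the expression value on the vertical axis.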
Figure 1, Figure 2, Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7 clearly show that COVID-19-convalescent individuals and COVID-19-naïve individuals have entirely different BNT162b2 vaccine responses. Our earlier work [27] used a total of fourteen cohort studies (including different platforms, different ethnicities, different geographical regions, breakthrough infections, and Omicron variants) with 1481 samples to justify the results. So far, we have not seen any other research in the literature with nearly perfect performance. With such comprehensive studies and conclusive outcomes, it may be safe to say that the genes identified in our earlier work are representative, and that the gene–gene interaction heterogeneity between SARS-CoV-2 and COVID-19 does exist. Using the results from our earlier work [26, 27] and those in Section 3.1, we can see that the higher the expression values of ZNF282, CEP72, and ATP6V1B2, the lower the risk of an individual being COVID-19 positive; and the lower the expression values of MND1, CDC6, and IFI27, the lower the risk of an individual being COVID-19 positive. Clearly, COVID-19-naïve individuals have improved and expected vaccine responses, i.e., the BNT162b2 vaccine can help prevent SARS-CoV-2 infections for COVID-19-naïve individuals.
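The direction-of-effect rule stated above can be written as a small lookup. The following Python sketch is a hedged illustration rather than the analysis used in this study. The gene groupings follow the text, while the function name, the "favorable"/"adverse" labels, and the treatment of genes outside the two groups (such as SMG1) as "undetermined" are assumptions. The sketch looks only at the sign of the change and ignores its magnitude and statistical significance.

```python
# Genes for which HIGHER expression is associated with lower COVID-19 risk.
LOWER_RISK_WHEN_HIGHER = {"ZNF282", "CEP72", "ATP6V1B2"}
# Genes for which LOWER expression is associated with lower COVID-19 risk.
LOWER_RISK_WHEN_LOWER = {"MND1", "CDC6", "IFI27"}


def classify_change(gene, baseline, followup):
    """Label the direction of an expression change after vaccination.

    Returns "favorable", "adverse", "no change", or "undetermined"
    (the labels are hypothetical and used here for illustration).
    """
    if gene not in LOWER_RISK_WHEN_HIGHER and gene not in LOWER_RISK_WHEN_LOWER:
        return "undetermined"

    delta = followup - baseline
    if delta == 0:
        return "no change"

    if gene in LOWER_RISK_WHEN_HIGHER:
        return "favorable" if delta > 0 else "adverse"
    return "favorable" if delta < 0 else "adverse"
```

For example, `classify_change("IFI27", 5.0, 3.0)` returns "favorable" because IFI27 expression fell, whereas the same drop in ZNF282 would be labeled "adverse".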
Note that at time zero, except for SMG1 and CEP72, the two groups are comparable in terms of their expression values. However, COVID-19-convalescent octogenarians showed adverse effects in the other genes, i.e., the BNT162b2 vaccine could increase the risk of breakthrough SARS-CoV-2 infections in this group of individuals.
Figure 2 shows different patterns of SMG1 expression level changes. In our earlier work [25, 26], we found that this mRNA gene can be either helpful or harmful depending on its combined effects with other genes. The right panel also shows that the vaccine can be either helpful or harmful depending on the time at which its effect is observed.