Protein Interaction Network Analysis to Investigate Stress Response, Virulence, and Antibiotic Resistance Mechanisms in Listeria monocytogenes

Listeria monocytogenes is a deadly and costly foodborne pathogen that has a high fatality rate in the elderly, pregnant women, and people with weakened immunity. It can survive under various stress conditions and is a significant concern for the food industry. In this work, a data analysis approach was developed with existing tools and databases and used to create individual and combined protein interaction networks to study stress response, virulence, and antimicrobial resistance and their interaction with L. monocytogenes. The networks were analyzed, and 28 key proteins were identified that may serve as potential targets for new strategies to combat L. monocytogenes. Five of the twenty-eight proteins (i.e., sigB, flaA, cheA, cheY, and lmo0693) represent the most promising targets because they are highly interconnected within the combined network. The results of this study provide a new set of targets for future work to identify new strategies to improve food preservation methods and treatments for L. monocytogenes.


Introduction
The Centers for Disease Control and Prevention (CDC) estimate that there are approximately 48 million cases of foodborne illnesses per year in the United States. Listeriosis, while not common, is one of the leading causes of death from foodborne illnesses [1]. In the U.S., there are approximately 1600 infections per year that result in about 260 deaths, corresponding to a hospitalization rate of 94% and a mortality rate of 16% [2]. The fatality rate can be as high as 30% in the elderly, pregnant women, and people with weakened immunity [3]. Listeria monocytogenes, the pathogen that causes listeriosis, has the third highest mortality rate for foodborne pathogens in the U.S.
Listeria monocytogenes is a facultative intracellular pathogen that can survive a wide range of stress conditions [4]. It has been found to be a highly occurring pathogen in several countries, including the United States, United Kingdom, Australia, Canada, and Mexico [5]. It is found in the environment and is carried by animals [6]; humans are primarily infected with the bacteria from contaminated foods and surfaces [6]. For these reasons, L. monocytogenes is one of the most concerning pathogens for the food industry [7].
Bacterial stress response is a microorganism's ability to respond to external stresses by expressing proteins that aid in survival. Food preservation methods control the presence and growth of bacteria in the food chain by employing different types of stresses, e.g., thermal stress, acidic stress, osmotic stress, and oxidative stress [4]. Listeria monocytogenes is difficult to control in foods because it can survive in low moisture, high salt concentrations, and refrigerated conditions [8]. Understanding the key stress response proteins can lead to the development of more effective food preservation methods and thereby reduce the risk of exposure to L. monocytogenes.
Virulence is a microorganism's ability to cause disease through the expression of virulence factors, i.e., proteins, that help bacteria to invade host cells, evade host defenses, and cause diseases [9]. These include polysaccharide capsules that surround the outside of the pathogen to protect it; surface components such as flagella (protein appendages) that propel the pathogen to move within a host cell; adhesions (extracellular-bound proteins) that enable the pathogen to interact with a host cell; exotoxins and enterotoxins secreted by the pathogen; and Type III secretion systems (an assemblage of proteins) that help the pathogen to secrete proteins into the host cell [10]. For example, in Listeria monocytogenes, the proteins plcA and hly are known virulence factors that aid in the escape of the microorganism from the host cell vacuole [11]. Disrupting the expression of virulence factors could lead to fewer infections and better outcomes for patients who are exposed to L. monocytogenes.
Antibiotic resistance is a microorganism's ability to defeat the drugs designed to kill it [12]. Antibiotic resistance is a serious public health issue that is estimated to be a leading cause of death worldwide after stroke and heart disease [13]. In 2019, the CDC reported nearly three million infections and more than 35,000 deaths due to resistant microorganisms [12]. In Europe, such infections were responsible for more than 426,000 illnesses and 33,000 deaths in 2019 [14]. Listeria monocytogenes is susceptible to a wide range of antibiotics active against Gram-positive bacteria, except cephalosporins and fosfomycin, for which it has inherent resistance [15]. The most common treatment for listeriosis is ampicillin, used alone or in conjunction with gentamicin [15]. Although the presence of ampicillin-resistant genes is not yet observed to be increasing in L. monocytogenes [16], this is an ongoing risk due to lateral gene transfer in bacteria [17]. The identification of new targets to combat antibiotic resistance will ensure that effective treatments continue to be available for infected patients.
Previous studies have looked at the relationships between stress response and virulence. For example, sigB is known to play a role in both stress response and virulence [18]. It has also been shown that there can be an interaction between virulence and antibiotic resistance. For example, Listeria monocytogenes can be susceptible to fosfomycin, despite having intrinsic resistance, due to the expression of the virulence genes prfA and hly [19]. In this work, stress response, virulence, and antibiotic resistance are studied together. First, a method is described to generate protein interaction networks using readily available tools and resources in systems biology. The method is used to create individual and combined protein interaction networks for stress response, virulence, and antimicrobial resistance for L. monocytogenes. Lastly, the networks are analyzed to identify key proteins.

Materials and Methods
A data analysis approach was developed with existing tools and databases to create protein interaction networks. The first step is to generate a list of proteins related to the biological process of interest, e.g., stress response, virulence, and antibiotic resistance. There are several databases available that can be used to generate protein lists, including but not limited to Genemania [20], DisGeNet [21], UniProt [22], and Gene Expression Omnibus (NCBI-GEO) [23]. From the protein list, a network is created and visualized using the tools STRING and Cytoscape. STRING is a database of protein-protein interactions [24] and Cytoscape is a multi-platform network visualization and analysis tool [25]. Both Cytoscape and STRING have been previously used successfully for network development and analysis, for example, to create a gene interaction network to study antibiotic resistance mechanisms in Proteus mirabilis [26].
In a protein interaction network the nodes correspond to proteins and the edges correspond to known or predicted protein interactions. The network can be manually curated to add or remove nodes and edges based on published results or other criteria, such as clustering analysis. Generally, nodes that are not highly interconnected may be removed. The network is analyzed based on the topological features of the network to identify key nodes. The functions of the proteins that correspond to the key nodes can be further studied using tools such as the functional enrichment analysis in STRING and the online resource DAVID (Database for Annotation, Visualization, and Integrated Discovery).
DAVID provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes [27].
The general workflow to create a protein interaction network is summarized in Figure 1. The specific workflow used for this work can be provided upon request.
The network is analyzed based on the topological features of the network to identify key nodes. The functions of the proteins that correspond to the key nodes can be further studied using tools such as the functional enrichment analysis in STRING and the online resource DAVID (Database for Annotation, Visualization, and Integrated Discovery).
DAVID provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes [27].
The general workflow to create a protein interaction network is summarized in Figure 1. The specific workflow used for this work can be provided upon request. STRING is a database of known and predicted protein-protein interactions that includes both physical and functional protein associations. The STRING database currently covers 24,584,628 proteins from 5090 organisms [24]. STRING generates a network from an input list of proteins based on associations from a variety of data sources including genomic context predictions, high-throughput lab experiments, automated text mining, and previous knowledge in databases [24]. The network can be viewed within STRING or exported for visualization and analysis outside of STRING; for example, the network can be exported directly to Cytoscape.
Cytoscape is a software platform for visualizing complex networks and integrating attribute data [28]. A network can be imported into Cytoscape from a variety of sources. In addition, a network can be generated within Cytoscape. For example, various types of queries can be performed using the STRING application in Cytoscape to generate a protein list and protein network. The functionality of Cytoscape can be extended through a wide range of applications supporting a variety of problem domains that can be downloaded and managed directly in the software.
For this analysis, STRING and Cytoscape were used to generate and visualize protein networks for stress response, virulence, and antibiotic resistance. Three individual protein-protein interaction networks were generated in Cytoscape. The STRING: PubMed  STRING is a database of known and predicted protein-protein interactions that includes both physical and functional protein associations. The STRING database currently covers 24,584,628 proteins from 5090 organisms [24]. STRING generates a network from an input list of proteins based on associations from a variety of data sources including genomic context predictions, high-throughput lab experiments, automated text mining, and previous knowledge in databases [24]. The network can be viewed within STRING or exported for visualization and analysis outside of STRING; for example, the network can be exported directly to Cytoscape.
Cytoscape is a software platform for visualizing complex networks and integrating attribute data [28]. A network can be imported into Cytoscape from a variety of sources. In addition, a network can be generated within Cytoscape. For example, various types of queries can be performed using the STRING application in Cytoscape to generate a protein list and protein network. The functionality of Cytoscape can be extended through a wide range of applications supporting a variety of problem domains that can be downloaded and managed directly in the software.
For this analysis, STRING and Cytoscape were used to generate and visualize protein networks for stress response, virulence, and antibiotic resistance. Three individual proteinprotein interaction networks were generated in Cytoscape. The STRING: PubMed query function was used to generate the initial protein lists for each network. The settings used to generate the protein networks are listed in Table 1. There are several options for the data source in Cytoscape. "STRING: PubMed query" was selected to return a STRING network based on a protein list generated from a PubMed query with the specified search term for each network. This resulted in a list of proteins from the PubMed database by using the specified search term, creating the protein interaction network using the STRING database, and displaying the network in Cytoscape. STRING has two species options for Listeria monocytogenes. Listeria monocytogenes EGDe was selected because it is a commonly used laboratory reference strain [29].
There were no changes made to the default settings for the STRING parameters. "Full STRING network" was selected because it returns both functional and physical protein associations. STRING ranks associations from lowest to highest based on the strength of the supporting data. For this analysis, the confidence score cutoff was set to 0.4, which returns associations that are of a medium-to-highest confidence score. The maximum number of proteins was set to 300. These two settings were selected to ensure that the initial networks included a large number of proteins for the subsequent analysis. In all three cases, the maximum number of proteins, i.e., 300 proteins, was identified and used to create the initial network.
The application Molecular Complex Detection (MCODE) is a clustering algorithm that identifies densely connected regions in a protein interaction network that may represent molecular complexes [30]. The MCODE application was used within Cytoscape to manually curate the networks by removing nodes that were not part of a cluster. The settings used for MCODE are listed in Table 2. There were no changes made to the default settings in MCODE. Loops were not included in the neighborhood density calculation. The degree cutoff was set to 2, meaning only nodes with two or more connections would be scored. The haircut option was selected so that nodes connected to a cluster by only one edge were removed. Fluff was set to No, ensuring that nodes would only belong to one cluster. The node score cutoff, which determines which nodes to include in a cluster, was set to 0.2. This setting can also be adjusted after the results are generated to change the cluster size. The K-score which determines the minimum number of connections within a cluster was set to 2, resulting in clusters with two or more connections. The maximum depth was set to 100 to avoid arbitrarily limiting the cluster size.
The application CytoHubba scores the nodes within a network based on its topological characteristics [31]. There are two settings in CytoHubba: one to specify the number of nodes to be ranked and one to identify the ranking method. There are eight algorithms available in CytoHubba that can be used to rank the nodes based on various features of the network: MCC (Maximal Clique Centrality), DMNC (Density of Maximum Neighborhood Component), MNC (Maximum Neighborhood Component), Degree, EPC (Edge Percolated Component), Bottleneck, EcCentricity, and Closeness. CytoHubba was used to identify the most highly connected nodes for each network. For this analysis, the top 25 nodes were ranked to avoid arbitrarily limiting the number of nodes returned. The MCC algorithm was used as the ranking method based on prior work that determined that MCC identified more essential proteins compared to the other methods [31].

Stress Response Network
The full stress response network that was generated resulted in 300 nodes with 1188 edges. The full network is included in the supplemental document Table S1. MCODE analysis identified fourteen clusters ranging in size (from three nodes to forty-four nodes), with a total of one hundred thirty-seven nodes among all the clusters. These nodes were used to generate a reduced STRING network with 137 nodes and 599 edges. The nodes included in the clustered network are also identified in Table S1. CytoHubba was used to rank the top 25 nodes using the MCC algorithm, as discussed in the Methods section. Figure 2 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highest-ranked nodes while yellow corresponds to the lowest. There were no changes made to the default settings in MCODE. Loops were not included in the neighborhood density calculation. The degree cutoff was set to 2, meaning only nodes with two or more connections would be scored. The haircut option was selected so that nodes connected to a cluster by only one edge were removed. Fluff was set to No, ensuring that nodes would only belong to one cluster. The node score cutoff, which determines which nodes to include in a cluster, was set to 0.2. This setting can also be adjusted after the results are generated to change the cluster size. The K-score which determines the minimum number of connections within a cluster was set to 2, resulting in clusters with two or more connections. The maximum depth was set to 100 to avoid arbitrarily limiting the cluster size.
The application CytoHubba scores the nodes within a network based on its topological characteristics [31]. There are two settings in CytoHubba: one to specify the number of nodes to be ranked and one to identify the ranking method. There are eight algorithms available in CytoHubba that can be used to rank the nodes based on various features of the network: MCC (Maximal Clique Centrality), DMNC (Density of Maximum Neighborhood Component), MNC (Maximum Neighborhood Component), Degree, EPC (Edge Percolated Component), Bottleneck, EcCentricity, and Closeness. CytoHubba was used to identify the most highly connected nodes for each network. For this analysis, the top 25 nodes were ranked to avoid arbitrarily limiting the number of nodes returned. The MCC algorithm was used as the ranking method based on prior work that determined that MCC identified more essential proteins compared to the other methods [31].

Stress Response Network
The full stress response network that was generated resulted in 300 nodes with 1188 edges. The full network is included in the supplemental document Table S1. MCODE analysis identified fourteen clusters ranging in size (from three nodes to forty-four nodes), with a total of one hundred thirty-seven nodes among all the clusters. These nodes were used to generate a reduced STRING network with 137 nodes and 599 edges. The nodes included in the clustered network are also identified in Table S1. CytoHubba was used to rank the top 25 nodes using the MCC algorithm, as discussed in the Methods section. Figure 2 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highest-ranked nodes while yellow corresponds to the lowest.  Table S1 for additional information about each protein).
The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were seven nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are groEL, dnaK, clpP, lmo1138, grpE, dnaJ, and groES. More details about these proteins are included in Appendix A. Table A1 summarizes the functions of these proteins. Table A2 shows the STRING functional enrichment annotations for the highest-ranked nodes. The Gene Ontological (GO) terms show that the stress response network enriched the genes related to molecular functions and biological processes and the KEGG pathway related to RNA degradation.

Virulence Protein Interaction Network
The full virulence network that was generated resulted in 300 nodes with 1544 edges. The full network is included in the supplemental document Table S2. MCODE analysis identified fourteen clusters ranging in size (from three nodes to thirty-three nodes), with a total of one hundred forty-three nodes among all the clusters. These nodes were used to generate a reduced STRING network with 143 nodes and 840 edges. The nodes included in the clustered network are also identified in Table S2. CytoHubba was used to rank the top 25 nodes using the MCC algorithm. Figure 3 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highest-ranked nodes and yellow corresponds to the lowest.  Table S1 for additional information about each protein).
The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were seven nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are groEL, dnaK, clpP, lmo1138, grpE, dnaJ, and groES. More details about these proteins are included in Appendix A. Table A1 summarizes the functions of these proteins. Table A2 shows the STRING functional enrichment annotations for the highest-ranked nodes. The Gene Ontological (GO) terms show that the stress response network enriched the genes related to molecular functions and biological processes and the KEGG pathway related to RNA degradation.

Virulence Protein Interaction Network
The full virulence network that was generated resulted in 300 nodes with 1544 edges. The full network is included in the supplemental document Table S2. MCODE analysis identified fourteen clusters ranging in size (from three nodes to thirty-three nodes), with a total of one hundred forty-three nodes among all the clusters. These nodes were used to generate a reduced STRING network with 143 nodes and 840 edges. The nodes included in the clustered network are also identified in Table S2. CytoHubba was used to rank the top 25 nodes using the MCC algorithm. Figure 3 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highest-ranked nodes and yellow corresponds to the lowest.  Table S2 for additional information about each protein).
The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were 10 nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are sigB, cheA, flaA, fliI, flgL, fliP, motB, cheY, lmo0681, and lmo0693. More details about these proteins are provided in Appendix B. Table A3 summarizes the function of each protein. Table A4 shows the STRING functional enrichment annotations for the highest-ranked nodes. The GO terms show that the  Table S2 for additional information about each protein).
The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were 10 nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are sigB, cheA, flaA, fliI, flgL, fliP, motB, cheY, lmo0681, and lmo0693. More details about these proteins are provided in Appendix B. Table A3 summarizes the function of each protein. Table A4 shows the STRING functional enrichment annotations for the highest-ranked nodes. The GO terms show that the virulence network enriched genes related to biological processes, cellular components, and molecular function and the KEGG pathways related to flagellar assembly and bacterial chemotaxis.

Antibiotic Resistance Protein Interaction Network
The full antibiotic resistance network that was generated resulted in 300 nodes with 1771 edges. The full network is included in the supplemental document Table S3. MCODE analysis identified sixteen clusters ranging in size (from three nodes to twenty-five nodes), with a total of one hundred fifty-seven nodes among all the clusters. These nodes were used to generate a reduced STRING network with 157 nodes and 1071 edges. The nodes included in the clustered network are also identified in Table S3. CytoHubba was used to rank the top 25 nodes using the MCC algorithm. Figure 4 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highest-ranked nodes and yellow corresponds to the lowest. virulence network enriched genes related to biological processes, cellular components, and molecular function and the KEGG pathways related to flagellar assembly and bacterial chemotaxis.

Antibiotic Resistance Protein Interaction Network
The full antibiotic resistance network that was generated resulted in 300 nodes with 1771 edges. The full network is included in the supplemental document Table S3. MCODE analysis identified sixteen clusters ranging in size (from three nodes to twenty-five nodes), with a total of one hundred fifty-seven nodes among all the clusters. These nodes were used to generate a reduced STRING network with 157 nodes and 1071 edges. The nodes included in the clustered network are also identified in Table S3. CytoHubba was used to rank the top 25 nodes using the MCC algorithm. Figure 4 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highestranked nodes and yellow corresponds to the lowest.  Table S3 for additional information about each protein).
The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were 17 nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are sigB, flaA, cheA, cheY, lmo0693, fliM, lmo0700, flgB, flgC, fliG, fliH, lmo0698, fliD, flhB, flhA, flgK, and flgL. More details about these proteins are included in Appendix C. Table A5 summarizes the function of these  Table S3 for additional information about each protein).
The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were 17 nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are sigB, flaA, cheA, cheY, lmo0693, fliM, lmo0700, flgB, flgC, fliG, fliH, lmo0698, fliD, flhB, flhA, flgK, and flgL. More details about these proteins are included in Appendix C. Table A5 summarizes the function of these proteins. Table A6 shows the STRING functional enrichment annotations for the highestranked nodes. The GO terms show that the antibiotic resistance network enriched genes related to biological processes, cellular components, and molecular function and the KEGG pathways related to flagellar assembly and bacterial chemotaxis.

Combined Protein Interaction Network
A combined network was generated using the nodes from the top clusters in each of the individual networks. The three individual clustered networks contained one hundred seventy-two unique proteins. The Venn diagram in Figure 5 shows a breakdown of the number of nodes from each individual network that were used to create the combined network.
proteins. Table A6 shows the STRING functional enrichment annotations for the highestranked nodes. The GO terms show that the antibiotic resistance network enriched genes related to biological processes, cellular components, and molecular function and the KEGG pathways related to flagellar assembly and bacterial chemotaxis.

Combined Protein Interaction Network
A combined network was generated using the nodes from the top clusters in each of the individual networks. The three individual clustered networks contained one hundred seventy-two unique proteins. The Venn diagram in Figure 5 shows a breakdown of the number of nodes from each individual network that were used to create the combined network. The combined network has 172 nodes and 1429 edges. The full network is included in the supplemental document Table S4. CytoHubba was used to rank the top 25 nodes using the MCC algorithm. Figure 6 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highest-ranked nodes and yellow corresponds to the lowest. The combined network has 172 nodes and 1429 edges. The full network is included in the supplemental document Table S4. CytoHubba was used to rank the top 25 nodes using the MCC algorithm. Figure 6 shows the top 25 nodes in a radial layout, with colors indicating each node's rank; red corresponds to the highest-ranked nodes and yellow corresponds to the lowest.  Table S4 for additional information about each protein).
The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were 21 nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are cheA, flgB, flgC, fliG, fliI, motB, flaA, cheY, fliM, lmo0693, lmo0700, flgK, flgL, flhA, flhB, fliD, fliP, lmo0681, lmo0698, sigB, and fliH.  Table S4 for additional information about each protein). The elbow method was used to identify the breakpoint in the scores for the top 25 nodes. There were 21 nodes that had the highest scores according to the MCC algorithm. The proteins corresponding to these nodes are cheA, flgB, flgC, fliG, fliI, motB, flaA, cheY, fliM, lmo0693, lmo0700, flgK, flgL, flhA, flhB, fliD, fliP, lmo0681, lmo0698, sigB, and fliH. More details about these proteins are included in Appendix D. Table A7 summarizes the function of the protein corresponding to each of these nodes. It also specifies in which individual networks each of the nodes is present. For example, two of the top twenty-one nodes, sigB and flaA, are present in each of the three individual networks. Table A8 shows the STRING functional enrichment annotations for the highest-ranked nodes. The GO terms show that the combined network enriched genes related to biological processes, cellular components, and molecular function and the KEGG pathways related to flagellar assembly and bacterial chemotaxis.

Discussion
This study outlines a method to generate protein interaction networks using readily available tools and resources. Three individual networks and a combined network were created for stress response, virulence, and antibiotic resistance processes in Listeria monocytogenes. Each network was analyzed to identify the most highly interconnected proteins and their functions, and the results were as follows. For the stress response network, the key proteins are groEK, dnaK, lmo1138, clpP, grpE, dnaJ, and groES. The functions of these proteins are summarized in Table A1 and the functional enrichment analysis from STRING is summarized in Table A2. All these proteins have been previously associated with the stress response in Listeria monocytogenes: dnak, dnaJ, groEL, groES, and grpE are chaperone proteins involved in the temperature stress response [32,33], and clpP and lmo1138 are involved in the degradation of misfolded proteins in the acid response [34,35].
For the virulence network, the key proteins are sigB, cheA, flaA, fliI, flgL, fliP, motB, cheY, lmo0681, and lmo0693. The functions of these proteins are summarized in Table A3, and the functional enrichment analysis from STRING is summarized in Table A4. Nine of these ten proteins have been previously associated with virulence in Listeria monocytogenes: sigB is a sigma factor that contributes to the regulation of virulence gene expression [36]; cheA and cheY are chemotaxis proteins that signal flagellar motors [37]; flaA is the main flagellin protein [38]; fliI, fliP, and flgL are involved in flagellum synthesis [38,39]; motB is involved in motor control [39]; and lmo0681 is a flagellum synthesis regulator [40]. These proteins are primarily involved in motility-related functions, which are known to be virulence factors in bacteria [41].
For the antibiotic resistance network, the key proteins are sigB, flaA, cheA, cheY, lmo0693, fliM, lmo0700, flgB, flgC, fliG, fliH, lmo0698, fliD, flhB, flhA, flgK, and flgL. The functions of these proteins are summarized in Table A5, and the functional enrichment analysis from STRING is summarized in Table A6. Sixteen of these seventeen proteins are involved with chemotaxis and motility-related functions [42] and are not typically associated with antibiotic resistance. However, there are links between these functions and antibiotic resistance. A previous study found that chemotaxis and motility genes are over-expressed in Listeria monocytogenes strains in which penicillin-binding and other antibiotic response genes are also over-expressed [42]. The motility-related proteins flaA, flgB, and flgC play a role in biofilm formation and have been shown to be upregulated in response to bactericides [43,44]. Lastly, it has been shown that bacteria in biofilm exhibit increased antibiotic resistance compared to planktonic cells [45]. These observations and the results of this analysis support further studying of the role of these chemotaxis and motilityrelated proteins in relation to antibiotic resistance. The full antibiotic resistance network, Table S3, also contains multiple known resistance proteins. For example, the results include fosX, which confers fosfomycin resistance [19]; eight pencillin-binding proteins (lmo0441, lmo0550, lmo1438, lmo1855, lmo1916, lmo2229, lmo2754, and lmo2812) and two proteins involved in the regulatory network (fri, lisR), all related to cephalosporin resistance [46]; msrA, which confers macrolide and streptogramin B resistance [47]; and gyrA, lmo2089, and lmo2741, that all confer resistance to fluoroquinolones [47,48]. However, these were not determined to be highly interconnected nodes in the network.
Across the three individual networks there are a total of twenty-eight unique proteins (cheA, cheY, clpP, dnaJ, dnaK, flaA, flgB, flgC, flgK, flgL, flhA, flhB, fliD, fliG, fliH, fliI, fliM, fliP, groEL, groES, grpE, lmo0681, lmo0693, lmo0698, lmo0700, lmo1138, motB, and sigB). Two of the highest ranked proteins (sigB and flaA) are present in all three networks. The protein sigB is known to play a role in the regulation of the general stress response and virulence in Listeria monocytogenes [7,18,49], and this analysis demonstrates that sigB is also a key protein in the antibiotic resistance network. The protein flaA is a flagellar motility gene involved in biofilm formation [50]. It is highly interconnected in both the virulence and antibiotic resistance networks and is also involved in stress response. Lastly, there are three proteins, cheA, cheY, lmo0693, that are present in the top nodes for the virulence and antibiotic resistance networks.
Prior studies have investigated the key genes and proteins in the stress response [18,49], virulence [7,11,19,51], and antibiotic resistance [26] of various microorganisms. This work analyzes all three processes and their interaction as targets to combat Listeria monocytogenes.
To the best of the authors' knowledge, this is the first work to create and analyze a protein interaction network for antibiotic resistance and for the combined processes of stress response, virulence, and antibiotic resistance. In addition, while previous works have described genes and proteins for stress response and virulence, this work expands on those results by identifying a larger network of proteins and the key targets within the network. For example, Hecker et al. identified nine genes (sigB, gadCB, gadD, bsh, opuC, bilE, inlA, inlB, prfA) involved in the stress response in L. monocytogenes [18], while this work identifies seven key proteins expressed by different genes. Rantsiou et al., identified seven virulence factors (plcA, iap, hly, prfA, plcB, mpl, and actA) in L. monocytogenes [7], while this work identifies ten different key proteins expressed by different genes.
The results presented here provide the basis for further work to improve food preservation methods to reduce the prevalence of L. monocytogenes in food supply; decrease virulence to limit the severity of infections for people exposed to L. monocytogenes; and mitigate against the risk of antibiotic resistance in L. monocytogenes by identifying new treatments and synergistic compounds to maintain the effectiveness of current treatments for people infected by L. monocytogenes. New inhibitors for these target proteins can be evaluated using methods previously described in the literature [8,52]. An improvement in even a single area can have a positive outcome on the control of L. monocytogenes. For example, anti-virulence drugs can be developed to target virulence factors and used as alternatives to antibiotic treatments [53,54].
While this analysis has identified several protein targets for further study, there are potential disadvantages to the method. One disadvantage is that results may not include all known key proteins for Listeria monocytogenes, or they may not be highly ranked within the network. For example, prfA (lmo0200) is a known bacterial transcription factor that controls the expression of key virulence factors [51], but it was not highly interconnected within the network and therefore not included in the list of key proteins determined via the analysis. Additionally, STRING includes proteins in the network based on direct and indirect interactions in its database. Another disadvantage is that highly ranked proteins included in the network based on indirect interactions may not actually be key proteins for the biological process represented by the network. These disadvantages can be mitigated during the manual curation step by including or excluding specific proteins. Another path for further study of the networks generated in this analysis is to identify key proteins from other studies that were not included. The networks can then be manually curated and analyzed with these proteins included.

Conclusions
Listeria monocytogenes is a deadly and costly foodborne pathogen that is difficult to control and a significant concern for the food industry. Current methods to combat the pathogen can be improved through a better understanding of the processes of stress response, virulence, and antibiotic resistance and their interaction. The key proteins for these processes were determined through the creation and analysis of individual and combined protein interaction networks. Across the three individual networks, twenty-eight key proteins were identified (cheA, cheY, clpP, dnaJ, dnaK, flaA, flgB, flgC, flgK, flgL, flhA, flhB, fliD, fliG, fliH, fliI, fliM, fliP, groEL, groES, grpE, lmo0681, lmo0693, lmo0698, lmo0700, lmo1138, motB, and sigB). While all of these proteins are potential targets for new methods to combat L. monocytogenes, five of the twenty-eight proteins (sigB, flaA, cheA, cheY, and lmo0693) represent the most promising targets because they are key proteins in the combined network. These results provide a starting point for further work to identify new strategies to improve food preservation methods and treatments for L. monocytogenes.

Data Availability Statement:
The data presented in this study are available in the supplementary materials.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Additional information for highest ranked proteins from the stress response network. Table A1. Summary of top 7 nodes for stress response.

Rank
Node Function 1 groEL 60 kDa chaperonin; prevents misfolding and promotes the refolding and proper assembly of unfolded polypeptides generated under stress conditions.

dnaK
Heat shock 70 kDa protein; acts as a chaperone.

lmo1138
ATP-dependent Clp protease proteolytic subunit; cleaves peptides in various proteins in a process that requires ATP hydrolysis; has a chymotrypsin-like activity; plays a major role in the degradation of misfolded proteins; belongs to the peptidase S14 family.

clpP
ATP-dependent Clp protease proteolytic subunit; cleaves peptides in various proteins in a process that requires ATP hydrolysis; has a chymotrypsin-like activity; plays a major role in the degradation of misfolded proteins; belongs to the peptidase S14 family. Table A1. Cont.

grpE
HSP-70 cofactor; participates actively in the response to hyperosmotic and heat shock by preventing the aggregation of stress-denatured proteins, in association with DnaK and GrpE; it is the nucleotide exchange factor for DnaK and may function as a thermosensor; unfolded proteins bind initially to DnaJ; upon interaction with the DnaJ-bound protein, DnaK hydrolyzes its bound ATP, resulting in the formation of a stable complex; GrpE releases ADP from DnaK; ATP binding to DnaK triggers the release of the substrate protein, thus completing the reaction cycle; several rounds of ATP-dependent interactions between DnaJ, DnaK, and GrpE are required for fully efficient folding.

dnaJ
Chaperone protein DnaJ; participates actively in the response to hyperosmotic and heat shock by preventing the aggregation of stress-denatured proteins and by disaggregating proteins, also in an autonomous, DnaK-independent fashion; unfolded proteins bind initially to DnaJ; upon interaction with the DnaJ-bound protein, DnaK hydrolyzes its bound ATP, resulting in the formation of a stable complex; GrpE releases ADP from DnaK; ATP binding to DnaK triggers the release of the substrate protein, thus completing the reaction cycle; several rounds of ATP-dependent interactions between DnaJ, DnaK, and GrpE are required for fully efficient folding; also involved, together with DnaK and GrpE, in the DNA replication of plasmids through the activation of initiation proteins. 7 groES 10 kDa chaperonin; binds to Cpn60 in the presence of Mg-ATP and suppresses the ATPase activity of the latter

Appendix B
Additional information for highest ranked proteins from the virulence network.

Appendix C
Additional information for highest ranked proteins from the antibiotic resistance network. Table A5. Summary of top 17 nodes for antibiotic resistance.

Rank
Node Function 1 sigB RNA polymerase sigma factor; sigma factors are initiation factors that promote the attachment of RNA polymerase to specific initiation sites and are then released. One of three proteins involved in switching the direction of the flagellar rotation.

fliD
Flagellar hook-associated protein 2; required for morphogenesis and for the elongation of the flagellar filament by facilitating polymerization of the flagellin monomers at the tip of the growing filament; forms a capping structure, which prevents flagellin subunits (transported through the central channel of the flagellum) from leaking out without polymerization at the distal end.
14 flhB Membrane protein responsible for substrate specificity switching from rod/hook-type export to filament-type export.
14 flhA Membrane protein involved in the flagellar export apparatus.
14 flgK Flagellar hook-associated protein 1; with FlgL acts as a hook filament junction protein to join the flagellar filament to the hook.

Appendix D
Additional information for highest ranked proteins from the combined network.

flhB AR
Membrane protein responsible for substrate specificity switching from rod/hook-type export to filament-type export.
12 fliD AR Flagellar hook-associated protein 2; required for morphogenesis and for the elongation of the flagellar filament by facilitating polymerization of the flagellin monomers at the tip of the growing filament; forms a capping structure, which prevents flagellin subunits (transported through the central channel of the flagellum) from leaking out without polymerization at the distal end.