A Meta-Analysis of the Protein Components in Rattlesnake Venom

The specificity and potency of venom components give them a unique advantage in the development of pharmaceutical drugs. Although venom is a cocktail of proteins, the synergy and association between its components are rarely studied. Understanding the relationships between venom components is critical in medical research. Using meta-analysis, we observed underlying patterns and associations in the appearance of the toxin families. For Crotalus, Dis has the most associations, with the following toxins: PDE; BPP; CRL; CRiSP; LAAO; SVMP P-I and LAAO; SVMP P-III and LAAO. In Sistrurus venom, CTL and NGF have the most associations. These associations can be used to predict the presence of proteins in novel venom and to understand synergies between venom components for enhanced bioactivity. This approach highlights the need to revisit the classification of proteins as major or minor components. The revised classification of venom components is based on ubiquity, bioactivity, the number of associations, and synergies. It can be expected to trigger increased research on venom components, such as NGF, that have high biomedical significance. Using hierarchical clustering, we observed that similarities in venom composition within the genera reflected functional characteristics rather than phylogenetic relationships.

First introduced in 1993 [165], association rule mining has emerged as a popular technique for detecting and extracting key structural information from the large-scale transaction data generated by organizations such as Kroger and Walmart [166]. These rules help organizations understand the co-occurrence patterns and frequencies of various transactions, thus helping them become more efficient and profitable. By leveraging the similarity between the co-occurrence of protein components in venom and the transactions made by shoppers in these supermarkets, we can use these powerful data-mining tools to discover patterns in venom composition and boost the efficiency of biomedical research.
Let $I = \{i_1, i_2, i_3, \ldots, i_n\}$ be a set of $n$ protein components (referred to as "items" in data-mining literature) and $D = \{t_1, t_2, t_3, \ldots, t_m\}$ be the set of $m$ venom samples, where each sample contains a subset of the components in $I$. An association rule is defined as an implication of the form $X \Rightarrow Y$, where $X, Y \subseteq I$ and $X \cap Y = \emptyset$. The sets of protein components (item sets) $X$ and $Y$ are called the antecedent (left-hand side, LHS, or predictor) and the consequent (right-hand side, RHS, or predicted) of the rule [167].
The support $\mathrm{supp}(X)$ of an item set $X$ is defined as the proportion of venom samples in the data set that contain the item set. The confidence of a rule is defined as $\mathrm{conf}(X \Rightarrow Y) = \mathrm{supp}(X \cup Y)/\mathrm{supp}(X)$ [167]. Confidence can be interpreted as an estimate of the probability $P(Y \mid X)$, the probability of finding the RHS of the rule in a venom sample given that the sample also contains the LHS [167].
The lift of a rule is defined as $\mathrm{lift}(X \Rightarrow Y) = \mathrm{supp}(X \cup Y)/(\mathrm{supp}(X)\,\mathrm{supp}(Y))$. It can be interpreted as the deviation of the support of the whole rule from the support expected under independence, given the supports of the LHS and RHS [167]. Greater lift values indicate stronger associations [167].
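The three measures defined above can be illustrated with a minimal sketch. The four venom samples below are invented for illustration and are not the study data; the function names are likewise our own and do not refer to any particular data-mining package.

```python
from fractions import Fraction

# Hypothetical presence/absence data: each venom sample is the set of
# protein families detected in it (values invented for illustration).
samples = [
    {"CTL", "Dis", "LAAO"},
    {"CTL", "Dis"},
    {"Dis", "PLA2"},
    {"PLA2", "SVMP"},
]

def support(itemset, samples):
    """Proportion of samples that contain every component in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for s in samples if itemset <= s)
    return Fraction(hits, len(samples))

def confidence(lhs, rhs, samples):
    """conf(X => Y) = supp(X u Y) / supp(X)."""
    return support(set(lhs) | set(rhs), samples) / support(lhs, samples)

def lift(lhs, rhs, samples):
    """lift(X => Y) = supp(X u Y) / (supp(X) * supp(Y))."""
    joint = support(set(lhs) | set(rhs), samples)
    return joint / (support(lhs, samples) * support(rhs, samples))

# Rule CTL => Dis on the toy data.
rule_support = support({"CTL", "Dis"}, samples)           # 2 of 4 samples
rule_confidence = confidence({"CTL"}, {"Dis"}, samples)   # (2/4) / (2/4)
rule_lift = lift({"CTL"}, {"Dis"}, samples)               # (2/4) / ((2/4) * (3/4))
```

In this toy data set every sample containing CTL also contains Dis, so the confidence is 1, while the lift exceeds 1 because Dis is not present in every sample.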
In this study, we highlight the top 20 associations (Figure 2). For example, Dis is associated with CTL with a confidence of 1 and a support of 0.667 (Table 2), implying that Dis and CTL are expressed together in 66.7% of the venoms of all Crotalus species and that, whenever CTL is expressed in a venom, Dis is also expressed.
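The reported values for this example can be checked against the definitions of confidence and lift. The short derivation below is a sketch that assumes the rule is read as CTL ⇒ Dis and that the reported support of 0.667 refers to the whole rule; the support of Dis alone is not given here and is left symbolic.

```latex
\begin{align*}
\mathrm{conf}(\mathrm{CTL} \Rightarrow \mathrm{Dis})
  &= \frac{\mathrm{supp}(\{\mathrm{CTL}, \mathrm{Dis}\})}{\mathrm{supp}(\{\mathrm{CTL}\})}
   = \frac{0.667}{\mathrm{supp}(\{\mathrm{CTL}\})} = 1
  \quad\Longrightarrow\quad \mathrm{supp}(\{\mathrm{CTL}\}) = 0.667, \\
\mathrm{lift}(\mathrm{CTL} \Rightarrow \mathrm{Dis})
  &= \frac{\mathrm{conf}(\mathrm{CTL} \Rightarrow \mathrm{Dis})}{\mathrm{supp}(\{\mathrm{Dis}\})}
   = \frac{1}{\mathrm{supp}(\{\mathrm{Dis}\})} \geq 1 .
\end{align*}
```

Under these assumptions, every venom that contains CTL also contains Dis, and the lift is greater than 1 as long as Dis is absent from at least one venom sample.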
The venom components of Crotalus are well studied and generate more than 500 associations, but only the top twenty relevant rules with at least one minor component are listed in Table 2. If a protein (the predictor) is present in venom, the chance that another protein (the predicted) is also expressed in the venom is given by combining "confidence" and "lift". Dis has the highest number of associations as a predicted component, seven: PDE; BPP; CRL; CRiSP; LAAO; SVMP P-I and LAAO; SVMP P-III and LAAO. It is followed by LAAO with six associations: PDE; BPP; CTL; Dis and SVMP P-I; Dis; CRiSP. CTL is associated with five groups, and CRiSP has two associations. However, the five associations of CTL have higher lift and confidence than those of LAAO and Dis, indicating stronger associations.
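Counting associations per predicted component is a simple tally over the rule list. The sketch below uses only the predictor sets that the text lists for Dis and LAAO (the rules for CTL and CRiSP are not enumerated in the text and are therefore omitted); the variable names are our own.

```python
from collections import Counter

# (predictor set, predicted component) pairs as listed in the text for Dis and
# LAAO. "CRL" is kept exactly as written in the text.
rules = [
    ({"PDE"}, "Dis"),
    ({"BPP"}, "Dis"),
    ({"CRL"}, "Dis"),
    ({"CRiSP"}, "Dis"),
    ({"LAAO"}, "Dis"),
    ({"SVMP P-I", "LAAO"}, "Dis"),
    ({"SVMP P-III", "LAAO"}, "Dis"),
    ({"PDE"}, "LAAO"),
    ({"BPP"}, "LAAO"),
    ({"CTL"}, "LAAO"),
    ({"Dis", "SVMP P-I"}, "LAAO"),
    ({"Dis"}, "LAAO"),
    ({"CRiSP"}, "LAAO"),
]

# Tally how many rules predict each component, most frequent first.
associations_per_predicted = Counter(predicted for _, predicted in rules)
ranking = associations_per_predicted.most_common()
```

The tally reproduces the counts given in the text: seven rules predict Dis and six predict LAAO.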

Association between Various Venom Components in Crotalus Venom Using the Relative Abundance of Protein Components
A key challenge in inferring associations across different species is the lack of data on the relative abundance of venom components. For Crotalus, relative abundance is reported for only 14 of the 30 species, and within these 14 species, it is reported for only 56.7% of the venom components. Using the limited data on relative abundances, we identified a total of 47 association rules, i.e., relationships between different venom components, for Crotalus (Table S2). Herein we report only the top twenty relevant rules (Table 3, Figure 3).
Despite limited data, many relationships found with presence/absence data are also found with relative abundance data, such as CTL and LAAO, CTL and SVMP P-III, CTL and SVSP, BPP and LAAO, and CRiSP and LAAO. Through relative abundance data, we identified several new relationships, such as PLA2 and SVMP P-III, and SVMP P-II and CTL (Figure 3). We used hierarchical clustering analysis of venom components with known relative abundances to cluster different Crotalus species according to the similarity in venom components (Figure 4). The clustering was similar whether we used the maximum reported relative abundance of each venom component within a species or the average relative abundance. One would expect similar venom composition in closely related species due to a recent common ancestor, but such similarity was not observed. We conjecture that different species have similar compositions because of functional similarities. The venom composition of Crotalus durissus is different from that of the other 13 species. Crotalus polystictus and Crotalus simus have similar venom composition, while Crotalus atrox, Crotalus basiliscus, Crotalus tzabcan, Crotalus cerastes, Crotalus scutulatus, Crotalus viridis, Crotalus molossus, Crotalus vegrandis, Crotalus tigris, Crotalus ruber, and Crotalus horridus have similar venom composition (Figure 4).
Note: 5′-nucleotidase (5′-NT), bradykinin-potentiating peptide (BPP), C-type lectins (CTL), cysteine-rich secretory protein (CRiSP), disintegrin (Dis), L-amino acid oxidase (LAAO), nerve growth factor (NGF), phosphodiesterase (PDE), phospholipase A2 (PLA2), snake venom metalloprotease (SVMP), and snake venom serine protease (SVSP).
In the graphs of the association rules in Tables 2 and 3 (Figures 2 and 3), the size and the depth of color of the nodes are proportional to the support level and lift ratio of the underlying association rules.
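The clustering step described above can be sketched as a small agglomerative procedure. The distance measure and linkage used in the study are not stated here, so Euclidean distance with average linkage is an assumption, and the abundance values and species labels below are invented for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical relative abundances (% of venom proteome) per species; the
# columns are (PLA2, SVMP, SVSP). Values are invented for illustration.
profiles = {
    "species_A": (60.0, 5.0, 10.0),
    "species_B": (55.0, 8.0, 12.0),
    "species_C": (10.0, 50.0, 20.0),
    "species_D": (12.0, 45.0, 25.0),
}

def average_linkage(c1, c2):
    """Mean Euclidean distance over all cross-cluster pairs of species."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

def agglomerate(names, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > n_clusters:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(*pair),
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2))
    return clusters, merges

clusters, merges = agglomerate(sorted(profiles), n_clusters=2)
```

Species with similar abundance profiles are merged first, regardless of how they are related phylogenetically, which is the property the comparison with phylogeny relies on. Running the same procedure on maximum and on average abundances would show whether the cluster structure depends on that choice.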

Venom Constituents in Sistrurus Venom
We identified compositional venom studies, using both transcriptomic and proteomic technologies, for 34 entries, including species and subspecies, within the genus Sistrurus. Few studies have focused on the venom of Sistrurus subspecies. Nineteen protein families are present in Sistrurus (Table 4). These protein families can be classified based on ubiquity or on their relationship with other proteins.

Association between Various Venom Components in Sistrurus Venom Using Presence/Absence Data
Using the frequent item-set data mining approach from the data mining literature [164], we identified eight relationships between different venom components for Sistrurus (Figure 6). For example, NGF is associated with CTL with confidence = 1 and support = 0.75 (Table 5), implying that NGF and CTL are expressed together in 75% of the venoms of all Sistrurus species and that, whenever NGF is expressed in a venom, CTL is also expressed.

Figure: Sistrurus venom components frequency (axis: relative item frequency).

In contrast to Crotalus, studies on Sistrurus venom components are lacking; thus, only a small pool of studies was available, generating only eight associations (Table 5). CTL and NGF each have three associations with different venom components: CTL is associated with NGF, SVMP inhibitor, and SVSP; NGF is associated with CTL, SVMP inhibitor, and SVSP. They are followed by SVMP inhibitor and SVSP with one association each: SVMP inhibitor is associated with SVSP and vice versa. However, the associations of SVMP inhibitor and SVSP have higher lift and confidence than those of CTL and NGF, indicating stronger associations.
In the graph of the association rules in Table 5 (Figure 6), the size and the depth of color of the nodes are proportional to the support level and lift ratio of the underlying association rules.
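How rules of this kind arise from presence/absence data can be sketched with a brute-force search. This is an illustration only: the four venom records are invented (chosen so that NGF and CTL co-occur in 75% of them), the thresholds are assumptions, and the exhaustive enumeration stands in for whichever mining software was actually used.

```python
from itertools import combinations

# Hypothetical presence/absence records for four venoms (not the study data).
venoms = [
    {"CTL", "NGF", "SVSP"},
    {"CTL", "NGF", "SVMP inhibitor", "SVSP"},
    {"CTL", "NGF"},
    {"PLA2"},
]

def support(itemset):
    """Proportion of venoms that contain every component in `itemset`."""
    return sum(1 for v in venoms if itemset <= v) / len(venoms)

def mine_rules(min_support, min_confidence):
    """Brute-force search: frequent item sets first, then rules X => {y}."""
    items = sorted(set().union(*venoms))
    frequent = [
        frozenset(c)
        for size in range(2, len(items) + 1)
        for c in combinations(items, size)
        if support(frozenset(c)) >= min_support
    ]
    rules = []
    for itemset in frequent:
        for y in sorted(itemset):
            lhs = itemset - {y}
            conf = support(itemset) / support(lhs)
            lift = conf / support(frozenset({y}))
            # Keep only confident rules whose lift exceeds 1.
            if conf >= min_confidence and lift > 1:
                rules.append((tuple(sorted(lhs)), y, support(itemset), conf, lift))
    return rules

rules = mine_rules(min_support=0.75, min_confidence=1.0)
```

On this toy data the search returns two rules, NGF ⇒ CTL and CTL ⇒ NGF, each with support 0.75 and confidence 1. With so few records, a single additional venom can create or remove a rule, which illustrates why a small pool of studies yields few and fragile associations.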

Association between Various Venom Components in Sistrurus Venom Using the Relative Abundance of Protein Components
As with Crotalus, a key challenge in inferring associations across different species is the lack of data on the relative abundance of venom components. Only six studies reported relative abundances for the species and subspecies within the genus Sistrurus. Relative abundance is reported for 21 out of 25 venom components. We identified a total of 13 associations (Table 6, Figure 7). Nine associations were discarded because they did not have any predictor component or were duplicates.
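The discarding step can be sketched as a simple filter. The rules below are invented, and "duplicate" is assumed here to mean an exact repeat of the same predictor set and predicted component; the study may have used a broader definition.

```python
# Hypothetical mined rules as (predictor components, predicted component) pairs.
raw_rules = [
    (frozenset(), "SVMP"),            # no predictor component
    (frozenset({"CTL"}), "NGF"),
    (frozenset({"CTL"}), "NGF"),      # exact repeat of the previous rule
    (frozenset({"NGF"}), "CTL"),
]

def clean_rules(rules):
    """Drop rules with an empty predictor set and exact repeats, keeping order."""
    seen = set()
    kept = []
    for lhs, rhs in rules:
        if not lhs or (lhs, rhs) in seen:
            continue
        seen.add((lhs, rhs))
        kept.append((lhs, rhs))
    return kept

kept_rules = clean_rules(raw_rules)
```

A rule with an empty predictor set only restates how common the predicted component is, so it carries no information about associations between components.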

Hierarchical Clustering of Venom Components to Identify Similarities or Dissimilarities in Phylogenetic Relationships
Next, we used hierarchical clustering analysis of venom components with known relative abundances to cluster different Sistrurus species according to the similarity in venom components (Figure 8). One would expect similar venom composition in closely related species due to a recent common ancestor, but such similarity was not observed. We conjecture that different species have similar compositions because of functional similarities. We found that the venom compositions of Sistrurus miliarius miliarius and Sistrurus miliarius streckeri are similar. However, Sistrurus miliarius barbouri, which is phylogenetically similar to Sistrurus miliarius miliarius and Sistrurus miliarius streckeri, does not have a venom composition similar to that of these two subspecies of Sistrurus miliarius. Instead, its venom composition is similar to that of Sistrurus catenatus tergeminus and Sistrurus catenatus catenatus. Sistrurus catenatus edwardsii has a different venom composition than the other two subspecies of Sistrurus catenatus (Figure 8).

Discussion
A total of 46 families of proteins are identified in the venom of 34 species and subspecies of rattlesnakes. Most studies focus on Crotalus, and a subset of studies focus on Sistrurus. Through our analysis, using the presence/absence of venom components, we discovered a total of 562 association rules for Crotalus and 25 association rules for Sistrurus venom components. In this study, we present the 20 most relevant rules for Crotalus and eight rules for Sistrurus venom components (Tables 2 and 5, respectively). Using the known relative abundances of venom components, we discovered 47 rules for Crotalus and 13 rules for Sistrurus venom components (Tables 3 and 6, respectively).
Using presence/absence data in developing venom component association only gives limited insight as venom becomes functionally different with changes in relative abundances of its components. However, we have been limited by existing information on the relative abundances of venom components. Within Crotalus, relative abundances have been reported for 46% of the species, and within these species, relative abundances have been reported for only 56% of venom components. Within Sistrurus, for all species and subspecies, relative abundances have been reported for 84% of venom components. Reporting relative abundances of different venom components would play a critical role in developing more insightful associations between different venom components.
There is an emphasis on investigating venom components as stand-alone units, with few investigations of their relationships with each other and of the subsequent effects of co-administering different components. Understanding the relationships between venom components could open a new avenue for biomedical research and unlock protein combinations that yield enhanced bioactivity in pharmaceutical drugs. Additionally, studying components in isolation may have skewed the attention that many components receive in biomedical research. For example, protein families are often classified as major or minor based on importance and ubiquity [13,128], causing the dominant protein families, such as proteases, neurotoxins, and phospholipases, to be more researched than other protein families, such as growth factors. However, it is by combining ubiquity, bioactivity, and the relationships between protein families that venom components can be classified as major or minor.
In rattlesnakes, MYO, PLA2, SVMP, and SVSP are classified as major components based on medical importance and ubiquity [13,128], which is also confirmed by our analysis (Figure 1). However, with a new approach that uses both ubiquity and the number of associations for each protein, we find that Dis, LAAO, and CTL are all more ubiquitous and have more associations with other proteins in Crotalus species (Table 2). Similarly, in Sistrurus species, the SVMP inhibitor and NGF (Figure 4) have more associations than MYO, which has only one association (Table 3).
These associations play a critical role in the synergy between venom components [73]. This synergism causes the joint effects of multiple toxins to exceed the sum of their individual potencies [73], making trace amounts of snake venom highly efficient and effective [73,74]. Such combinations of venom proteins often cause bleeding, tissue degradation, necrosis, and further complications in prey and bite victims [69,177] and make whole crude venom more lethal than its individual components [73,178].
Mostly through studies of predominant toxins, several general mechanisms for toxin synergism have been proposed [73,179]: (1) two or more toxins interact with different targets on related biological pathways, resulting in synergistically increased toxicity; (2) two or more toxins recognize and interact with the same target synergistically and produce the same effect, which is often called amplification; (3) one toxin (subunit) acts as a chaperone to potentiate another. The chaperone may expose the active/functional site of the second toxin (subunit), expose target sites, increase affinity to the target, or modify the active surface of the other toxin (subunit). Such complexes usually dissociate after exerting their toxicity.
Synergisms are mostly reported for major toxins in rattlesnake venoms [73]. A notable example of synergism through complex formation (mechanism 3) is crotoxin, a lethal neurotoxin from C. durissus terrificus, which is formed by two subunits: an acidic subunit, component A (CA or crotapotin), and a basic subunit, component B (CB) [109,115,116,180,181]. CB is a basic PLA2 with phospholipase activity and low toxicity, while CA is a small, acidic, nonenzymatic, nontoxic subunit [73,181]. However, once the two are combined non-covalently, CA improves the potency of CB by enabling CB to reach the specific crotoxin receptors at the neuromuscular junction and by inhibiting other CB functions, such as catalytic and anticoagulant activities [115,181]. Thus, the resulting crotoxin complex is highly active compared to the individual components, showing the synergy between the two subunits in blocking acetylcholine release [180,181]. Similarly, in C. scutulatus scutulatus, the Mojave toxin is another PLA2 complex: an acidic, nonlethal subunit acts as a chaperone for the basic subunit to improve lethality [41,148,162]. Other examples of synergistic complexes have been reported in many species of Viperidae and Crotalidae [73]. Such interactions, which have been studied intensively, show the strong synergistic activities in rattlesnake venoms.
A prominent example of synergy between major components involves an SVMP P-III and an acidic PLA2 in Bothrops alternatus, called baltergin and Ba SpII RP4 PLA2, respectively [182,183]. Baltergin, the less abundant of the two, possesses high edematogenic and myotoxic activities [182], while the PLA2 has no myotoxicity, although it is the most abundant PLA2 in this species [183]. When acting simultaneously, the two can cause complete detachment of C2C12 myoblast cells, whereas neither achieves 50% detachment on its own [184]. Analogous synergism has also been recorded in endothelial cells, the natural target of SVMP [73,185]. The proposed mechanism of this synergism involves catalysis-independent interactions of the PLA2 with endothelial cell membranes rather than its enzymatic activity [185]. Since PLA2 does not target extracellular matrix proteins as SVMP does [182], this indicates that the second general mechanism of toxin synergism is followed. Both enzymes are present in many rattlesnake venoms (Tables 1 and 3), and their association is also found in our analysis (Table S1). There are reports of synergism between crotoxin and crotamine, a member of the MYO toxins in Crotalus venoms, which facilitates the internalization of the CB subunit and increases neuronal toxicity [73,186]. Unfortunately, these interactions are not found in our analysis (Table 2), although the toxins are present in Crotalus venoms (Table 1). This could be due to the sparse reports on Crotalus venoms, with many species still under-investigated, as stated previously.
Even fewer studies focus on the synergism between major and minor components, such as SVSP, a major toxin, and BPPs, a minor toxin [73], indicating a bias in the study of venom toxins produced by the current major/minor classification convention. BPPs, which are micromolecular hypotensive peptides in snake venoms, can inhibit angiotensin-converting enzymes and enhance the hypotensive action of bradykinin, accompanied by hyperpermeability of blood vessels [65,107,187,188]. Thus, BPPs are targeted in many pharmaceutical developments to treat hypertension and heart failure [189,190]. On the other hand, many SVSPs show activities similar to those of kallikrein, a serine proteinase with specific and limited proteolytic functions that release bradykinin [73,143,191]. Previous works indicated that BPPs could act synergistically with kallikrein-like SVSPs, which release bradykinin more effectively than endogenous kallikrein, to produce potent hypotension and vascular shock in prey [73,95,143,192–194] (mechanism 1). The SVSP-BPP interaction thus results in a stronger physiological effect than that of the individual components. However, studies dedicated to understanding the mechanism and effects of such interactions are limited. The absence of such studies highlights the bias in current classification systems of major and minor venom components. Likewise, there are many components (e.g., LAAO) with substantial associations with other toxins, like CTL or NGF, that have not been investigated for their potential synergisms. Therefore, there is a need to develop a deeper understanding of minor components in the venom of rattlesnakes to discover more associations, such as that of SVSP and BPP.
Another way to explain the associations of these toxins is through the evolution of toxins. One relationship that has been explored in previous studies is between SVMP P-III and Dis. Dis is a small, nonenzymatic protein that can bind to extracellular receptors (integrins) and occurs with many motifs and sizes, two of which are the RGD and MVD motifs [9,68,144,195,196]. SVMP P-III is a subclass of SVMP with a Dis-like domain [197,198], and Dis, especially with the RGD/MVD motifs, is suggested to have arisen from the rapid evolution of the genes coding for SVMP P-III [195,196,199]. The RGD/MVD motifs of Dis are present in many Crotalus species [195] along with SVMP P-III. This co-occurrence, represented as rule 17 (Table 2), can be explained through this evolutionary model, although the co-association with LAAO is still largely unexplained.
Some associations may not need to be derived from toxicity but could be explained through the housekeeping functions of the proteins. The SVMP inhibitor is thought to be a housekeeping molecule, despite its potential therapeutic activities, that helps neutralize the potent SVMP in the venom glands as a self-defense mechanism [200]. Yet few studies have been devoted to this family, and it is not reported in many rattlesnake venoms in which high amounts of SVMP exist (Table 1), which indicates a knowledge gap that requires further investigation. Likewise, NGF is known for its ability to inhibit SVMP proteolysis in Viperidae [201,202]. However, growing evidence suggests the plausibility of other mechanisms, in which NGF acts as a cytotoxic proapoptotic factor in tissues that do not have TrkA receptors [201,203,204] or serves ancillary functions, like Hya, to help with the efficient absorption of venom components through the release of granule molecules (histamine, serotonin, etc.) [179,201,205]. Such a large release can also have impactful consequences (anaphylaxis, bronchoconstriction, vasodilation, etc.) [201,206]. However, NGF is reported in few Crotalus species, as observed previously in Table 1, indicating yet another knowledge gap in Crotalus venomics. Using SVMP as a common target to explain the association of NGF and SVMP inhibitor (rule 6, Table 5) is promising, but because the available information is insufficient, this explanation requires co-administration testing to confirm.
Developing phylogenetic relationships between different species using venom composition would further explain the various associations between venom components, as two species can have similar venom composition due to a recent common ancestor or for functional reasons. The existing venom composition data are greatly insufficient for developing a phylogenetic tree (Figure 4). However, the hierarchical clusters developed using known relative abundances show a different relationship than the observed phylogenetic relationships [207]. Even though C. durissus is phylogenetically similar to C. basiliscus, the venom composition of the latter is much more similar to that of C. molossus. The venom composition of C. durissus is unique compared to that of other species (Figure 4). Similar patterns can be observed with Sistrurus species (Figure 8). The possible explanations for this are functional similarity or insufficient data. One would, however, expect the cluster structure to change with changes in venom composition due to age, sex, diet [17,18,24,25], or topographical features [26,27]. We did not observe any difference between clusters built using the maximum relative abundance values and those built using the average values. This result should be taken with caution, as only a few studies reported multiple relative abundance values for some of the venom components. Thus, to gain deeper insight into venom associations and venom component relationships, relative abundances must be quantified and reported.

Conclusions
In this paper, we have elicited the associations between different venom components, which can expedite future research on the synergy between them. We also establish the need to report relative abundances for different venom components to increase the accuracy of the predicted associations and the understanding of venom evolution.
The results of this study suggest a myriad of associations, many of which are yet to be discovered, that point to promising potential synergistic effects worth further investigation. For example, rules 2 and 13 in Crotalus venoms (Table 2) involve CTL, a protein/glycoprotein that specifically binds to carbohydrate moieties and glycoconjugates and can target and interact with platelet receptors and blood coagulation factors [208], which are also targets for Dis [209]. This indicates potential synergisms with antiplatelet toxins that may contribute to hypotensive effects along with many other toxin groups, like SVSP [73,95,143,192–194], and highlights the importance of characterizing toxin components and their associations [69]. With an increased number of characterization studies, novel families may also be correctly added to the venom profiles, such as three-finger toxins (3FTx), which are often present in elapids and occasionally in rattlesnake genomes and transcriptomes [14–16,169,171]. However, attention should be paid to developing venom profiles for understudied genera (Sistrurus) or species. Additionally, this work addresses a problem with the conventional classification of venom toxins as major or minor based on importance and ubiquity, under which the major toxins are often MYO, PLA2, SVMP, and SVSP [13,128]: it directs much more attention to these dominant toxin families while other protein families, such as growth factors, are overlooked. Therefore, we highlight the importance of studying venom components not only as individual components but also in terms of the relationships between them. We propose using a combination of a toxin's characteristics, such as its ubiquity, bioactivity, and associations with other toxin families, to classify venom components as major or minor.

Materials and Methods
We collected articles and abstracts on venom for each Crotalus and Sistrurus species through the following: databases (PubMed, ScienceDirect, Scopus, Google Scholar, Web of Science), journal databases (BMC Genomics, Journal of Proteome Research, Journal of Proteomics, Toxicon, Toxicology, Toxins), and publisher databases (Wiley Online Library, MDPI, Elsevier). We used "venom" OR "proteomic" OR "venomic" OR "transcriptome" OR "proteome" AND "name of the species" as keywords for conducting our search. We also examined the references in studies produced from the search results for any additional information. The collected records ranged from the earliest obtainable records to those published in January 2020.
From the collected records, any article that did not contain information regarding the venom composition and components of any Crotalus or Sistrurus species was not used in the current analysis. Otherwise, the full-text version of the article was further assessed with the following inclusion and exclusion criteria.
For the article to be included in the current analysis, it had to fulfill one of the following inclusion criteria: (1) report proteome or transcriptome profile of the venom of any corresponding species; (2) report at least 1 toxin family/component, which is not artificially synthesized based on another similar toxin component; (3) be a comparative study reporting transcriptome/proteome profile for Crotalus, Sistrurus species/subspecies; (4) studies that report variability in venom components for any Crotalus, Sistrurus species/subspecies.
The following exclusion criteria were used to exclude any study from the current analysis: (1) reviews that focus on toxin families and/or articles focuses on the genomic evolution of toxin families; (2) articles with no transcriptome/proteome profiles; (3) articles with no data on toxin family isolated from venom; (4) articles that focus on new artificially synthesized molecules, based on similar toxin component or recombinant protein/peptides in venom; (5) articles reporting methods to inactivate toxin family from rattlesnakes; (6) case study on rattlesnakes' bites; (7) studies describing methods to detect toxin families/components. From the studies that fulfilled our inclusion criteria and did not meet any exclusion criteria, we collected and compiled all venom constituents that are reported for each species in the genus Crotalus and Sistrurus in Tables 1 and 4 respectively. The compiled data were cross-checked by authors for correctness and confirmations.
Using the data from Tables 1 and 4, we performed two separate frequent item-set data mining analyses for Crotalus and Sistrurus venoms. We conducted one analysis using presence-absence data and a separate analysis using relative abundance values. In the analysis using relative abundance values, when more than one relative abundance value was reported for a particular protein, we used the maximum reported value. There was no major difference in the results when we used the maximum reported values versus the average of all reported values for a particular protein. For all values reported as below the limit of detection, we used the limit of detection as the value for that particular component [210]. Frequent item-set data mining helps identify the association rules governing the expression of different proteins in venom. Studies on Sistrurus venom components are sparse, which can introduce bias into the data-mining analysis. The rules specify the confidence, lift, and support for specific proteins to occur together in venom. Support is defined as absolute frequency, i.e., a support of 25% means that venom components x, y, and z occur together in 25% of all venoms. Confidence is correlative frequency, i.e., a confidence of 60% means that if x and y occur, then z will also occur 60% of the time. Lift signifies the likelihood of y occurring when x occurs, while taking into account how often venom component y occurs across different species. An association rule is valid only if the lift is greater than 1. The higher the value of the lift, the higher the validity of the rule. Since many studies of rattlesnake venom concentrated on highly abundant species or species containing more "major components", the resulting null values affect the performance of the statistical models.
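The definitions of support, confidence, and lift given above can be illustrated with a minimal Python sketch. This is not the code used in the study, which was carried out in R. The species names and the presence-absence sets below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical presence/absence data: each venom is represented by the set
# of toxin families detected in it. Values are illustrative only.
venoms = {
    "species_A": {"SVMP", "SVSP", "PLA2", "CTL"},
    "species_B": {"SVMP", "SVSP", "PLA2"},
    "species_C": {"SVMP", "PLA2", "CTL"},
    "species_D": {"SVSP", "CTL"},
}


def support(itemset, transactions):
    """Fraction of venoms that contain every family in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for families in transactions.values() if itemset <= families)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among venoms containing the antecedent, the fraction that also
    contain the consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the data")
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the overall frequency of the
    consequent; values above 1 indicate a positive association."""
    base = support(consequent, transactions)
    if base == 0:
        raise ValueError("consequent never occurs in the data")
    return confidence(antecedent, consequent, transactions) / base


# Rule {SVMP} -> {PLA2}: lift above 1, so the rule would be retained.
print(lift({"SVMP"}, {"PLA2"}, venoms))
# Rule {SVSP} -> {CTL}: lift below 1, so the rule would be discarded.
print(lift({"SVSP"}, {"CTL"}, venoms))
```

In this toy data set, SVMP and PLA2 always occur together, so the rule {SVMP} -> {PLA2} has a lift above 1, whereas CTL is slightly less frequent among SVSP-containing venoms than overall, so {SVSP} -> {CTL} has a lift below 1.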
For the analysis using only the presence-absence data for toxin families from individual studies, the chance of bias from individual studies affecting our results was low. As more data on venom composition and variation become available, the associations produced by frequent item-set data-mining analysis will become more informative. Using the relative abundance data of venom components from Tables 1 and 4, we performed hierarchical clustering for both Crotalus and Sistrurus species. For species with multiple values reported for the same venom component, we used the maximum of all reported values in our analysis. All analyses were performed using the software R (R Core Team, Vienna, Austria, 2019).
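The two steps described above, taking the maximum reported abundance per component and then clustering species by their abundance profiles, can be sketched in Python as follows. This is only an illustration under stated assumptions: the study used R, and the text does not specify the distance measure or linkage method, so Euclidean distance and average linkage are assumed here. The records and species names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (species, toxin family, relative abundance in %).
records = [
    ("species_A", "SVMP", 40.0), ("species_A", "SVMP", 55.0),
    ("species_A", "PLA2", 10.0),
    ("species_B", "SVMP", 50.0), ("species_B", "PLA2", 12.0),
    ("species_C", "SVMP", 5.0), ("species_C", "PLA2", 70.0),
]


def max_abundance(records):
    """Keep the maximum reported value per species and toxin family."""
    profiles = defaultdict(dict)
    for species, family, value in records:
        previous = profiles[species].get(family, 0.0)
        profiles[species][family] = max(previous, value)
    return dict(profiles)


def euclidean(p, q, families):
    """Euclidean distance between two profiles; missing families count as 0."""
    return sum((p.get(f, 0.0) - q.get(f, 0.0)) ** 2 for f in families) ** 0.5


def average_linkage(profiles):
    """Agglomerative clustering with average linkage (an assumption, not the
    study's documented choice). Returns the merges in the order performed."""
    families = sorted({f for p in profiles.values() for f in p})
    clusters = [(name,) for name in sorted(profiles)]
    merges = []

    def dist(a, b):
        # Mean pairwise distance between members of clusters a and b.
        pairs = [(x, y) for x in a for y in b]
        total = sum(euclidean(profiles[x], profiles[y], families) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j], dist(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


profiles = max_abundance(records)
merges = average_linkage(profiles)
for left, right, distance in merges:
    print(left, right, round(distance, 2))
```

With these toy values, the two SVMP-rich profiles are merged first and the PLA2-rich profile joins last, which mirrors the observation that species group by venom composition.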

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/toxins13060372/s1, Table S1: Depictions of full association rules between proteins expressed in Crotalus venom using presence/absence data. Table S2: Depictions of all association rules between proteins expressed in Crotalus venom using relative abundance data.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: 3FTx Three-finger toxin 5 -NT