Article

Analysis of Gut Microbiome Structure Based on GMPR+Spectrum

Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(12), 5895; https://doi.org/10.3390/app12125895
Submission received: 17 April 2022 / Revised: 2 June 2022 / Accepted: 6 June 2022 / Published: 9 June 2022
(This article belongs to the Section Applied Biosciences and Bioengineering)

Abstract

The gut microbiome is related to many major human diseases, so studying its structure under different conditions is of great significance. Multivariate statistics and pattern recognition methods are often used to identify structural patterns in gut microbiome data, but these methods have limitations. Minimal hepatic encephalopathy (MHE) datasets were taken as an example. Because of the physical absence or insufficient sampling of microbes during sequencing, microbiome data contain many zeros. Therefore, the geometric mean of pairwise ratios (GMPR) was used to normalize the gut microbiome data, Spectrum was then used to analyze the structure of the gut microbiome, and lastly the structure of the core microflora was compared with that obtained by network analysis. Measured by the intraclass correlation coefficient (ICC), the reproducibility of GMPR was significantly better than that of other normalization methods. In addition, the running time, Normalized Mutual Information (NMI), Davies-Bouldin Index (DBI), and Calinski-Harabasz index (CH) of GMPR+Spectrum were far superior to those of other clustering algorithms such as M3C and iClusterPlus. GMPR+Spectrum not only performs better but also effectively identifies the structural differences of the intestinal microbiota in different patients and uncovers unique critical bacteria in MHE patients, such as Akkermansia and Lactobacillus, which may provide a new reference for the study of the gut microbiome in disease.

1. Introduction

The gut microbiome is associated with many major human diseases, such as obesity, diabetes, cirrhosis, autism, allergies, inflammatory bowel diseases, cardiovascular diseases, multiple cancers, and depression. Therefore, the gut microbiome may become a new target for interventional therapies and play an essential role in diagnosing, analyzing, and treating these diseases [1]. Microbiome studies are used extensively to analyze the composition and diversity of microbial communities. They address one of the fundamental questions of microbial ecology: how many taxa or OTUs (operational taxonomic units) exist? Usually, multivariate statistical or pattern recognition methods are used to identify structural patterns in microbial data, such as principal component analysis (PCA) [2,3,4], principal coordinate analysis (PCoA) [5,6,7], and partitioning around medoids (PAM) clustering [8,9]. However, these standard multivariate techniques are not well suited to highly diverse microbial data [10]. On the one hand, highly diverse microbial data sets tend to be sparse; on the other hand, most taxonomic units occur in only a few samples and at low abundance. In addition, microbial samples differ in sequencing depth, and small samples are inherently noisier than large ones.
Microbiome sequencing data contain many zeros because of physical absence or under-sampling of microbes during sequencing. The complex processes involved in sequencing cause the sequencing depth to vary between samples, sometimes by several orders of magnitude. The sequenced intestinal flora data are therefore characterized by a large data volume, a large number of OTUs, and a sparse distribution [11]. Normalization aims to correct or reduce the bias caused by differing sequencing depth and is an essential pre-processing step before any downstream statistical analysis of high-throughput sequencing experiments [12,13]. Several normalization methods are commonly used for sequencing data, especially for RNA-Seq data [12,13]. Besides size-factor-based methods such as the geometric mean of pairwise ratios (GMPR), the trimmed mean of M values (TMM), and relative log expression (RLE), another popular approach to normalizing microbiome data is rarefaction. Each approach has advantages and disadvantages in specific applications. Rarefaction, for example, discards a large number of reads, so it may not be optimal from an information point of view and suffers from a significant loss of power [14]. However, it is still widely used for microbiome data analysis, especially for α and β diversity analysis. Size factors, by contrast, can be included as offsets in a count-based parametric model to address uneven sequencing depth [15]. Among the size-factor methods, GMPR consistently performed best at reducing the variability of OTUs at different prevalence levels and increased the reproducibility of normalized taxon abundances among replicates [16]. In addition, GMPR-normalized abundance data have been studied with distance-based (weighted) statistical methods such as ordination, clustering, and PERMANOVA [17,18].
Clustering analysis plays an essential role in data mining and has many applications in image processing, data analysis, market research, pattern recognition, and other fields [19,20,21]. In recent years, spectral clustering has become one of the most widely used clustering algorithms [22]. Compared with traditional clustering methods, it adapts better to the data distribution, especially for data sets with different densities, arbitrary complex shapes, and varying sizes, and it is less computationally intensive and simpler to implement. Spectrum [23], the method used in this paper, enhances the similarity between points that share nearest neighbors by using a self-adjusting density-aware kernel. Its data integration and diffusion process, based on a tensor product graph, reduces noise and reveals the underlying structure, and the optimal number of clusters K is found automatically by analyzing the eigenvector distribution. The algorithm can find clusters of arbitrary shape, handles noisy data efficiently, finds K automatically for both Gaussian and non-Gaussian structures, runs quickly, and gives good clustering results on large data sets.
In this paper, we first normalized the gut flora data using the GMPR algorithm [16] and then analyzed them using Spectrum. The gut microbial datasets of patients with minimal hepatic encephalopathy, patients with hepatic encephalopathy, and healthy controls were used as examples. Minimal hepatic encephalopathy (MHE) is a very insidious stage in the pathogenesis of hepatic encephalopathy (HE), and studies [24,25,26,27,28,29] have shown that the prevalence of MHE reaches 20–80% in patients with cirrhosis. MHE is a common complication of liver disease, typically characterized by altered neurocognitive function [30,31,32,33], with an unnoticeable onset and none of the obvious clinical manifestations of HE. The cognitive dysfunction caused by MHE can consume substantial medical resources and impose a great financial burden on patients and their families. Because of the high prevalence of MHE, its harmful effects, and the complexity of its clinical diagnosis, increasing clinical attention has been paid to early screening and diagnosis of MHE. In addition, a growing number of studies [34,35] have shown that dysbiosis of the gut microflora is associated with MHE and with the occurrence of HE. Since the traditional screening and diagnostic methods used in clinical practice are time-consuming and subject to human factors, it is essential to identify structural changes in the gut flora data of MHE and HE patients.

2. Materials and Methods

2.1. Materials

The datasets used in this paper were obtained from 77 samples collected at the Department of Gastroenterology of the First People’s Hospital of Yunnan Province, including 26 patients with minimal hepatic encephalopathy (abbreviated as M), 25 patients with hepatic encephalopathy (abbreviated as H), and 26 healthy controls (abbreviated as N). The data collection process was as follows: (1) fresh stool was sampled from the subjects; (2) the samples were stored in liquid nitrogen within 2 h; (3) the samples were stored in a −80 °C refrigerator; (4) fecal microbial DNA was extracted by the kit method [36], and 16S rRNA high-throughput sequencing was completed according to standard operating instructions [37]; (5) after the original sequences were spliced, quality control was performed, representative sequences (OTUs) were selected and clustered, and species annotation was performed; (6) the OTU count table was obtained. The data collection in this study was approved by the ethics committee of the First People’s Hospital of Yunnan Province, and all subjects signed an informed consent form. Table A1 shows part of the OTU count table after sequencing, where rows (OTU_0, OTU_1, OTU_2, …) represent OTUs and columns (H1, H2, H3, …) represent the sample IDs of patients with hepatic encephalopathy. The data used in the experiments were absolute abundance data.

2.2. Methods

The flow chart of the flora data processing in this study is shown in Figure 1. The experimental data were first normalized by GMPR and then clustered using the Spectrum algorithm. The result was compared with Spectrum without GMPR, M3C [38], and iClusterPlus [39] in terms of Normalized Mutual Information (NMI), Davies-Bouldin Index (DBI), Calinski-Harabasz Index (CH), and running time. Finally, the core flora were compared with those identified by the network analysis method.

2.2.1. Geometric Mean of Pairwise Ratios (GMPR)

GMPR [16] is a normalization method designed specifically for zero-inflated data. In principle, it can be applied to any sequencing data. It mainly addresses situations in which the data contain many zeros and the sequencing depth differs between samples because of physical absence or insufficient sampling of microbes.
The OTU count table in this paper contains absolute abundances for 77 samples and 1442 OTUs. GMPR calculates a size factor for each sample, and the size factor serves as an estimate of the library size of that sample. The calculation has two steps.
The first step is to calculate $r_{ij}$, the median count ratio of the non-zero counts between sample $i$ and sample $j$:
$$r_{ij} = \underset{k \in \{1, \ldots, q\} \,\mid\, c_{ki} \cdot c_{kj} \neq 0}{\mathrm{Median}} \left\{ \frac{c_{ki}}{c_{kj}} \right\}$$
where $k$ indexes the OTUs, $q$ is the number of OTUs, and $c_{ki}$ and $c_{kj}$ are the abundances of the $k$th OTU in sample $i$ and sample $j$.
The second step is to calculate the size factor $s_i$ of a given sample $i$ as the geometric mean of the $r_{ij}$ over the $n$ samples:
$$s_i = \left( \prod_{j=1}^{n} r_{ij} \right)^{1/n}, \quad i = 1, \ldots, n$$
In short, GMPR first compares each pair of samples in the OTU count table and then combines the pairwise comparison results to obtain the final estimate.
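The two steps can be sketched in a few lines of Python. This is a minimal illustration rather than the reference implementation of GMPR [16]: the function name `gmpr_size_factors` and the toy count table are hypothetical, and skipping sample pairs that share no non-zero OTU is an assumption of this sketch.

```python
import math
from statistics import median


def gmpr_size_factors(counts):
    """Return one size factor per sample.

    counts[k][i] is the count of OTU k in sample i (rows: OTUs, columns: samples).
    """
    n_samples = len(counts[0])
    factors = []
    for i in range(n_samples):
        log_ratios = []
        for j in range(n_samples):
            # Step 1: use only OTUs with non-zero counts in BOTH samples.
            ratios = [row[i] / row[j] for row in counts if row[i] > 0 and row[j] > 0]
            if ratios:  # assumption: pairs sharing no OTU are skipped
                log_ratios.append(math.log(median(ratios)))
        # Step 2: geometric mean of the pairwise median ratios r_ij.
        factors.append(math.exp(sum(log_ratios) / len(log_ratios)))
    return factors


# Toy table: sample 1 is sequenced twice as deeply as sample 0, and OTU 1 is a zero in sample 0.
toy_counts = [[10, 20], [0, 4], [6, 12]]
size_factors = gmpr_size_factors(toy_counts)
```

In the toy table the zero count is simply ignored in the pairwise comparison, and the ratio of the two size factors recovers the twofold difference in depth.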

2.2.2. Other Normalization Methods

Two popular normalization methods for RNA-Seq data are the trimmed mean of M values (TMM) [40] and relative log expression (RLE) [41]. The TMM method first selects a reference sample, and all other samples are compared with it. The trimmed (weighted) mean of the log ratios is then taken as the TMM size factor (on the log scale). The RLE method calculates the geometric mean of each feature across all samples as a reference, and every sample is compared with the reference to produce ratios (fold changes) for all features. The median ratio is then taken as the RLE size factor.
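The RLE description can be illustrated with a short sketch that also shows why zeros are a problem for it. This is a from-scratch illustration, not the code of any RNA-Seq package; the function name `rle_size_factors` is hypothetical, and dropping every OTU that has a zero in any sample (so that the geometric mean is defined) is an assumption of this sketch.

```python
import math
from statistics import median


def rle_size_factors(counts):
    """counts[k][i] is the count of OTU k in sample i."""
    n_samples = len(counts[0])
    reference = []
    for row in counts:
        # The geometric mean is only defined when every count is positive.
        if all(c > 0 for c in row):
            geo_mean = math.exp(sum(math.log(c) for c in row) / n_samples)
            reference.append((row, geo_mean))
    if not reference:
        raise ValueError("no OTU is non-zero in every sample")
    # Size factor of sample i: median ratio of its counts to the reference.
    return [median(row[i] / g for row, g in reference) for i in range(n_samples)]


rle_factors = rle_size_factors([[10, 20], [0, 4], [6, 12]])
```

With a sparse table in which no OTU is present in every sample, the reference is empty and no size factor can be computed, which is the situation pairwise comparison is meant to avoid.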

2.2.3. Spectrum Algorithm

Spectrum [23] is a recent spectral clustering method. Its idea is to treat the data analysis problem as an optimal graph partitioning problem, in which all OTUs are viewed as vertices and the vertices are connected by weighted edges. The edge weights can be regarded as the similarity between OTUs. The key to the algorithm is that a self-adjusting density-aware kernel is used to construct the similarity matrix, which further enhances the similarity between nearest neighbors while reducing noise. Spectrum can find the optimal number of clusters (K) from the distribution of the eigenvectors, for both Gaussian and non-Gaussian structures [23].
The similarity matrix $A^*$ is computed with the adaptive density-aware kernel. Starting from $A^*$, Spectrum applies the Ng spectral clustering method, and the number of clusters is estimated with an eigenvalue heuristic. Finally, the eigenvector matrix is clustered with Gaussian mixture modeling (GMM) to obtain the final output, i.e., a partition of the OTUs into clusters.
  1. The adaptive density-aware kernel is used to compute the similarity matrix between OTUs:
$$A_{ij} = \exp\left( -\frac{d^2(s_i, s_j)}{\sigma_i \sigma_j \left( \mathrm{CNN}(s_i s_j) + 1 \right)} \right)$$
where $d(s_i, s_j)$ is the Euclidean distance between points $s_i$ and $s_j$, $\sigma_i$ and $\sigma_j$ are local scaling parameters, $\mathrm{CNN}(s_i s_j)$ is the number of points in the joint region of the $\varepsilon$-neighborhoods around $s_i$ and $s_j$, and the $\varepsilon$-neighborhood of a point is the sphere of radius $\varepsilon$ around it.
  2. The diagonal matrix $D$ is obtained from $A^*$: its $(i,i)$ element is the sum of the $i$th row of $A^*$. The normalized Laplacian matrix $L$ is then constructed using $D$:
$$L = D^{-1/2} A^* D^{-1/2}$$
  3. Perform the eigendecomposition of $L$ and extract its eigenvectors $x_1, x_2, \ldots, x_{N+1}$ and eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_{N+1}$.
  4. Evaluate the difference between consecutive eigenvalues, starting with the second eigenvalue, i.e., $n = 2$, and choose the optimal $k$, denoted $k^*$, for which this difference is maximized:
$$k^* = \underset{n}{\arg\max} \left( \lambda_n - \lambda_{n+1} \right)$$
  5. Take the $k^*$ largest eigenvectors and form the matrix $X = [x_1, x_2, \ldots, x_{k^*}] \in \mathbb{R}^{N \times k^*}$ by arranging the eigenvectors in columns, which gives $N$ vectors in a $k^*$-dimensional space.
  6. Form the matrix $Y$ from $X$ by renormalizing each row of $X$ to unit length:
$$Y_{ij} = \frac{X_{ij}}{\left( \sum_{j} X_{ij}^2 \right)^{1/2}}$$
  7. Finally, each row of $Y$ is treated as one OTU $s_i$, and all OTUs are clustered into $k^*$ clusters using GMM. The class labels obtained are the class labels of the original OTUs.
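Several of these steps are simple enough to sketch directly. The following is a minimal illustration of steps 1, 2, 4, and 6 only, assuming that the distance matrix `d`, the local scales `sigma`, and the shared-neighbor counts `cnn` have already been computed; all function names are hypothetical, this is not the code of the Spectrum package [23], and the eigendecomposition (step 3) and GMM clustering (step 7) are left to a numerical library.

```python
import math


def density_aware_affinity(d, sigma, cnn):
    """Step 1: A_ij = exp(-d_ij^2 / (sigma_i * sigma_j * (CNN_ij + 1)))."""
    n = len(d)
    return [[math.exp(-d[i][j] ** 2 / (sigma[i] * sigma[j] * (cnn[i][j] + 1)))
             for j in range(n)] for i in range(n)]


def normalized_laplacian(A):
    """Step 2: L = D^(-1/2) A D^(-1/2), where D holds the row sums of A."""
    n = len(A)
    deg = [sum(row) for row in A]
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def choose_k(eigenvalues):
    """Step 4: eigengap heuristic on eigenvalues sorted in decreasing order.

    eigenvalues[n - 1] holds lambda_n; the search starts at n = 2.
    """
    gaps = {n: eigenvalues[n - 1] - eigenvalues[n] for n in range(2, len(eigenvalues))}
    return max(gaps, key=gaps.get)


def normalize_rows(X):
    """Step 6: rescale each row of X to unit Euclidean length."""
    Y = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row))
        Y.append([v / norm for v in row] if norm > 0 else list(row))
    return Y


# Two points at distance 1 that share three neighbors are more similar than
# two points at the same distance that share none.
A_shared = density_aware_affinity([[0, 1], [1, 0]], [1, 1], [[0, 3], [3, 0]])
A_plain = density_aware_affinity([[0, 1], [1, 0]], [1, 1], [[0, 0], [0, 0]])
k_star = choose_k([1.0, 0.98, 0.95, 0.40, 0.35])
```

In the toy eigenvalue list the largest drop occurs between the third and fourth eigenvalues, so the heuristic selects three clusters.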

2.2.4. Monte Carlo Reference-Based Consensus Clustering (M3C)

Clustering algorithms are applied to genome-wide expression data to stratify patients for precision medicine. The Monti consensus clustering algorithm [42], a widely used method, determines the number of clusters (K) by the stability selection principle. For each K, the algorithm repeatedly resamples and clusters the data and calculates an N × N consensus matrix, in which each element represents the proportion of resampling iterations in which two samples are clustered together. A perfectly stable matrix consists entirely of zeros and ones, meaning that every pair of samples is either always or never clustered together across the resampling iterations. The next step is to compare the stability of these consensus matrices to determine K. The proportion of ambiguous clustering (PAC) score [43] is used to evaluate the stability of the consensus matrix for each K; however, it is biased towards larger values of K. Another widely used metric, delta K, is more subjective because it relies on finding an elbow point, and it does not perform as well as the PAC score. Monte Carlo Reference-based Consensus Clustering (M3C) [38] addresses these issues by comparing the real stability scores with the scores expected under a random model. M3C uses Monte Carlo simulation to generate a reference distribution of stability scores along a range of K and compares it with the real stability scores to determine the optimal K and to test the null hypothesis K = 1.
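As a small illustration of the stability score, the PAC of a consensus matrix can be computed as the fraction of sample pairs whose consensus value is neither close to 0 nor close to 1. The sketch below is not the M3C implementation; the function name `pac_score` is hypothetical, and the interval bounds 0.1 and 0.9 are commonly used defaults taken here as an assumption.

```python
def pac_score(consensus, lower=0.1, upper=0.9):
    """Fraction of sample pairs with an ambiguous consensus value.

    consensus is a symmetric N x N matrix of co-clustering proportions.
    A pair counts as ambiguous when lower < value <= upper.
    """
    n = len(consensus)
    pairs = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    ambiguous = sum(1 for v in pairs if lower < v <= upper)
    return ambiguous / len(pairs)


# Samples 0 and 1 always co-cluster, 1 and 2 never do, 0 and 2 do so half the time.
toy_consensus = [[1.0, 1.0, 0.5],
                 [1.0, 1.0, 0.0],
                 [0.5, 0.0, 1.0]]
toy_pac = pac_score(toy_consensus)
```

A perfectly stable matrix of zeros and ones gives a PAC of 0, and lower values indicate a more stable choice of K.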

2.2.5. iClusterPlus

iClusterPlus was developed for integrative cluster analysis of multi-type genomic data [39], such as array comparative genomic hybridization (aCGH), gene expression microarray, RNA-seq, and DNA-seq data. To search for the best model, iClusterPlus samples a range of lambda values from the parameter space based on a uniform design [44]. The number of points to sample (n.lambda) depends on the number of data types. If the number of clusters in the samples is known, the corresponding k (the number of latent variables) can be selected directly for cluster analysis. If the number of clusters is not known in advance, k can be tested from 1 to N (a reasonable number of clusters). For each k, the Bayesian information criterion (BIC) is used to select the best sparse model with the best combination of penalty parameters. To choose the best k, the deviance ratio is calculated, i.e., (log-likelihood (fitted model) − log-likelihood (null model)) divided by (log-likelihood (full model) − log-likelihood (null model)). The deviance ratio can be interpreted as the percentage of explained variation (%EV). The k at which the %EV curve plateaus is chosen as optimal.

2.2.6. Network Analysis

Microbial networks are an increasingly popular tool for studying the structure of microbial communities, as they integrate multiple types of information and may represent system-level behavior. The analysis of microbial networks allows one to predict pivotal species and inter-species interactions. In recent years, various network methods have been successfully applied in different biological contexts. Among them, the correlation-based association network approach is the most commonly used method for analyzing microbial interaction networks due to the simplicity and robustness of the computational process. Network analysis in some disciplines, especially medical-related ones, provides more options for further data mining and analysis. Therefore, we used network analysis methods to validate the identified flora’s reliability further.
The network analysis method is based on the mathematical concept of a graph. The microbial interaction network is constructed from the Pearson correlations between all OTUs, and different correlation coefficients represent differences in the relationships between OTUs. Each network node corresponds to one OTU, i.e., one bacterial species, and the edges between species are determined by the pairwise Pearson correlations, i.e., by a significant correlation between one bacterium and another. Ju et al. [45] ranked all nodes in the network by degree from highest to lowest and selected the top ten nodes as the core nodes; the nodes of the core module represent the core species in the global network. Therefore, the top ten OTU nodes ranked by within-module connectivity (Zi) were selected as the core nodes in this study. These nodes represent the key species that may play an essential role in maintaining the structural stability of the microbial community, i.e., the core gut flora.
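The construction of a correlation network and the degree ranking attributed to Ju et al. [45] can be sketched as follows. This is only an illustration: the function names and the toy abundances are hypothetical, and an absolute-correlation cutoff of 0.6 stands in for the significance test, which the text does not specify. The within-module connectivity (Zi) used in this study is not computed here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def core_nodes_by_degree(abundance, threshold=0.6, top=10):
    """abundance maps an OTU id to its abundances across samples.

    An edge joins two OTUs when |r| >= threshold (assumed cutoff).
    Returns the top-ranked OTU ids and the degree of every node.
    """
    ids = list(abundance)
    degree = {otu: 0 for otu in ids}
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            if abs(pearson(abundance[ids[a]], abundance[ids[b]])) >= threshold:
                degree[ids[a]] += 1
                degree[ids[b]] += 1
    ranked = sorted(ids, key=lambda otu: degree[otu], reverse=True)
    return ranked[:top], degree


toy_abundance = {
    "OTU_0": [1, 2, 3, 4],
    "OTU_1": [2, 4, 6, 8],
    "OTU_2": [4, 3, 2, 1],
    "OTU_3": [1, 5, 1, 5],
}
core, degree = core_nodes_by_degree(toy_abundance, top=3)
```

In the toy data the first three OTUs are perfectly (positively or negatively) correlated with one another and form the core, while the fourth is only weakly correlated with them and remains isolated.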

2.2.7. Evaluation Index of Normalization Algorithm

The intraclass correlation coefficient (ICC) [46] is often used to evaluate the reproducibility or consistency of different measurement methods or raters for the same quantitative measurement.
ICC is defined as:
$$\mathrm{ICC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2}$$
where $\sigma_b^2$ represents the data variability between different normalization methods for the same sample type and $\sigma_\varepsilon^2$ represents the variability between different sample types. The ICC was calculated for four types of sample data (“all samples”, “M”, “H”, and “N”) and was estimated with the R package “ICC”; a value close to 1 indicates better reproducibility of the method.
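To make the ratio concrete, the sketch below estimates the two variance components from a balanced one-way layout and plugs them into the ICC formula. It is a stand-in for illustration only: the study used the R package “ICC”, the function name `icc_oneway` is hypothetical, and the use of the one-way ANOVA estimator with equally sized groups is an assumption of this sketch.

```python
def icc_oneway(groups):
    """ICC from a balanced one-way ANOVA.

    groups is a list of equally sized lists; each inner list holds the
    repeated measurements of one unit.
    """
    n, k = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    # Mean squares between and within groups.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))
    var_b = (msb - msw) / k  # estimate of sigma_b^2
    var_e = msw              # estimate of sigma_epsilon^2
    return var_b / (var_b + var_e)


icc_perfect = icc_oneway([[1, 1], [2, 2], [3, 3]])
icc_noisy = icc_oneway([[1, 2], [3, 4]])
```

When the repeated measurements of every unit agree exactly, the within-group variance is zero and the estimate equals 1; disagreement between repeats pulls it below 1.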

2.2.8. Evaluation Index of Clustering Algorithm

In this paper, NMI, DBI, CH, and running time are employed to evaluate the performance of the clustering algorithm. These metrics are defined and formulated as follows:
  1. Normalized Mutual Information (NMI)
NMI [47] is a common measure of clustering quality; a larger NMI value means better performance. Let the joint distribution of random variables X and Y be p(x,y), with marginal distributions p(x) and p(y). The mutual information I(X,Y) is the relative entropy between the joint distribution p(x,y) and the product distribution p(x)p(y):
$$I(X, Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$$
The entropy H(X) is
$$H(X) = \sum_{i=1}^{n} p(x_i)\, I(x_i) = \sum_{i=1}^{n} p(x_i) \log_b \frac{1}{p(x_i)} = -\sum_{i=1}^{n} p(x_i) \log_b p(x_i)$$
and NMI is
$$NMI(X, Y) = 2R = \frac{2\, I(X, Y)}{H(X) + H(Y)}$$
  2. Davies-Bouldin Index (DBI)
DBI, also known as the classification appropriateness index [48], is obtained by taking, for each cluster $C_i$, the maximum over the other clusters $C_j$ of the sum of the average within-cluster distances $avg(C_i)$ and $avg(C_j)$ divided by the distance between the centroids of the two clusters, and then averaging these maxima over all clusters. The larger the inter-class distance, the better the clustering effect, so a smaller DBI indicates better clustering.
$$avg(C) = \frac{2}{|C| (|C| - 1)} \sum_{1 \le i < j \le |C|} dist(x_i, x_j)$$
$$DBI = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \left( \frac{avg(C_i) + avg(C_j)}{dist(u_i, u_j)} \right)$$
where $avg(C)$ is the average distance between the samples within cluster $C$, $|C|$ is the number of samples in cluster $C$, $dist(x_i, x_j)$ is the distance between two samples $x_i$ and $x_j$, and $u_i$ and $u_j$ are the centers of clusters $C_i$ and $C_j$, respectively.
  3. Calinski-Harabasz index (CH)
The CH index is the ratio of inter-cluster distance to intra-cluster distance [49]. The larger the value of CH(K), the better the clustering effect. The formula is as follows:
$$CH(K) = \frac{tr(B) / (K - 1)}{tr(W) / (N - K)}$$
where $tr(B) = \sum_{j=1}^{K} \lVert z_j - z \rVert^2$ is the trace of the inter-cluster scatter matrix, $tr(W) = \sum_{j=1}^{K} \sum_{x_i \in c_j} \lVert x_i - z_j \rVert^2$ is the trace of the intra-cluster scatter matrix, $z$ is the mean of the whole data set, $z_j$ is the mean of the $j$th cluster $c_j$, $N$ is the number of samples, and $K$ is the number of clusters.
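The three indices can be computed directly from these definitions. The sketch below follows the formulas as they are written in this section, which is an assumption worth stating: common library implementations define the within-cluster term of DBI as the mean distance to the centroid and weight each term of tr(B) by the cluster size, so their values can differ. All function names are hypothetical, and the optional `weight_by_size` flag is added here only to show the size-weighted variant of CH.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """NMI = 2 I(X, Y) / (H(X) + H(Y)) for two label assignments."""
    n = len(labels_a)
    pa, pb = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum(c / n * math.log((c / n) / (pa[a] / n * pb[b] / n))
             for (a, b), c in joint.items())

    def entropy(counter):
        return -sum(c / n * math.log(c / n) for c in counter.values())

    denom = entropy(pa) + entropy(pb)
    return 2 * mi / denom if denom > 0 else 1.0


def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]


def avg_within(points):
    """Average pairwise distance between the samples of one cluster."""
    m = len(points)
    if m < 2:
        return 0.0
    total = sum(dist(points[i], points[j]) for i in range(m) for j in range(i + 1, m))
    return 2 * total / (m * (m - 1))


def dbi(clusters):
    """clusters is a list of clusters, each a list of points."""
    k = len(clusters)
    cents = [centroid(c) for c in clusters]
    avgs = [avg_within(c) for c in clusters]
    return sum(max((avgs[i] + avgs[j]) / dist(cents[i], cents[j])
                   for j in range(k) if j != i) for i in range(k)) / k


def ch(clusters, weight_by_size=False):
    points = [p for c in clusters for p in c]
    n, k = len(points), len(clusters)
    z = centroid(points)
    tr_b = tr_w = 0.0
    for c in clusters:
        zj = centroid(c)
        tr_b += (len(c) if weight_by_size else 1) * dist(zj, z) ** 2
        tr_w += sum(dist(x, zj) ** 2 for x in c)
    return (tr_b / (k - 1)) / (tr_w / (n - k))


toy_clusters = [[[0, 0], [0, 2]], [[10, 0], [10, 2]]]
toy_dbi = dbi(toy_clusters)
toy_ch = ch(toy_clusters)
```

For the two tight, well-separated toy clusters, DBI is small and CH is large, which is the direction that indicates good clustering for each index.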

3. Results

3.1. Reproducibility of GMPR

Five normalization methods, namely GMPR, TMM, TMM+ (TMM with an added pseudocount) [50], RLE, and RLE+ (RLE with an added pseudocount) [50], were used to preprocess four types of data: “all samples”, “H”, “N”, and “M”. Figure 2 shows that, for every normalization method, the ICC of “all samples” is larger than that of “H”, “N”, and “M”, i.e., the all-samples data achieve higher reproducibility under all the normalization methods applied. Because “all samples” has the largest sample size of the four data types, this suggests that reproducibility decreases as the number of samples decreases. For every sample type, the ICC of GMPR is higher than that of the other normalization methods. This indicates that GMPR is more robust and more reproducible than the other normalization methods.

3.2. Cluster Number

To verify the performance of the method, all samples were subjected to GMPR+Spectrum clustering. The algorithm performs an eigendecomposition of the constructed Laplacian matrix, solves for the eigenvectors and eigenvalues, and chooses the number of clusters that maximizes the difference between two neighboring eigenvalues [23]. As shown in Figure 3, the optimal number of clusters for all samples is 8.
GMPR+Spectrum clustering was also performed separately for the M, H, and N groups to analyze them further. As shown in Figure 4, the optimal number of clusters for the chronic cirrhotic patients is 2. The clustering of the remaining two groups is similar to Figure 4. We thus obtained the optimal number of clusters for each of the three groups.

3.3. Clustering Evaluation Indicators

GMPR+Spectrum classified the all-samples data for 1442 OTUs into 8 classes, with an NMI of 0.3641, a DBI of 4.2359, a CH of 24.4724, and a running time of 26.75 s. Spectrum without GMPR divided these data into 3 classes, and all of its metrics except DBI are lower than those of GMPR+Spectrum. In addition, as shown in Table 1, the performance of M3C and iClusterPlus is inferior to that of GMPR+Spectrum. The clustering evaluation indicators for the N, H, and M groups are shown in Table A2.

3.4. Core Microflora by GMPR+Spectrum (Genus)

The all-samples dataset was clustered into 8 classes using GMPR+Spectrum. The OTUs of Cluster1 to Cluster8 contain 24, 31, 54, 38, 25, 21, 18, and 30 different genera, respectively, and the bacteria contained in each cluster are detailed in Table A3.
In addition, the M, H, and N groups were each clustered into 2 classes by GMPR+Spectrum, and the core OTUs in each class were identified according to the score value in the algorithm. The score value represents the proportion of the variance of a given OTU in the total variance, which is in effect the proportion of one eigenvalue in the sum of all eigenvalues [23]. The larger the score value, the larger the contribution rate, indicating that the OTU carries more of the information in the original variables. Therefore, the score value is used as the measure of whether a given OTU belongs to the core flora.
Score values were calculated for the OTUs in cluster1 and cluster2 of the M, H, and N groups. Since many bacteria were unlabeled and many were duplicated, the bacteria with high score values were used as representative bacteria. Thus, we identified the special core bacteria of group H as mainly OTU280 (Herbaspirillum), OTU340 (Clostridium), and OTU373 (Ruminococcus), with scores of 0.130, 0.309, and 0.158, respectively. In addition, the important core bacteria of the M group were found to include OTU2 (Lactobacillus), OTU359 (Akkermansia), OTU280 (Herbaspirillum), and OTU428 (Acidaminococcus), with scores of 0.085, 0.438, 0.413, and 0.179, respectively. The score values for each OTU in the M, H, and N groups can be seen in Table A4, Table A5 and Table A6. The core bacteria of group N were concentrated in cluster3 of all samples. The core flora of M was concentrated in cluster7 of all samples except for Herbaspirillum, and the Akkermansia in the core flora of M was distributed only in cluster1 and cluster7. The core bacteria of H were all concentrated in cluster1 of all samples except for Herbaspirillum, and Pyramidobacter in this core group was present only in cluster1. In general, Herbaspirillum is present only in H and M but not in N, Pyramidobacter is present only in the key bacteria of H, and Akkermansia is present only in the core bacteria of M. Furthermore, the bacteria in groups M, H, and N were all found in all samples, and the signature bacteria of each group were identified. Clustering all samples and the M, H, and N groups with GMPR+Spectrum thus distinguished the similarity between the various populations in different OTUs and identified the differences in flora between healthy individuals, patients with minimal hepatic encephalopathy, and patients with hepatic cirrhosis.

3.5. Network Analysis Core Flora (Genus Level)

To compare against the core flora identified by GMPR+Spectrum, we also used the network analysis method to construct the gut flora interaction network among OTUs and then applied the MCODE method to identify and visualize the core gut microbiome contained in the interaction network of each group. MCODE calculates the neighboring subgraph and the graph density around each node in the network, so the score of a node reflects the density of the node and its surrounding nodes. The algorithm then expands outward from the node with the maximum score, adding qualifying nodes to the module, and thereby generates a module of nodes with similar clustering coefficients.
The intra-module connectivity (Zi) measures the role of a node within its module: the larger the Zi value, the greater the role the node plays in that module. The top 10 OTUs in each module, ranked by Zi value, were taken as the core nodes of that key module. We thus obtained the core gut microbiome networks for modules 1, 2, and 3 of the all-samples group (the network contains many core modules, but only 3 are shown here), as shown in Figure 5. MCODE1 had a score of 5 and contained 5 nodes and 10 edges, each node having a Zi value of 0.935. MCODE2 had a score of 3 and contained 3 nodes and 3 edges, and the Zi values of nodes OTU1430, OTU1111, and OTU535 were 1.402, 1.351, and 1.351, respectively. MCODE3 had a score of 5 and contained 5 nodes and 10 edges, and the Zi values of nodes OTU101, OTU183, OTU907, OTU1153, and OTU582 were 2.293, 2.156, 2.020, 1.951, and 1.951, respectively. The details of the core flora contained in each module can be found in Table A7.
The core flora and corresponding Zi values for the N, H, and M groups in modules 1, 2, and 3 can be found in Table A8, Table A9 and Table A10. The Zi values of nodes OTU6, OTU683, OTU658, OTU944, OTU406, OTU378, and OTU440 in H were 0.933, 0.906, 0.906, 0.701, 0.574, 0.574, and 1.232, representing Coprococcus, Prevotella, Lachnospira, Parabacteroides, Streptococcus, and Clostridium, respectively. The Zi values of nodes OTU201, OTU1063, OTU861, OTU1425, OTU1237, OTU225, OTU202, OTU238, OTU1440, OTU1250, and OTU1383 in M were 1.145, 1.115, 1.115, 1.054, 1.054, 1.029, 1.029, 0.984, 0.843, 0.843, and 0.843, representing Ruminococcus, Bacteroides, Clostridium, Lachnospira, Faecalibacterium, Actinomyces, Coprococcus, Faecalibacterium, Veillonella, Sutterella, and Oscillospira, respectively.
In the network analysis, the mean scores of the core gut microbiota of the normal, minimal hepatic encephalopathy, and cirrhotic groups were 8.33, 9.33, and 10, respectively, with higher scores indicating more complex networks. Across the three networks, many bacteria were shared among the groups, but the intestinal flora of the M group was more complex, and Prevotella, Lachnospira, and Veillonella were key bacteria in the intestinal flora of the M group that were not included in the core flora of normal subjects. Actinomyces, Sutterella, and Oscillospira were key bacteria in the intestinal flora of patients with minimal hepatic encephalopathy that were not included in the core flora of the N and H groups. Streptococcus, a critical bacterium in the intestinal flora of patients with cirrhosis, was likewise absent from the other two groups.

3.6. GMPR+Spectrum and Network Analysis Flora Comparison (Genus)

When the core bacteria identified by GMPR+Spectrum and by network analysis were compared, many core bacteria were found by both methods. However, since GMPR+Spectrum and network analysis are two different methods, each cluster of GMPR+Spectrum is not guaranteed to match exactly one module of the network analysis. For example, the core bacteria in cluster4 of GMPR+Spectrum also appeared in module 8 of the network analysis, and the core bacteria in cluster1, cluster2, cluster3, cluster4, and cluster5 of GMPR+Spectrum could all be found in module 6 of the network analysis. The specific relationships are shown in Figure 6.
A comparative analysis of the core bacteria included in GMPR+Spectrum and network analysis revealed that some core bacteria could be found in both methods, while some differences existed between the bacteria identified by the two methods. The common bacteria were Coprococcus, Clostridium of H, Faecalibacterium, Bacteroides, Prevotella in M, and Clostridium, Faecalibacterium, Fusobacterium, and Bacteroides in N. The difference was that Lactobacillus, Akkermansia, Herbaspirillum in M and Oscillospira, Dialister in H were found only in GMPR+Spectrum, etc.

4. Discussion

In this paper, GMPR+Spectrum was used to cluster the all-samples dataset and analyze the structure of the intestinal flora. The sequencing data contain many zeros because some intestinal flora are absent or under-sampled during sequencing. Therefore, the GMPR method, which effectively mitigates the zero-inflation problem in intestinal flora data, was first used to normalize the data. Spectrum was then used to analyze the structure of the intestinal flora. The results showed that GMPR+Spectrum ran faster than M3C and iClusterPlus on the different groups and performed well. Moreover, most of the core bacteria found by network analysis were included in different clusters of GMPR+Spectrum.
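As a rough illustration of the normalization step, the sketch below computes GMPR-style size factors in plain Python, following the published description of GMPR [16]: for each pair of samples, take the median count ratio over the OTUs that are non-zero in both, then take the geometric mean of those medians for each sample. The function names and the toy count matrix are hypothetical, and options of the reference implementation (such as a minimum number of shared OTUs) are not reproduced here.

```python
import math
from statistics import median


def gmpr_size_factors(counts):
    """Return one GMPR-style size factor per sample.

    counts: list of samples, each a list of OTU counts in the same OTU order.
    """
    factors = []
    for sample_i in counts:
        log_medians = []
        for sample_j in counts:
            # Ratios are taken only over OTUs observed (non-zero) in both
            # samples, so zeros never enter a ratio. The self-pair gives 1.
            ratios = [ci / cj for ci, cj in zip(sample_i, sample_j)
                      if ci > 0 and cj > 0]
            if ratios:
                log_medians.append(math.log(median(ratios)))
        # Geometric mean of the pairwise median ratios.
        # Assumption: pairs with no shared OTU are skipped; NaN if none remain.
        factors.append(math.exp(sum(log_medians) / len(log_medians))
                       if log_medians else float("nan"))
    return factors


def gmpr_normalize(counts):
    """Divide each sample's counts by its size factor."""
    factors = gmpr_size_factors(counts)
    return [[c / f for c in sample] for sample, f in zip(counts, factors)]


# Hypothetical toy data: 3 samples x 4 OTUs, with zeros.
toy_counts = [
    [10, 20, 0, 40],
    [5, 10, 3, 0],
    [20, 40, 6, 80],
]
toy_factors = gmpr_size_factors(toy_counts)
```

On this toy matrix the three samples differ only by sequencing depth on their shared OTUs, so the size factors come out close to 1, 0.5, and 2, and the normalized counts of the shared OTUs coincide even though each sample has zeros in different places.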
Spectrum applies graph theory: it treats the data analysis problem as one of optimally partitioning a graph. Network analysis is based on the mathematical concept of a network graph (networks are also called “graphs”) and treats the problem as one of dividing a large network into smaller networks [51,52,53]. The two methods are therefore similar in that both transform the data analysis problem into a graph and, in essence, partition that graph so that the correlation between different subgraphs/subnetworks is low and the correlation within each subgraph/subnetwork is high.
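The graph-partitioning idea can be sketched with a small spectral clustering example in the style of Ng et al. This is a minimal sketch under stated assumptions, not the Spectrum implementation: a fixed toy similarity matrix stands in for Spectrum's adaptive density-aware kernel, the number of clusters is picked at the largest gap between consecutive eigenvalues (the usual eigengap heuristic), and a tiny k-means replaces the GMM step. All function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def spectral_embedding(similarity, k):
    """Ng-style spectral embedding of a symmetric similarity matrix.

    Assumes every node has positive total similarity (no isolated nodes).
    Returns the eigenvalues in descending order and the row-normalised
    matrix of the top-k eigenvectors.
    """
    A = np.asarray(similarity, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    # Normalised affinity D^{-1/2} A D^{-1/2}
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    X = eigvecs[:, order[:k]]
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return eigvals[order], X


def eigengap_k(eigvals, max_k):
    """Pick k at the largest gap between consecutive sorted eigenvalues."""
    max_k = min(max_k, len(eigvals) - 1)
    gaps = eigvals[:max_k] - eigvals[1:max_k + 1]
    return int(np.argmax(gaps)) + 1


def kmeans(X, k, n_iter=50):
    """Tiny k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq.argmin(axis=1)
        updated = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                            else centers[c] for c in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


# Hypothetical toy graph: two groups of three nodes, strongly similar
# within a group (0.9) and weakly similar between groups (0.01).
toy = np.full((6, 6), 0.01)
toy[:3, :3] = 0.9
toy[3:, 3:] = 0.9
np.fill_diagonal(toy, 1.0)

eigvals, _ = spectral_embedding(toy, 1)
k = eigengap_k(eigvals, max_k=5)
_, X = spectral_embedding(toy, k)
labels = kmeans(X, k)
```

In this toy case the eigengap points to two clusters, and the partition separates the two blocks, so similarity is high within each subgraph and low between them.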
The differences are as follows. (1) The similarity matrix is calculated differently: network analysis uses the Pearson correlation coefficient, whereas Spectrum uses an adaptive density-aware kernel. (2) The graph partitioning method is different: Spectrum mainly uses the Laplacian matrix to divide the complete undirected graph into subgraphs, whereas network analysis partitions subnetworks with MCODE, which calculates a score for each node that reflects the density of the node and its surrounding nodes. (3) In network analysis, the network is constructed from an optimal threshold value, but that threshold is chosen manually. Spectrum instead uses the Ng spectral clustering method, applies the eigenvalue heuristic to estimate the number of clusters, and finally applies GMM clustering to the final eigenvector matrix. (4) In Spectrum, the bacteria with the highest score values are taken as the key bacteria. The score represents the proportion of the total variance attributable to a given OTU, which is in fact the ratio of the corresponding eigenvalue to the sum of all eigenvalues. Therefore, the larger the score of an OTU, the greater its contribution to the whole. In network analysis, the OTUs with the highest Zi values are taken as the core bacteria of each module, where Zi measures the role of a node within the module where it is located. The experimental results show that GMPR+Spectrum and network analysis find the same bacteria in all types of populations, but because the two methods differ, each also finds bacteria that the other does not. We therefore carried out a further analysis of these bacteria in light of existing studies.
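The network-analysis side of this comparison can be sketched as follows. The example builds edges from Pearson correlations above a user-chosen cutoff and then scores each node within its module. It assumes that Zi is the within-module degree z-score commonly used in microbial network analysis, computed with the population standard deviation; the function names, the cutoff, and the toy data are hypothetical, and the module assignment is taken as given rather than computed with MCODE.

```python
import math
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_edges(profiles, cutoff):
    """Link two OTUs when |Pearson r| >= cutoff (cutoff chosen by the user)."""
    names = sorted(profiles)
    return [(u, v)
            for i, u in enumerate(names)
            for v in names[i + 1:]
            if abs(pearson(profiles[u], profiles[v])) >= cutoff]


def within_module_z(edges, module_of):
    """Within-module degree z-score for every node.

    edges: iterable of (u, v) pairs; module_of: dict node -> module label.
    Assumption: Zi = (k_i - mean_k) / sd_k over the nodes of the same module,
    where k_i counts only links to nodes inside that module.
    """
    k_in = {node: 0 for node in module_of}
    for u, v in edges:
        if module_of[u] == module_of[v]:
            k_in[u] += 1
            k_in[v] += 1
    z = {}
    for module in set(module_of.values()):
        members = [n for n in module_of if module_of[n] == module]
        degrees = [k_in[n] for n in members]
        mu, sd = mean(degrees), pstdev(degrees)
        for n in members:
            # A module whose nodes all have the same degree gets Zi = 0.
            z[n] = (k_in[n] - mu) / sd if sd > 0 else 0.0
    return z


# Hypothetical abundance profiles and a hypothetical module assignment.
toy_edges = correlation_edges(
    {"x": [1, 2, 3], "y": [2, 4, 6], "z": [3, 1, 2]}, cutoff=0.9)
toy_z = within_module_z(
    [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f")],
    {"a": "A", "b": "A", "c": "A", "d": "A", "e": "B", "f": "B"})
```

In the toy module A, node a is linked to every other member and receives the highest Zi, which mirrors how the OTUs with the top Zi values are read as the core bacteria of a module.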
GMPR+Spectrum was applied to all samples as well as to the flora of H, N, and M, and Herbaspirillum was found in the core flora of H and M but not of N. Herbaspirillum is a Gram-negative bacillus that can cause a decrease in the number of Bifidobacterium, further promoting chronic inflammation in the liver [54]. Previous studies had mostly found Herbaspirillum in plants, and only in recent years has it been isolated from clinical patients [55,56,57,58,59,60,61]. In particular, a study by Jia et al. found Herbaspirillum to be a potential opportunistic pathogen in cirrhotic patients and some immunocompromised elderly patients [62]. Although few studies have examined Herbaspirillum in humans, these findings suggest that Herbaspirillum may be a crucial bacterium for disease screening and diagnosis in clinical patients with cirrhosis and minimal hepatic encephalopathy.
In addition, Akkermansia was present only in the core bacteria of M in the GMPR+Spectrum method. Akkermansia are oval-shaped Gram-negative bacteria that act as “probiotics” in many diseases [63], and researchers have seen their potential as next-generation probiotic drugs that could be targets for improving metabolic diseases such as liver diseases. At the same time, some studies have shown that Akkermansia may have some negative effects [2]. For example, when the liver degenerates, metabolism is disrupted, resulting in changes in the abundance of Akkermansia [64]. A recent study by Bajaj et al. [65] found that Akkermansia abundance differs between healthy individuals and MHE patients; specifically, Akkermansia is higher in the absence of MHE. In contrast, we found only that Akkermansia may serve as a critical bacterium for distinguishing patients with minimal hepatic encephalopathy from normal individuals, and the specific immunomodulatory mechanism still needs further investigation.
Comparing the core flora reveals both similarities and differences between the core bacteria identified by GMPR+Spectrum and by network analysis. One similarity is that normal healthy controls (N) have a more abundant flora than patients with minimal hepatic encephalopathy and hepatic encephalopathy. Another is that, at the genus level, both methods identify some common core flora, such as Clostridium and Ruminococcus, as critical bacteria in cirrhotic patients. The effect of changes in Clostridium and Ruminococcus on the fecal microbiota of HE patients was confirmed in a study by Bajaj et al. [65], which showed that fecal microbial composition differs between healthy individuals and HE patients, especially in Clostridium and Ruminococcus, and that changes in these bacteria are associated with the severity of cirrhosis and the worsening of its complications. However, the specific mechanism of action of Clostridium and Ruminococcus is not clear from the present study. The difference between the two methods is that Herbaspirillum, Pyramidobacter, Faecalibacterium, Fusobacterium, Dialister, and Bacteroides were identified as core bacteria in the hepatic encephalopathy patients only by GMPR+Spectrum.
At the same time, the critical flora of the M group at the genus level included Lactobacillus, Akkermansia, Herbaspirillum, and Acidaminococcus, which GMPR+Spectrum identified as the specific key bacteria. GMPR+Spectrum found Lactobacillus to be a critical bacterium in the intestinal flora of patients with minimal hepatic encephalopathy but not among the key bacteria of cirrhosis. A study by Bajaj [66] showed that Lactobacillus in the stool of MHE patients had unique characteristics and that these bacteria could be used for the diagnosis of MHE. It has also been demonstrated [67] that microecological inhibitors containing Bifidobacteria and Lactobacillus can regulate the structure of the intestinal flora, inhibit the growth of ammonia-producing, urease-producing bacteria, and play a role in limiting the increase in ammonia. Minimal hepatic encephalopathy is the beginning of the pathogenesis of hepatic encephalopathy [68], so the difference between the two in Lactobacillus may serve as an effective way to differentiate between them.
In addition, GMPR+Spectrum identified Coprococcus and Dialister in the core flora of cirrhotic patients and Prevotella and Acidaminococcus in the core flora of MHE patients, all of which are absent from the core flora of normal healthy controls. The mechanism of action of these bacteria in patients with minimal hepatic encephalopathy and hepatic encephalopathy was not identified in the current study and needs to be followed up with more in-depth studies. One study [69] shows a close relationship between the occurrence of MHE and bacteria that can affect ammonia. Some urease-containing bacteria are associated with increased ammonia in MHE. Other bacteria sometimes have less obvious effects (such as causing inflammation) in patients with minimal hepatic encephalopathy and hepatic encephalopathy. These bacteria can also promote the accumulation of ammonia, and the differential bacteria found in this study may be among them.
However, this study also has some limitations. First, the dataset is small, and the intestinal flora is susceptible to external environmental, genetic, and individual behavioral differences. Further multi-ethnic, long-term, large-scale studies with more controlled experiments are needed to study the association between intestinal flora and diseases and to further validate the performance of the Spectrum algorithm. Second, the next step will be to continue using the model from this paper to uncover structural differences in the gut microflora of different populations and to provide a reliable reference for different types of diseases based on gut flora research.

5. Conclusions

In this study, we present a new method, GMPR+Spectrum, for analyzing the gut microbiome of patients with MHE/HE. The results show that GMPR+Spectrum can more effectively identify structural differences in the gut microbiota of different patients, extract critical bacteria, and provide a reference for clinical screening and diagnosis of MHE/HE.

Author Contributions

Methodology, X.X.; formal analysis, X.X. and Y.R.; writing—review and editing, X.X. and Y.R.; data curation, Y.R.; visualization, Y.R. and J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 82060329, and Yunnan Province Science and Technology Department Projects, grant number 202101AT070310.

Institutional Review Board Statement

The study was approved by the Ethics Committee of the First People’s Hospital of Yunnan Province.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data is available from the Department of Gastroenterology, First People’s Hospital of Yunnan Province.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. OTU count table after sequencing.
H1H2H3H4H5H7H8H9H10
OTU_0000010023
OTU_1000001000
OTU_2000000000
OTU_3000000010
OTU_40013000000
OTU_50180006000
OTU_600100201400
OTU_7000000000
OTU_8030173004200
OTU_922426312017401085
OTU_10100000012
Table A2. Clustering evaluation indicators in N, H, M.
Group | Index | Spectrum | GMPR+Spectrum | M3C | iClusterPlus
N | NMI | 0.1524 | 0.1521 | 0.00046 | 0.2678
N | DBI | 2.7807 | 3.0169 | 2.8357 | 5.7933
N | CH | 2.5914 | 1.6433 | 0.0103 | 1.1882
N | Runtime (seconds) | 23.18 | 20.39 | 157.36 | 38.69
N | Cluster number | 2 | 2 | 4 | 3
H | NMI | 0.1555 | 0.1558 | 0.00018 | 0.27001
H | DBI | 3.2233 | 3.0738 | 1.8580 | 6.1212
H | CH | 1.6681 | 2.3817 | 20.451 | 1.2989
H | Runtime (seconds) | 17.14 | 15.86 | 151.75 | 28.83
H | Cluster number | 2 | 2 | 2 | 3
M | NMI | 0.1623 | 0.1647 | 0.00059 | 0.2705
M | DBI | 2.7388 | 2.8420 | 2.9583 | 5.9948
M | CH | 2.7118 | 2.6339 | 0.0102 | 1.4089
M | Runtime (seconds) | 18.00 | 15.29 | 154.06 | 33.76
M | Cluster number | 2 | 2 | 3 | 3
Table A3. OTUs contained in each Cluster after GMPR+Spectrum.
Cluster | OTU ID
Cluster1 | OTU8(Clostridium), OTU11(Dialister), OTU12, OTU13(Megasphaera), OTU16(Blautia), OTU20(Coprococcus), OTU86(Ruminococcus), OTU106(Lachnobacterium), OTU112(Oscillospira), OTU160(Bacteroides), OTU163(Prevotella), OTU203(Streptococcus), OTU281(Eubacterium), OTU306(Alistipes), OTU313(Sutterella), OTU325(Epulopiscium), OTU363(Phascolarctobacterium), OTU366(Pyramidobacter), OTU409(Fusobacterium), OTU509(Megamonas), OTU647(Roseburia), OTU701(Lachnospira), OTU1821(Faecalibacterium), OTU2170(Enterobacter), OTU3123(Akkermansia)
Cluster2 | OTU15(Lactobacillus), OTU60, OTU170(Ruminococcus), OTU252(Faecalibacterium), OTU299(Propionibacterium), OTU309(Oscillospira), OTU324(Brachybacterium), OTU373(Parabacteroides), OTU374(Thermus), OTU426(Lachnospira), OTU434(Dialister), OTU448(Deinococcus), OTU495(Coprococcus), OTU515(Clostridium), OTU530(Megamonas), OTU534(Streptococcus), OTU696(Veillonella), OTU731(Roseburia), OTU760(Bacteroides), OTU884(Lachnobacterium), OTU1041(Blautia), OTU1237(Phascolarctobacterium), OTU1244(Desulfovibrio), OTU1279(Fusobacterium), OTU1548(Eubacterium), OTU2149(Alistipes), OTU2563(Bulleidia), OTU2796(Campylobacter), OTU2898(Brevundimonas), OTU2936(Leptotrichia), OTU3265(Methylobacterium), OTU3379(Prevotella)
Cluster3 | OTU79(Clostridium), OTU81(Oscillospira), OTU88(Ruminococcus), OTU142(Anoxybacillus), OTU153(Staphylococcus), OTU279(Paludibacter), OTU344(Herbaspirillum), OTU367(Comamonas), OTU379(Acinetobacter), OTU388(Lactococcus), OTU490(Coprococcus), OTU499(Dietzia), OTU514(Phascolarctobacterium), OTU531(Lactobacillus), OTU543(Eubacterium), OTU546(Micrococcus), OTU586(Dialister), OTU605(Roseburia), OTU674(Veillonella), OTU691(Faecalibacterium), OTU756(Blautia), OTU763(Selenomonas), OTU811(Lachnospira), OTU873(Brevundimonas), OTU1104(Streptococcus), OTU1304(Prevotella), OTU1307(Megamonas), OTU1331(Moryella), OTU1602(Odoribacter), OTU1612(Corynebacterium), OTU1637(Fusobacterium), OTU1683(Acidaminococcus), OTU1695(Bacteroides), OTU1725(Parabacteroides), OTU1794(Gemella), OTU2000(Alistipes), OTU2197(Porphyromonas), OTU2419(Escherichia), OTU2426(Sutterella), OTU2520(Brevibacterium), OTU2570(Morganella), OTU2704(Epulopiscium), OTU2727(Enterobacter), OTU2731(Variovorax), OTU2797(Klebsiella), OTU2807(Adlercreutzia), OTU2843(Atopobium), OTU2948(Chryseobacterium), OTU2963(Haloanella), OTU3000, OTU3075(Coprobacillus), OTU3294(Methylobacterium), OTU3301(Sphingomonas), OTU3609(Haemophilus)
Cluster4 | OTU0(Lactobacillus), OTU14(Enterococcus), OTU48, OTU49(Clostridium), OTU110(Megamonas), OTU118(Streptococcus), OTU130(Gemella), OTU138(Parabacteroides), OTU143(Bacteroides), OTU145(Prevotella), OTU164(Enterobacter), OTU166(Abiotrophia), OTU195(Lactococcus), OTU263(Actinomyces), OTU305(Neisseria), OTU317(Microbacterium), OTU352(Rothia), OTU376(Thermus), OTU386(Cetobacterium), OTU467(Escherichia), OTU562(Granulicatella), OTU708(Lachnospira), OTU744(Eubacterium), OTU847(Methylobacterium), OTU1056(Lautropia), OTU1060(Blautia), OTU1413(Oribacterium), OTU1666(Leuconostoc), OTU2454(Eikenella), OTU2601(Coprococcus), OTU2632(Aggregatibacter), OTU2679(Haemophilus), OTU2862(Adlercreutzia), OTU2909(Eggerthella), OTU2911(Campylobacter), OTU2953(Microvirgula), OTU2980(Collinsella), OTU3039(Collinsella), OTU3591(Veillonella)
Cluster5 | OTU4(Weissella), OTU42(Clostridium), OTU58, OTU64(Coprococcus), OTU127(Oscillospira), OTU190(Veillonella), OTU200(Ruminococcus), OTU218(Prevotella), OTU272(Odoribacter), OTU283(Faecalibacterium), OTU286(Parabacteroides), OTU272(Odoribacter), OTU334(Slackia), OTU339(Selenomonas), OTU361(Bacteroides), OTU403(Eubacterium), OTU428(Dialister), OTU624(Lachnospira), OTU635(Anaeroglobus), OTU664(Roseburia), OTU827(Phascolarctobacterium), OTU918(Blautia), OTU1412(Megamonas), OTU2119(Alistipes), OTU2500(Leptotrichia), OTU3057, OTU3097(Fusobacterium)
Cluster6 | OTU2(Lactobacillus), OTU6(Dialister), OTU9(Veillonella), OTU22(Lachnospira), OTU24(Roseburia), OTU28(Megasphaera), OTU39(Coprococcus), OTU44, OTU47(Ruminococcus), OTU50(Clostridium), OTU76(Phascolarctobacterium), OTU146(Bacteroides), OTU150(Prevotella), OTU219(Faecalibacterium), OTU238(Alistipes), OTU248(Parabacteroides), OTU265(Enterobacter), OTU301(Odoribacter), OTU332(Sutterella), OTU362(Asteroleplasma), OTU419(Fusobacterium), OTU2047(Leclercia)
Cluster7 | OTU5(Coprococcus), OTU17(Veillonella), OTU23, OTU45(Clostridium), OTU70(Eubacterium), OTU105(Oscillospira), OTU119(Ruminococcus), OTU230(Prevotella), OTU253(Faecalibacterium), OTU259(Parabacteroides), OTU297(Haemophilus), OTU461(Akkermansia), OTU488(Megamonas), OTU553(Lactobacillus), OTU555(Streptococcus), OTU565(Acidaminococcus), OTU2712(Fusobacterium), OTU3479(Escherichia), OTU167(Bacteroides)
Cluster8 | OTU82(Roseburia), OTU83(Lachnospira), OTU114(Clostridium), OTU122, OTU229(Holdemania), OTU256(Parabacteroides), OTU481(Megamonas), OTU491(Peptostreptococcus), OTU528(Faecalibacterium), OTU557(Coprococcus), OTU561(Blautia), OTU567(Veillonella), OTU580(Dialister), OTU644(Ruminococcus), OTU693(Oscillospira), OTU1030(Prevotella), OTU1174(Desulfovibrio), OTU1429(Actinomyces), OTU1627(Bacteroides), OTU1793(Streptococcus), OTU1810(Bilophila), OTU1996(Oxalobacter), OTU2053(Alistipes), OTU2130(Odoribacter), OTU2597(Raoultella), OTU2599(Epulopiscium), OTU2714(Fusobacterium), OTU2719(Sutterella), OTU2894(Sarcina), OTU3056, OTU3062(Coprobacillus)
Table A4. The score value and core flora of GMPR+Spectrum in N.
N | OTU ID | Score | Family | Genus
Cluster1 | OTU45 | 0.528 | Lachnospiraceae | Clostridium
Cluster1 | OTU104 | 0.478 | Streptococcaceae | Streptococcus
Cluster1 | OTU230 | 0.475 | Ruminococcaceae | Oscillospira
Cluster1 | OTU125 | 0.330 | Bacillaceae | Anoxybacillus
Cluster1 | OTU2 | 0.068 | Lactobacillaceae | Lactobacillus
Cluster1 | OTU136 | 0.056 | Staphylococcaceae | Staphylococcus
Cluster1 | OTU274 | 0.018 | Alcaligenaceae | Sutterella
Cluster2 | OTU313 | 0.308 | Ruminococcaceae | Faecalibacterium
Cluster2 | OTU354 | 0.265 | Ruminococcaceae | Oscillospira
Cluster2 | OTU326 | 0.235 | Clostridiaceae | Clostridium
Cluster2 | OTU320 | 0.225 | Ruminococcaceae | Eubacterium
Cluster2 | OTU314 | 0.195 | Erysipelotrichaceae | Clostridium
Cluster2 | OTU329 | 0.187 | Fusobacteriaceae | Fusobacterium
Cluster2 | OTU367 | 0.185 | Bacteroidaceae | Bacteroides
Cluster2 | OTU473 | 0.134 | Lachnospiraceae | Clostridium
Table A5. The score value and core flora of GMPR+Spectrum in H.
H | OTU ID | Score | Family | Genus
Cluster1 | OTU141 | 0.530 | Erysipelotrichaceae |
Cluster1 | OTU51 | 0.323 | Lachnospiraceae |
Cluster1 | OTU111 | 0.272 | Ruminococcaceae | Oscillospira
Cluster1 | OTU280 | 0.130 | Oxalobacteraceae | Herbaspirillum
Cluster1 | OTU296 | 0.116 | Dethiosulfovibrionaceae | Pyramidobacter
Cluster1 | OTU204 | 0.109 | Veillonellaceae | Dialister
Cluster2 | OTU313 | 0.593 | Ruminococcaceae | Faecalibacterium
Cluster2 | OTU407 | 0.348 | Ruminococcaceae | Oscillospira
Cluster2 | OTU323 | 0.320 | Lachnospiraceae | Coprococcus
Cluster2 | OTU340 | 0.309 | Lachnospiraceae | Clostridium
Cluster2 | OTU329 | 0.284 | Fusobacteriaceae | Fusobacterium
Cluster2 | OTU339 | 0.200 | Veillonellaceae | Dialister
Cluster2 | OTU458 | 0.173 | Bacteroidaceae | Bacteroides
Cluster2 | OTU373 | 0.158 | Lachnospiraceae | Ruminococcus
Table A6. The score value and core flora of GMPR+Spectrum in M.
M | OTU ID | Score | Family | Genus
Cluster1 | OTU2 | 0.085 | Lactobacillaceae | Lactobacillus
Cluster1 | OTU12 | 0.000 | Veillonellaceae |
Cluster2 | OTU168 | 0.754 | Veillonellaceae |
Cluster2 | OTU359 | 0.438 | Verrucomicrobiaceae | Akkermansia
Cluster2 | OTU280 | 0.413 | Oxalobacteraceae | Herbaspirillum
Cluster2 | OTU364 | 0.333 | Ruminococcaceae |
Cluster2 | OTU443 | 0.330 | Ruminococcaceae | Faecalibacterium
Cluster2 | OTU404 | 0.319 | Bacteroidaceae | Bacteroides
Cluster2 | OTU155 | 0.255 | Prevotellaceae | Prevotella
Cluster2 | OTU428 | 0.179 | Veillonellaceae | Acidaminococcus
Table A7. Core flora and corresponding Zi values in each module of all samples group in network analysis.
Module | Zi | OTU ID | Family | Genus
module1 | 0.934501 | OTU31 | Enterococcaceae | Enterococcus
module1 | 0.934501 | OTU18 | Lactobacillaceae | Lactobacillus
module1 | 0.934501 | OTU700 | Burkholderiaceae | Lautropia
module1 | 0.934501 | OTU369 | Streptococcaceae | Streptococcus
module1 | 0.934501 | OTU285 | Micrococcaceae | Rothia
module2 | 1.402386 | OTU1430 | Ruminococcaceae | Faecalibacterium
module2 | 1.351226 | OTU1111 | Bacteroidaceae | Bacteroides
module3 | 2.292694 | OTU101 | Ruminococcaceae | Oscillospira
module3 | 1.951359 | OTU1153 | Catabacteriaceae |
module3 | 1.951359 | OTU582 | Lachnospiraceae |
module4 | 1.827142 | OTU1103 | Bacteroidaceae | Bacteroides
module4 | 1.746558 | OTU180 | Prevotellaceae | Prevotella
module4 | 1.742641 | OTU571 | Lachnospiraceae | Lachnospira
module4 | 1.742641 | OTU235 | Ruminococcaceae | Eubacterium
module4 | 1.678291 | OTU1083 | Catabacteriaceae |
module5 | 1.960498 | OTU372 | Lachnospiraceae | Ruminococcus
module5 | 1.727345 | OTU1300 | Clostridiaceae | Clostridium
module5 | 1.494192 | OTU431 | Turicibacteraceae |
module6 | 2.426804 | OTU493 | Lachnospiraceae | Lachnospira
module6 | 2.310227 | OTU1420 | Veillonellaceae | Veillonella
module6 | 2.310227 | OTU272 | Coriobacteriaceae | Slackia
module6 | 1.960498 | OTU1187 | Lachnospiraceae | Coprococcus
module6 | 1.960498 | OTU281 | Ruminococcaceae |
module6 | 1.960498 | OTU90 | Lachnospiraceae | Pseudobutyrivibrio
module7 | 1.235633 | OTU355 | Bacteroidaceae | Bacteroides
module7 | 0.813126 | OTU356 | Lachnospiraceae |
module7 | 0.644123 | OTU854 | Prevotellaceae | Prevotella
module8 | 0.668220 | OTU674 | Streptococcaceae | Streptococcus
module8 | 0.668220 | OTU113 | Gemellaceae | Gemella
module8 | 0.420731 | OTU286 | Lactobacillaceae | Lactobacillus
module8 | 0.420731 | OTU1326 | Neisseriaceae | Microvirgula
module8 | 0.173242 | OTU658 | Ruminococcaceae | Clostridium
module8 | 0.173242 | OTU593 | Methylobacteriaceae | Methylobacterium
Table A8. Core colonies and corresponding Zi values in each module of N group in network analysis.
N | OTU ID | Zi | Family | Genus
MCODE1 | OTU216 | 1.107 | Ruminococcaceae | Faecalibacterium
MCODE1 | OTU1110 | 1.052 | Rikenellaceae | Alistipes
MCODE1 | OTU222 | 1.052 | Porphyromonadaceae | Parabacteroides
MCODE1 | OTU1354 | 1.052 | Fusobacteriaceae | Fusobacterium
MCODE1 | OTU1335 | 1.052 | Comamonadaceae | Brachymonas
MCODE1 | OTU1334 | 1.052 | Ruminococcaceae | Ruminococcus
MCODE1 | OTU1293 | 1.052 | Clostridiaceae | Clostridium
MCODE1 | OTU1286 | 1.052 | Bacteroidaceae | Bacteroides
MCODE2 | OTU1046 | 1.382 | Bacteroidaceae | Bacteroides
MCODE2 | OTU268 | 1.382 | Ruminococcaceae |
MCODE2 | OTU106 | 1.074 | Lachnospiraceae | Ruminococcus
MCODE2 | OTU180 | 1.030 |  | Clostridium
MCODE2 | OTU907 | 0.986 | Ruminococcaceae |
MCODE2 | OTU101 | 0.986 | Lactobacillaceae | Lactobacillus
MCODE2 | OTU66 | 0.986 | Lachnospiraceae |
MCODE2 | OTU1058 | 0.986 | Bacteroidaceae | Bacteroides
MCODE2 | OTU682 | 0.986 | Lachnospiraceae | Roseburia
MCODE3 | OTU202 | 1.285 | Lachnospiraceae | Coprococcus
MCODE3 | OTU1380 | 1.243 | Lachnospiraceae |
MCODE3 | OTU1333 | 1.243 | Erysipelotrichaceae | Clostridium
MCODE3 | OTU1324 | 1.243 | Comamonadaceae | Variovorax
MCODE3 | OTU1310 | 1.243 | Coriobacteriaceae | Adlercreutzia
MCODE3 | OTU1233 | 1.243 | Enterobacteriaceae |
MCODE3 | OTU1184 | 1.243 | Ruminococcaceae | Faecalibacterium
MCODE3 | OTU1166 | 1.243 | Bacteroidaceae | Bacteroides
Table A9. Core colonies and corresponding Zi values in each module of H group in network analysis.
H | OTU ID | Zi | Family | Genus
MCODE1 | OTU368 | 0.960 |  |
MCODE1 | OTU6 | 0.933 | Lachnospiraceae | Coprococcus
MCODE1 | OTU1023 | 0.933 | Lachnospiraceae |
MCODE1 | OTU692 | 0.906 | Ruminococcaceae |
MCODE1 | OTU683 | 0.906 | Prevotellaceae | Prevotella
MCODE1 | OTU658 | 0.906 | Lachnospiraceae | Lachnospira
MCODE1 | OTU632 | 0.906 | Ruminococcaceae |
MCODE1 | OTU628 | 0.906 | Lachnospiraceae |
MCODE1 | OTU619 | 0.906 | Lachnospiraceae |
MCODE1 | OTU559 | 0.906 | Lachnospiraceae |
MCODE2 | OTU1380 | 0.701 | Lachnospiraceae |
MCODE2 | OTU1317 | 0.701 |  |
MCODE2 | OTU1036 | 0.701 |  |
MCODE2 | OTU944 | 0.701 | Porphyromonadaceae | Parabacteroides
MCODE2 | OTU891 | 0.701 |  |
MCODE2 | OTU665 | 0.701 | Lachnospiraceae |
MCODE2 | OTU434 | 0.574 | Lachnospiraceae |
MCODE2 | OTU431 | 0.574 | Lachnospiraceae |
MCODE2 | OTU406 | 0.574 | Streptococcaceae | Streptococcus
MCODE2 | OTU378 | 0.574 | Lachnospiraceae | Coprococcus
MCODE3 | OTU912 | 1.656 | Prevotellaceae | Prevotella
MCODE3 | OTU134 | 1.514 | Prevotellaceae | Prevotella
MCODE3 | OTU1009 | 1.408 | Prevotellaceae | Prevotella
MCODE3 | OTU935 | 1.373 | Prevotellaceae | Prevotella
MCODE3 | OTU494 | 1.302 | Lachnospiraceae | Lachnospira
MCODE3 | OTU911 | 1.302 | Prevotellaceae | Prevotella
MCODE3 | OTU888 | 1.302 | Prevotellaceae | Prevotella
MCODE3 | OTU446 | 1.267 | Veillonellaceae | Veillonella
MCODE3 | OTU440 | 1.232 | Ruminococcaceae | Clostridium
MCODE3 | OTU422 | 1.232 | Lachnospiraceae | Coprococcus
Table A10. Core colonies and corresponding Zi values in each module of M group in network analysis.
M | OTU ID | Zi | Family | Genus
MCODE1 | OTU201 | 1.145 | Ruminococcaceae | Ruminococcus
MCODE1 | OTU1063 | 1.115 | Bacteroidaceae | Bacteroides
MCODE1 | OTU861 | 1.115 | Ruminococcaceae | Clostridium
MCODE1 | OTU64 | 1.085 | Lachnospiraceae | Clostridium
MCODE1 | OTU1392 | 1.054 | Ruminococcaceae |
MCODE1 | OTU1425 | 1.054 | Lachnospiraceae | Lachnospira
MCODE1 | OTU1304 | 1.054 | Lachnospiraceae | Clostridium
MCODE1 | OTU1378 | 1.054 | Bacteroidaceae | Bacteroides
MCODE1 | OTU1237 | 1.054 | Ruminococcaceae | Faecalibacterium
MCODE1 | OTU1184 | 1.054 | Ruminococcaceae | Faecalibacterium
MCODE2 | OTU1069 | 1.075 | Ruminococcaceae |
MCODE2 | OTU1036 | 1.075 |  |
MCODE2 | OTU225 | 1.029 | Actinomycetaceae | Actinomyces
MCODE2 | OTU202 | 1.029 | Lachnospiraceae | Coprococcus
MCODE2 | OTU551 | 1.029 | Lachnospiraceae |
MCODE2 | OTU465 | 1.029 | Erysipelotrichaceae |
MCODE2 | OTU322 | 1.029 | Lachnospiraceae | Coprococcus
MCODE2 | OTU238 | 0.984 | Ruminococcaceae | Faecalibacterium
MCODE3 | OTU956 | 0.843 | Bacteroidaceae | Bacteroides
MCODE3 | OTU1440 | 0.843 | Veillonellaceae | Veillonella
MCODE3 | OTU1400 | 0.843 | Bacteroidaceae | Bacteroides
MCODE3 | OTU1409 | 0.843 | Ruminococcaceae | Ruminococcus
MCODE3 | OTU1250 | 0.843 | Alcaligenaceae | Sutterella
MCODE3 | OTU1383 | 0.843 | Ruminococcaceae | Oscillospira
MCODE3 | OTU1183 | 0.843 | Bacteroidaceae | Bacteroides
MCODE3 | OTU1166 | 0.843 | Bacteroidaceae | Bacteroides
MCODE3 | OTU1113 | 0.843 | Prevotellaceae | Prevotella
MCODE3 | OTU1021 | 0.843 | Ruminococcaceae | Faecalibacterium

References

  1. Qin, J.; Li, R.; Raes, J.; Arumugam, M.; Burgdorf, K.S.; Manichanh, C.; Nielsen, T.; Pons, N.; Levenez, F.; Yamada, T.; et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 2010, 464, 59–65.
  2. Qin, J.; Li, Y.; Cai, Z.; Li, S.; Zhu, J.; Zhang, F.; Liang, S.; Zhang, W.; Guan, Y.; Shen, D.; et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 2012, 490, 55–60.
  3. Lambeth, S.M.; Carson, T.; Lowe, J.; Ramaraj, T.; Leff, J.W.; Luo, L.; Bell, C.J.; Shah, V.O. Composition, Diversity and Abundance of Gut Microbiome in Prediabetes and Type 2 Diabetes. J. Diabetes Obes. 2015, 2, 1–7.
  4. Larsen, N.; Vogensen, F.K.; van den Berg, F.W.; Nielsen, D.S.; Andreasen, A.S.; Pedersen, B.K.; Al-Soud, W.A.; Sorensen, S.J.; Hansen, L.H.; Jakobsen, M. Gut microbiota in human adults with type 2 diabetes differs from non-diabetic adults. PLoS ONE 2010, 5, e9085.
  5. Rajpal, D.K.; Klein, J.L.; Mayhew, D.; Boucheron, J.; Spivak, A.T.; Kumar, V.; Ingraham, K.; Paulik, M.; Chen, L.; Van Horn, S.; et al. Selective Spectrum Antibiotic Modulation of the Gut Microbiome in Obesity and Diabetes Rodent Models. PLoS ONE 2015, 10, e0145499.
  6. Stewart, C.J.; Ajami, N.J.; O’Brien, J.L.; Hutchinson, D.S.; Smith, D.P.; Wong, M.C.; Ross, M.C.; Lloyd, R.E.; Doddapaneni, H.; Metcalf, G.A.; et al. Temporal development of the gut microbiome in early childhood from the TEDDY study. Nature 2018, 562, 583–588.
  7. Vatanen, T.; Franzosa, E.A.; Schwager, R.; Tripathi, S.; Arthur, T.D.; Vehik, K.; Lernmark, A.; Hagopian, W.A.; Rewers, M.J.; She, J.X.; et al. The human gut microbiome in early-onset type 1 diabetes from the TEDDY study. Nature 2018, 562, 589–594.
  8. Arumugam, M.; Raes, J.; Pelletier, E.; Le Paslier, D.; Yamada, T.; Mende, D.R.; Fernandes, G.R.; Tap, J.; Bruls, T.; Batto, J.M.; et al. Enterotypes of the human gut microbiome. Nature 2011, 473, 174–180.
  9. Wu, G.D.; Chen, J.; Hoffmann, C.; Bittinger, K.; Chen, Y.Y.; Keilbaugh, S.A.; Bewtra, M.; Knights, D.; Walters, W.A.; Knight, R.; et al. Linking long-term dietary patterns with gut microbial enterotypes. Science 2011, 334, 105–108.
  10. Holmes, I.; Harris, K.; Quince, C. Dirichlet multinomial mixtures: Generative models for microbial metagenomics. PLoS ONE 2012, 7, e30126.
  11. Davey, J.W.; Hohenlohe, P.A.; Etter, P.D.; Boone, J.Q.; Catchen, J.M.; Blaxter, M.L. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat. Rev. Genet. 2011, 12, 499–510.
  12. Dillies, M.A.; Rau, A.; Aubert, J.; Hennequet-Antier, C.; Jeanmougin, M.; Servant, N.; Keime, C.; Marot, G.; Castel, D.; Estelle, J.; et al. A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief. Bioinform. 2013, 14, 671–683.
  13. Li, P.; Piao, Y.; Shon, H.S.; Ryu, K.H. Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinform. 2015, 16, 347.
  14. McMurdie, P.J.; Holmes, S. Waste not, want not: Why rarefying microbiome data is inadmissible. PLoS Comput. Biol. 2014, 10, e1003531.
  15. Chen, J.; King, E.; Deek, R.; Wei, Z.; Yu, Y.; Grill, D.; Ballman, K.; Stegle, O. An omnibus test for differential distribution analysis of microbiome sequencing data. Bioinformatics 2018, 34, 643–651.
  16. Chen, L.; Reeve, J.; Zhang, L.; Huang, S.; Wang, X.; Chen, J. GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data. PeerJ 2018, 6, e4600.
  17. Caporaso, J.G.; Kuczynski, J.; Stombaugh, J.; Bittinger, K.; Bushman, F.D.; Costello, E.K.; Fierer, N.; Pena, A.G.; Goodrich, J.K.; Gordon, J.I.; et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 2010, 7, 335–336.
  18. Chen, J.; Bittinger, K.; Charlson, E.S.; Hoffmann, C.; Lewis, J.; Wu, G.D.; Collman, R.G.; Bushman, F.D.; Li, H. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics 2012, 28, 2106–2113.
  19. Jain, A.K.; Law, M. Data Clustering: A User’s Dilemma. In Proceedings of the International Conference on Pattern Recognition & Machine Intelligence, Kolkata, India, 20–22 December 2005.
  20. Larose, D.T.; Larose, C.D. Data preprocessing. In Discovering Knowledge in Data (An Introduction to Data Mining); John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2014; pp. 16–50.
  21. Wedding, D.K. Discovering knowledge in data, an introduction to data mining. Inf. Processing Manag. 2005, 41, 1307–1309.
  22. von Luxburg, U. A tutorial on spectral clustering. Stat. Comput. 2007, 17, 395–416.
  23. John, C.R.; Watson, D.; Barnes, M.R.; Pitzalis, C.; Lewis, M.J. Spectrum: Fast density-aware spectral clustering for single and multi-omic data. Bioinformatics 2020, 36, 1159–1166.
  24. Groeneweg, M.; Moerland, W.; Quero, J.C.; Hop, W.C.; Krabbe, P.F.; Schalm, S.W. Screening of subclinical hepatic encephalopathy. J. Hepatol. 2000, 32, 748–753.
  25. Saxena, N.; Bhatia, M.; Joshi, Y.K.; Garg, P.K.; Tandon, R.K. Auditory P300 event-related potentials and number connection test for evaluation of subclinical hepatic encephalopathy in patients with cirrhosis of the liver: A follow-up study. J. Gastroenterol. Hepatol 2001, 16, 322–327.
  26. Schomerus, H.; Hamster, W. Quality of life in cirrhotics with minimal hepatic encephalopathy. Metab. Brain Dis. 2001, 16, 37–41.
  27. Sharma, P.; Sharma, B.C.; Puri, V.; Sarin, S.K. Critical flicker frequency: Diagnostic tool for minimal hepatic encephalopathy. J. Hepatol. 2007, 47, 67–73.
  28. Bajaj, J.S. Management options for minimal hepatic encephalopathy. Expert Rev. Gastroenterol. Hepatol. 2008, 2, 785–790.
  29. Romero-Gomez, M.; Cordoba, J.; Jover, R.; del Olmo, J.A.; Ramirez, M.; Rey, R.; de Madaria, E.; Montoliu, C.; Nunez, D.; Flavia, M.; et al. Value of the critical flicker frequency in patients with minimal hepatic encephalopathy. Hepatology 2007, 45, 879–885.
  30. Bajaj, J.S.; Saeian, K.; Verber, M.D.; Hischke, D.; Hoffmann, R.G.; Franco, J.; Varma, R.R.; Rao, S.M. Inhibitory control test is a simple method to diagnose minimal hepatic encephalopathy and predict development of overt hepatic encephalopathy. Am. J. Gastroenterol. 2007, 102, 754–760.
  31. Ford, J.M.; Gray, M.; Whitfield, S.L.; Turken, A.U.; Glover, G.; Faustman, W.O.; Mathalon, D.H. Acquiring and inhibiting prepotent responses in schizophrenia: Event-related brain potentials and functional magnetic resonance imaging. Arch. Gen. Psychiatry 2004, 61, 119–129.
  32. Schiff, S.; Vallesi, A.; Mapelli, D.; Orsato, R.; Pellegrini, A.; Umilta, C.; Gatta, A.; Amodio, P. Impairment of response inhibition precedes motor alteration in the early stage of liver cirrhosis: A behavioral and electrophysiological study. Metab. Brain Dis. 2005, 20, 381–392.
  33. Weissenborn, K.; Ennen, J.C.; Schomerus, H.; Ruckert, N.; Hecker, H. Neuropsychological characterization of hepatic encephalopathy. J. Hepatol. 2001, 34, 768–773.
  34. Ortiz, M.; Jacas, C.; Cordoba, J. Minimal hepatic encephalopathy: Diagnosis, clinical significance and recommendations. J. Hepatol. 2005, 42 (Suppl. 1), S45–S53.
  35. Allampati, S.; Duarte-Rojo, A.; Thacker, L.R.; Patidar, K.R.; White, M.B.; Klair, J.S.; John, B.; Heuman, D.M.; Wade, J.B.; Flud, C.; et al. Diagnosis of Minimal Hepatic Encephalopathy Using Stroop EncephalApp: A Multicenter US-Based, Norm-Based Study. Am. J. Gastroenterol. 2016, 111, 78–86.
  36. Lim, M.Y.; Song, E.-J.; Kim, S.H.; Lee, J.; Nam, Y.-D. Comparison of DNA extraction methods for human gut microbial community profiling. Syst. Appl. Microbiol. 2018, 41, 151–157.
  37. Caporaso, J.G.; Lauber, C.L.; Walters, W.A.; Berg-Lyons, D.; Lozupone, C.A.; Turnbaugh, P.J.; Fierer, N.; Knight, R. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. USA 2011, 108 (Suppl. 1), 4516–4522.
  38. John, C.R.; Watson, D.; Russ, D.; Goldmann, K.; Ehrenstein, M.; Pitzalis, C.; Lewis, M.; Barnes, M. M3C: Monte Carlo reference-based consensus clustering. Sci. Rep. 2020, 10, 1816. [Google Scholar] [CrossRef] [Green Version]
  39. Shen, R.; Olshen, A.B.; Ladanyi, M. Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis. Bioinformatics 2009, 25, 2906–2912. [Google Scholar] [CrossRef]
  40. Robinson, M.D.; Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010, 11, R25. [Google Scholar] [CrossRef] [Green Version]
  41. Anders, S.; Huber, W. Differential expression analysis for sequence count data. Genome Biol. 2010, 11, R106. [Google Scholar] [CrossRef] [Green Version]
  42. Monti, S.; Tamayo, P.; Mesirov, J.P.; Golub, T.R. Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Mach. Learn. 2003, 52, 91–118. [Google Scholar] [CrossRef]
  43. Senbabaoglu, Y.; Michailidis, G.; Li, J.Z. Critical limitations of consensus clustering in class discovery. Sci. Rep. 2014, 4, 6207. [Google Scholar] [CrossRef] [Green Version]
  44. Fang, K.T.; Wang, Y. Number-theoretic Methods in Statistics. In Number-theoretic Methods in Statistics; Chapman & Hall: London, UK, 1994. [Google Scholar]
  45. Ju, F.; Zhang, T. Bacterial assembly and temporal dynamics in activated sludge of a full-scale municipal wastewater treatment plant. ISME J. 2015, 9, 683–695. [Google Scholar] [CrossRef]
  46. Sinha, R.; Chen, J.; Amir, A.; Vogtmann, E.; Shi, J.; Inman, K.S.; Flores, R.; Sampson, J.; Knight, R.; Chia, N. Collecting Fecal Samples for Microbiome Analyses in Epidemiology Studies. Cancer Epidemiol. Biomark. Prev. 2016, 25, 407–416. [Google Scholar] [CrossRef] [Green Version]
  47. Zhang, P. Evaluating accuracy of community detection using the relative normalized mutual information. J. Stat. Mech.—Theory Exp. 2015, 2015, P11006. [Google Scholar] [CrossRef] [Green Version]
  48. Theodoridis, S.; Pikrakis, A.; Koutroumbas, K.; Cavouras, D. Introduction to Pattern Recognition: A Matlab Approach; Elsevier Inc.: Amsterdam, The Netherlands, 2010. [Google Scholar]
  49. Cengizler, C.; Kerem-Un, M. Evaluation of Calinski-Harabasz Criterion as Fitness Measure for Genetic Algorithm Based Segmentation of Cervical Cell Nuclei. Br. J. Math. Comput. Sci. 2017, 22, 1–13. [Google Scholar] [CrossRef] [Green Version]
  50. Mandal, S.; Van Treuren, W.; White, R.A.; Eggesbo, M.; Knight, R.; Peddada, S.D. Analysis of composition of microbiomes: A novel method for studying microbial composition. Microb. Ecol. Health Dis. 2015, 26, 27663. [Google Scholar] [CrossRef] [Green Version]
  51. Kim, S.; Thapa, I.; Lu, G.; Zhu, L.; Ali, H.H. A systems biology approach for modeling microbiomes using split graphs. In Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, USA, 13–16 November 2017; pp. 2062–2068. [Google Scholar]
  52. Pascoe, E.L.; Hauffe, H.C.; Marchesi, J.R.; Perkins, S.E. Network analysis of gut microbiota literature: An overview of the research landscape in non-human animal studies. ISME J. 2017, 11, 2644–2651. [Google Scholar] [CrossRef]
  53. Fattorusso, A.; Di Genova, L.; Dell’Isola, G.B.; Mencaroni, E.; Esposito, S. Autism spectrum disorders and the gut microbiota. Nutrients 2019, 11, 521. [Google Scholar] [CrossRef] [Green Version]
  54. Baldani, J.I.; Baldani, V.L.D.; Seldin, L.; Dobereiner, J. Characterization of Herbaspirillum seropedicae gen. nov., sp. nov., a Root-Associated Nitrogen-Fixing Bacterium. Int. J. Syst. Bacteriol. 1986, 36, 86–93. [Google Scholar] [CrossRef] [Green Version]
  55. Ziga, E.D.; Druley, T.; Burnham, C.A. Herbaspirillum species bacteremia in a pediatric oncology patient. J. Clin. Microbiol. 2010, 48, 4320–4321. [Google Scholar] [CrossRef] [Green Version]
  56. Chen, J.; Su, Z.; Liu, Y.; Sandoghchian, S.; Zheng, D.; Wang, S.; Xu, H. Herbaspirillum species: A potential pathogenic bacteria isolated from acute lymphoblastic leukemia patient. Curr. Microbiol. 2011, 62, 331–333. [Google Scholar] [CrossRef] [PubMed]
  57. Regunath, H.; Kimball, J.; Smith, L.P.; Salzer, W. Severe Community-Acquired Pneumonia with Bacteremia Caused by Herbaspirillum aquaticum or Herbaspirillum huttiense in an Immune-Competent Adult. J Clin. Microbiol. 2015, 53, 3086–3088. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  58. Chemaly, R.F.; Dantes, R.; Shah, D.P.; Shah, P.K.; Pascoe, N.; Ariza-Heredia, E.; Perego, C.; Nguyen, D.B.; Nguyen, K.; Modarai, F.; et al. Cluster and sporadic cases of herbaspirillum species infections in patients with cancer. Clin. Infect. Dis. 2015, 60, 48–54. [Google Scholar] [CrossRef] [PubMed]
  59. Suwantarat, N.; Adams, L.L.; Romagnoli, M.; Carroll, K.C. Fatal case of Herbaspirillum seropedicae bacteremia secondary to pneumonia in an end-stage renal disease patient with multiple myeloma. Diagn. Microbiol. Infect. Dis. 2015, 82, 331–333. [Google Scholar] [CrossRef] [PubMed]
  60. Tan, M.J.; Oehler, R.L. Lower Extremity Cellulitis and Bacteremia with Herbaspirillum seropedicae Associated with Aquatic Exposure in a Patient with Cirrhosis. Infect. Dis. Clin. Pract. 2005, 13, 277–279. [Google Scholar] [CrossRef]
  61. Spilker, T.; Uluer, A.Z.; Marty, F.M.; Yeh, W.W.; Levison, J.H.; Vandamme, P.; Lipuma, J.J. Recovery of Herbaspirillum species from persons with cystic fibrosis. J. Clin. Microbiol. 2008, 46, 2774–2777. [Google Scholar] [CrossRef] [Green Version]
  62. Marques, A.C.; Paludo, K.S.; Dallagassa, C.B.; Surek, M.; Pedrosa, F.O.; Souza, E.M.; Cruz, L.M.; LiPuma, J.J.; Zanata, S.M.; Rego, F.G.; et al. Biochemical characteristics, adhesion, and cytotoxicity of environmental and clinical isolates of Herbaspirillum spp. J. Clin. Microbiol. 2015, 53, 302–308. [Google Scholar] [CrossRef] [Green Version]
  63. Routy, B.; Le Chatelier, E.; Derosa, L.; Duong, C.P.M.; Alou, M.T.; Daillere, R.; Fluckiger, A.; Messaoudene, M.; Rauber, C.; Roberti, M.P.; et al. Gut microbiome influences efficacy of PD-1-based immunotherapy against epithelial tumors. Science 2018, 359, 91–97. [Google Scholar] [CrossRef] [Green Version]
  64. Anhe, F.F.; Nachbar, R.T.; Varin, T.V.; Vilela, V.; Dudonne, S.; Pilon, G.; Fournier, M.; Lecours, M.A.; Desjardins, Y.; Roy, D.; et al. A polyphenol-rich cranberry extract reverses insulin resistance and hepatic steatosis independently of body weight loss. Mol. Metab. 2017, 6, 1563–1573. [Google Scholar] [CrossRef]
  65. Bajaj, J.S.; Heuman, D.M.; Hylemon, P.B.; Sanyal, A.J.; White, M.B.; Monteith, P.; Noble, N.A.; Unser, A.B.; Daita, K.; Fisher, A.R.; et al. Altered profile of human gut microbiome is associated with cirrhosis and its complications. J. Hepatol. 2014, 60, 940–947. [Google Scholar] [CrossRef] [Green Version]
  66. Bajaj, J.S.; Fagan, A.; White, M.B.; Wade, J.B.; Hylemon, P.B.; Heuman, D.M.; Fuchs, M.; John, B.V.; Acharya, C.; Sikaroodi, M.; et al. Specific Gut and Salivary Microbiota Patterns Are Linked with Different Cognitive Testing Strategies in Minimal Hepatic Encephalopathy. Am. J. Gastroenterol. 2019, 114, 1080–1090. [Google Scholar] [CrossRef] [PubMed]
  67. Felipo, V.; Urios, A.; Montesinos, E.; Molina, I.; Garcia-Torres, M.L.; Civera, M.; Olmo, J.A.; Ortega, J.; Martinez-Valls, J.; Serra, M.A.; et al. Contribution of hyperammonemia and inflammatory factors to cognitive impairment in minimal hepatic encephalopathy. Metab. Brain Dis. 2012, 27, 51–58. [Google Scholar] [CrossRef] [PubMed]
  68. Karanfilian, B.V.; Park, T.; Senatore, F.; Rustgi, V.K. Minimal hepatic encephalopathy. Clin. Liver Dis. 2020, 24, 209–218. [Google Scholar] [CrossRef] [PubMed]
  69. Zhang, Z.; Zhai, H.; Geng, J.; Yu, R.; Ren, H.; Fan, H.; Shi, P. Large-scale survey of gut microbiota associated with MHE Via 16S rRNA-based pyrosequencing. Am. J. Gastroenterol. 2013, 108, 1601–1611. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Schematic diagram of flora data processing in the paper.
Figure 2. ICC of four samples under different normalization methods. The horizontal axis shows the normalization methods: Trimmed Mean of M values (TMM), TMM with a pseudocount added (TMM+), Relative Log Expression (RLE), and RLE with a pseudocount added (RLE+). The vertical axis shows the ICC value.
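As background for the normalization comparison in Figure 2, the GMPR size factor can be sketched as follows. This is a minimal illustration that assumes the usual GMPR definition: for each pair of samples, take the median count ratio over taxa that are non-zero in both, then take the geometric mean of those medians across samples. The names `gmpr_size_factors` and `normalize`, the `min_shared` parameter, and the toy counts are hypothetical and are not taken from the MHE data or from the authors' code.

```python
import math
from statistics import median


def gmpr_size_factors(counts, min_shared=1):
    """Return one size factor per sample.

    counts: list of samples, each a list of taxon counts in the same taxon order.
    min_shared: assumed minimum number of shared non-zero taxa for a pair to count.
    """
    n = len(counts)
    factors = []
    for i in range(n):
        log_medians = []
        for j in range(n):
            # Ratios are taken only over taxa observed in both samples,
            # so zeros never enter the calculation.
            ratios = [a / b for a, b in zip(counts[i], counts[j]) if a > 0 and b > 0]
            if len(ratios) >= min_shared:
                log_medians.append(math.log(median(ratios)))
        if log_medians:
            # Geometric mean of the pairwise median ratios.
            factors.append(math.exp(sum(log_medians) / len(log_medians)))
        else:
            factors.append(float("nan"))
    return factors


def normalize(counts, factors):
    """Divide every count in a sample by that sample's size factor."""
    return [[c / s for c in row] for row, s in zip(counts, factors)]


# Toy example: the second sample is the first one sequenced twice as deeply.
toy_counts = [[1, 2, 0, 4], [2, 4, 0, 8]]
toy_factors = gmpr_size_factors(toy_counts)
toy_normalized = normalize(toy_counts, toy_factors)
```

In the toy example the two size factors differ by a factor of two, so the normalized profiles coincide even though one taxon is zero in both samples. No pseudocount is needed, which is the property that separates this approach from the TMM+ and RLE+ variants.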
Figure 3. The optimal number of clusters corresponding to the eigenvectors of the Laplacian operator in all samples. The horizontal axis represents the eigenvectors, and the vertical axis represents the eigenvalues.
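The way a plot such as Figure 3 yields a cluster number can be sketched with the eigengap heuristic. This is an assumption about the selection rule, not a statement of what the authors ran: the function name `eigengap_k` and the example eigenvalues are hypothetical, and the eigenvalues are assumed to be sorted in descending order, as for a normalized affinity matrix. For a Laplacian whose informative eigenvalues are the smallest ones, the sort order would be reversed.

```python
def eigengap_k(eigenvalues, max_k=None):
    """Pick the number of clusters k at the largest drop between consecutive eigenvalues.

    eigenvalues: iterable of real eigenvalues (any order; sorted descending here).
    max_k: optional upper bound on the k values considered.
    """
    vals = sorted(eigenvalues, reverse=True)
    if len(vals) < 2:
        raise ValueError("need at least two eigenvalues to compute a gap")
    limit = len(vals) - 1 if max_k is None else min(max_k, len(vals) - 1)
    if limit < 1:
        raise ValueError("max_k must be at least 1")
    # gaps[i] is the drop after the (i + 1)-th largest eigenvalue.
    gaps = [vals[i] - vals[i + 1] for i in range(limit)]
    return gaps.index(max(gaps)) + 1


# Hypothetical spectrum: three eigenvalues near 1, then a sharp drop.
example_k = eigengap_k([1.0, 0.98, 0.95, 0.40, 0.35, 0.30])
```

With the hypothetical spectrum above, the largest drop follows the third eigenvalue, so the heuristic selects three clusters.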
Figure 4. The optimal number of clusters corresponding to the eigenvectors of the Laplacian operator in hepatic encephalopathy. The horizontal axis represents the eigenvectors, and the vertical axis represents the eigenvalues.
Figure 5. Network diagram of the core gut microbiome in the all-samples group: (a) module 1; (b) module 2; (c) module 3.
Figure 6. Correspondence between GMPR+Spectrum and network analysis of the core flora in the all-samples group.
Table 1. Clustering evaluation indicators of four different algorithms in all samples.
Index | GMPR+Spectrum | Spectrum | M3C | iClusterPlus
NMI | 0.3641 | 0.1932 | 0.0047 | 0.2623
DBI | 4.2359 | 2.7343 | 3.2742 | 7.4851
CH | 24.4724 | 14.4933 | 1.0000 | 1.0157
Runtime (s) | 26.75 | 36.85 | 3096.19 | 117.31
Cluster number | 8 | 3 | 4 | 3
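To make the NMI row of Table 1 concrete, the following sketch computes normalized mutual information between a reference labeling and a clustering result. It assumes normalization by the arithmetic mean of the two label entropies, which is one common convention; the paper may use a different one, so values from this code are not expected to reproduce the table. The function name `nmi` and the toy labels are hypothetical.

```python
import math
from collections import Counter


def nmi(labels_true, labels_pred):
    """Normalized mutual information in [0, 1] between two labelings."""
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("labelings must be non-empty and of equal length")

    joint = Counter(zip(labels_true, labels_pred))
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)

    # Mutual information: sum over observed label pairs of p(a,b) * log(p(a,b) / (p(a) p(b))).
    mi = sum(
        (c / n) * math.log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in joint.items()
    )

    def entropy(counter):
        return -sum((c / n) * math.log(c / n) for c in counter.values())

    # Assumed normalization: arithmetic mean of the two entropies.
    denom = (entropy(count_true) + entropy(count_pred)) / 2
    if denom == 0:
        return 1.0  # both labelings put every sample in a single cluster
    return mi / denom


# Toy check: relabeled but identical partitions agree perfectly,
# while a partition that cuts across the reference carries no information.
same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])
crossed_partition = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

NMI rewards agreement with the reference labels, so higher is better. DBI and CH, by contrast, are internal indices computed from the data and the cluster assignments alone, with lower DBI and higher CH conventionally indicating more compact, better separated clusters.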
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Xiong, X.; Ren, Y.; He, J. Analysis of Gut Microbiome Structure Based on GMPR+Spectrum. Appl. Sci. 2022, 12, 5895. https://doi.org/10.3390/app12125895

AMA Style

Xiong X, Ren Y, He J. Analysis of Gut Microbiome Structure Based on GMPR+Spectrum. Applied Sciences. 2022; 12(12):5895. https://doi.org/10.3390/app12125895

Chicago/Turabian Style

Xiong, Xin, Yuyan Ren, and Jianfeng He. 2022. "Analysis of Gut Microbiome Structure Based on GMPR+Spectrum" Applied Sciences 12, no. 12: 5895. https://doi.org/10.3390/app12125895
