The figure shows the compound concentrations before and after generalized log (glog) transformation and Pareto scaling (see step 3.3 of the script).
Figure caption: Scaling results
The figure shows in purple the 25 compounds whose controls/patients fold change falls outside the defined thresholds, i.e. >1.3 or <0.77 (see step 4 of the script).
Figure caption: Fold-Change analysis
The 25 compounds and their fold-change values are listed below.
The figure shows all p-values obtained from the univariate t-tests (see step 5 of the script).
P-values are FDR-corrected.
No compound differed significantly between patients and controls after FDR correction.
Figure caption: t-test plot displaying p-values with FDR correction
The list of all p-values before and after FDR correction is shown below.
The figure shows the volcano plot combining the controls/patients fold-change analysis with the p-values, here WITHOUT FDR correction (see step 6 of the script).
Two compounds have a p-value before FDR correction <0.05 and a fold change (controls/patients) above 1.3: PC(36:4) and PC-O(34:4).
Figure caption: Volcano plot displaying p-values WITHOUT FDR correction
The two selected compounds, their fold-change values and raw p-values are shown below.
The two box plots below display the concentrations of the selected compounds in patients and controls (see step 9 of the script).
The PCA reveals no separation between patients and controls (see step 7 of the script).
Figure caption: PCA scores plot
Figure caption: PCA scree plot
The first heatmap displays all compounds included in the analysis, while the second displays only the 10 compounds with the lowest uncorrected p-values in the t-tests, i.e. <0.15 here (see step 8 of the script).
The heatmaps do not show a clear pattern differentiating patients and controls.
Figure caption: Heatmap displaying the selected 10 compounds
Reading the exported tables
library(tidyverse)
ft_fia_all <- read.table ("C:/ME-CFS/ME-CFS_MeTaQuaC_exported_data_FIA.csv", sep=",", header=TRUE, dec=".", skip = 1, row.names = 1)
ft_lc_all <- read.table ("C:/ME-CFS/ME-CFS_MeTaQuaC_exported_data_LC.csv", sep=",", header=TRUE, dec=".", skip = 1, row.names = 1)
Combining sample measurements for compounds acquired with LC-HRMS and FIA-HRMS
ft_fia <- ft_fia_all[-c(1,2,3,4), ]
ft_fia <- select(ft_fia, starts_with("sample"))
ft_lc <- ft_lc_all[-c(1,2,3), ]
ft_lc <- select(ft_lc, starts_with("sample"))
ft <- rbind(ft_lc, ft_fia)
Transposing the dataframe to have compounds in columns and samples in rows.
ft <- as.data.frame(t(ft))
ft$Type <- gsub("Case", "Patients", ft$Type)
ft$Type <- gsub("Control", "Controls", ft$Type)
Making sure the classes are correct: “factor” for Type and “numeric” for the compound concentrations, so that ft can be used directly in the later steps.
for(i in c(2:ncol(ft))) {
ft[,i] <- as.numeric(ft[,i])
}
ft[,1] <- as.factor(ft[,1])
sapply(ft, class)
## Type Ala Arg Asn Cit Gln Glu
## "factor" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## Gly His Ile Lys Met Orn Phe
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## Pro Ser Thr Trp Tyr Val xLeu
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## ADMA Creatinine Sarcosine SDMA Serotonin t4-OH-Pro Taurine
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## AC(0:0) AC(2:0) CE(16:1) CE(18:2) CE(18:3) CE(20:4) CE(20:5)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## CE(22:5) CE(22:6) DG(34:1) DG(36:2) DG(36:3) DG(36:4) DG(39:0)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## DG(41:1) TG(48:1) TG(48:2) TG(50:1) TG(50:2) TG(50:3) TG(50:4)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## TG(51:2) TG(51:3) TG(51:4) TG(52:2) TG(52:3) TG(52:4) TG(52:6)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## TG(53:4) TG(53:5) TG(54:3) TG(54:4) TG(54:5) TG(54:6) TG(56:6)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## TG(56:7) TG(56:8) LPC(16:0) LPC(16:1) LPC(18:0) LPC(18:1) LPC(18:2)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## LPC(20:4) LPC(22:6) PC-O(34:1) PC-O(34:2) PC-O(34:3) PC-O(34:4) PC-O(35:3)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC-O(36:2) PC-O(36:3) PC-O(36:4) PC-O(36:5) PC-O(38:4) PC-O(38:5) PC-O(38:6)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC-O(40:4) PC-O(40:5) PC-O(40:6) PC-O(42:4) PC-O(42:5) PC-O(42:6) PC-O(44:6)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC(29:0) PC(30:0) PC(32:0) PC(32:1) PC(32:2) PC(33:0) PC(33:1)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC(33:4) PC(34:1) PC(34:3) PC(34:4) PC(35:1) PC(35:2) PC(35:3)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC(35:4) PC(36:2) PC(36:3) PC(36:4) PC(37:1) PC(37:2) PC(37:4)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC(37:5) PC(37:6) PC(38:0) PC(38:1) PC(38:3) PC(38:4) PC(38:5)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC(38:6) PC(39:3) PC(39:5) PC(39:6) PC(40:2) PC(41:4) PC(42:7)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## PC(43:6) PC(44:1) PC(44:3) Cer(42:1) Cer(42:2) SM(32:1) SM(32:2)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## SM(33:1) SM(33:2) SM(34:1) SM(34:2) SM(35:1) SM(36:1) SM(36:2)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## SM(37:1) SM(38:1) SM(38:2) SM(39:1) SM(39:2) SM(40:2) SM(41:1)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## SM(41:2) SM(42:1) SM(42:2) SM(42:3) SM(43:1) SM(43:2) SM(44:1)
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
Creating the CSV file that will be used by MetaboAnalystR
write.table(ft, 'C:/ME-CFS/ME-CFS_MeTaQuaC_exported_data.csv', sep=',',col.names = NA, row.names = TRUE, quote = TRUE, na="NA")
See the guide at https://www.metaboanalyst.ca/docs/RTutorial.xhtml
library(amap)
library(RJSONIO)
library("MetaboAnalystR")
packageVersion("MetaboAnalystR")
## [1] '3.0.3'
mSet<-InitDataObjects("conc", "stat", FALSE)
## Starting Rserve...
## "C:\Users\julc\DOCUME~1\R\WIN-LI~1\4.0\Rserve\libs\x64\Rserve.exe" --no-save
## [1] "MetaboAnalyst R objects initialized ..."
mSet<-Read.TextData(mSet, "C:/ME-CFS/ME-CFS_MeTaQuaC_exported_data.csv", "rowu", "disc")
3.1: Performing data processing - Data checking
mSet<-SanityCheckData(mSet)
## [1] "Successfully passed sanity check!"
## [2] "Samples are not paired."
## [3] "2 groups were detected in samples."
## [4] "Only English letters, numbers, underscore, hyphen and forward slash (/) are allowed."
## [5] "<font color=\"orange\">Other special characters or punctuations (if any) will be stripped off.</font>"
## [6] "All data values are numeric."
## [7] "A total of 19 (1.2%) missing values were detected."
## [8] "<u>By default, missing values will be replaced by 1/5 of min positive values of their corresponding variables</u>"
## [9] "Click the <b>Skip</b> button if you accept the default practice;"
## [10] "Or click the <b>Missing value imputation</b> to use other methods."
3.2: Performing data processing - Minimum Value Replacing
Missing values are replaced by 1/5 of the minimum positive value of the corresponding variable.
mSet<-ReplaceMin(mSet)
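The replacement rule stated above can be illustrated in base R. This is only a minimal sketch on toy data: `replace_min_sketch` is a hypothetical helper written for illustration, not a MetaboAnalystR function, and it handles only `NA` values as described in the sanity-check message.

```r
# Hypothetical helper (not part of MetaboAnalystR): for each column, replace
# missing values by 1/5 of the smallest positive value in that column.
replace_min_sketch <- function(x) {
  x[] <- lapply(x, function(col) {
    min_pos <- min(col[col > 0], na.rm = TRUE)  # smallest positive value
    col[is.na(col)] <- min_pos / 5              # impute missing values only
    col
  })
  x
}

# Toy concentrations, invented for illustration
toy <- data.frame(A = c(10, NA, 5), B = c(2, 4, NA))
filled <- replace_min_sketch(toy)
print(filled)
```

In the toy data, the missing value in column A becomes 5/5 and the one in column B becomes 2/5.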
3.3: Performing data processing - Transformation and Scaling
“Normalization” in MetaboAnalyst means row-wise (sample-based) normalization; none is applied here.
The data are then transformed (generalized log) and scaled column-wise (feature-based) with Pareto scaling.
Pareto scaling mean-centers each variable and divides it by the square root of its standard deviation.
mSet<-PreparePrenormData(mSet)
mSet<-Normalization(mSet, "NULL", "LogNorm", "ParetoNorm", ratio=FALSE, ratioNum=20)
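To make the two operations concrete, here is a minimal base-R sketch on simulated data. The glog form, the log base and the choice of `lambda` are assumptions made for illustration and may differ from what MetaboAnalystR does internally; `glog_sketch` and `pareto_sketch` are hypothetical helper names.

```r
# Assumed glog form with tuning constant `lambda`; for x much larger than
# lambda it approaches log10(x). Base 10 is an illustrative choice.
glog_sketch <- function(x, lambda) log10((x + sqrt(x^2 + lambda^2)) / 2)

# Pareto scaling: mean-center each column, then divide by sqrt(SD) of that column
pareto_sketch <- function(m) {
  centered <- sweep(m, 2, colMeans(m), "-")
  sweep(centered, 2, sqrt(apply(m, 2, sd)), "/")
}

set.seed(1)
conc <- matrix(rlnorm(40, meanlog = 3), nrow = 10,
               dimnames = list(NULL, paste0("cmpd", 1:4)))  # simulated concentrations
lambda <- min(conc) / 10                                    # assumed choice of constant
transformed <- glog_sketch(conc, lambda)
scaled <- pareto_sketch(transformed)

# After Pareto scaling every column has mean 0 and SD equal to sqrt(original SD)
print(round(colMeans(scaled), 10))
```

Unlike unit-variance scaling, Pareto scaling leaves high-variance compounds with somewhat more weight than low-variance ones.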
Visualizing (saving the figures in your working directory)
mSet<-PlotNormSummary(mSet, "norm_0_", "png", 300, width=NA)
mSet<-PlotSampleNormSummary(mSet, "snorm_0_", "png", 300, width=NA)
Saving your tables in your working directory
mSet<-SaveTransformedData(mSet)
mSet<-FC.Anal.unpaired(mSet, 1.3, 0)
mSet<-PlotFC(mSet, "fc_0_", "png", 300, width=NA)
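The fold-change threshold of 1.3 is applied in both directions, so the lower bound is its reciprocal (1/1.3, about 0.77). The sketch below shows that logic on invented toy data; it assumes the fold change is the ratio of group means (controls/patients) and does not reproduce the internals of `FC.Anal.unpaired`.

```r
fc_thresh <- 1.3

# Toy data, invented for illustration
toy <- data.frame(
  Type  = factor(c("Controls", "Controls", "Patients", "Patients")),
  cmpdA = c(10, 12, 7, 8),
  cmpdB = c(5, 6, 5.5, 5.8),
  cmpdC = c(2, 3, 4, 5)
)

# Group means per compound, then the controls/patients ratio
ctrl_mean <- colMeans(toy[toy$Type == "Controls", -1])
pat_mean  <- colMeans(toy[toy$Type == "Patients", -1])
fc <- ctrl_mean / pat_mean

# A compound is flagged when its ratio lies outside [1/1.3, 1.3]
flagged <- fc > fc_thresh | fc < 1 / fc_thresh
print(round(fc, 3))
print(names(fc)[flagged])
```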
mSet <- Ttests.Anal(mSet, FALSE, 0.1, FALSE, TRUE, TRUE)
## [1] "A total of 0 significant features were found."
mSet <- PlotTT(mSet, "tt_0_", "png", 300, width=NA)
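The effect of FDR correction can be illustrated with base R's `p.adjust`. The p-values below are invented and are not the study's results; the Benjamini-Hochberg method is assumed here as the FDR procedure.

```r
# Invented raw p-values, for illustration only
raw_p <- c(cmpdA = 0.03, cmpdB = 0.045, cmpdC = 0.20, cmpdD = 0.50, cmpdE = 0.65)

# Benjamini-Hochberg adjustment (assumed to correspond to the "FDR" correction)
fdr_p <- p.adjust(raw_p, method = "BH")

print(data.frame(raw = raw_p, fdr = round(fdr_p, 4)))
sum(raw_p < 0.05)  # compounds below 0.05 before correction
sum(fdr_p < 0.1)   # compounds below the 0.1 threshold after correction
```

In this toy case two compounds pass the raw threshold and none passes after correction, which is the kind of pattern that motivates showing the volcano plot with raw p-values.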
mSet<-Volcano.Anal(mSet, FALSE, 1.3, 0, 0.75,F, 0.05, TRUE, "raw")
mSet<-PlotVolcano(mSet, "volcano_0_",1, "png", 300, width=NA)
mSet<-PCA.Anal(mSet)
mSet<-PlotPCAPairSummary(mSet, "pca_pair_0_", "png", 300, width=NA, 5)
mSet<-PlotPCAScree(mSet, "pca_scree_0_", "png", 300, width=NA, 5)
mSet<-PlotPCA2DScore(mSet, "pca_score2d_0_", "png", 300, width=NA, 1,2,0.95,0,0)
mSet<-PlotPCALoading(mSet, "pca_loading_0_", "png", 300, width=NA, 1,2);
mSet<-PlotPCABiplot(mSet, "pca_biplot_0_", "png", 300, width=NA, 1,2)
mSet<-PlotPCA3DLoading(mSet, "pca_loading3d_0_", "json", 1,2,3)
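For readers who want to see what the scores and scree plots summarize, here is a minimal sketch using base R's `prcomp` on simulated data. It assumes a mean-centered PCA without additional scaling and is not a reproduction of `PCA.Anal`.

```r
set.seed(42)
# Simulated, already-scaled data: 12 samples x 5 compounds
x <- matrix(rnorm(60), nrow = 12,
            dimnames = list(paste0("sample", 1:12), paste0("cmpd", 1:5)))

pca <- prcomp(x, center = TRUE, scale. = FALSE)

# Scree plot input: proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Scores plot input: sample coordinates on the first two components
scores <- pca$x[, 1:2]

print(round(var_explained, 3))
head(scores)
```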
8.1: full list of compounds
mSet<-PlotHeatMap(mSet, "heatmap_0_", "png", 300, width=NA, "norm", "row", "euclidean", "complete","bwm", "overview", T, T, NULL, T, F)
8.2: only the 10 compounds with the lowest t-test p-values before FDR correction, i.e. here <0.15.
mSet<-PlotSubHeatMap(mSet, "heatmap_10comp_", "png", 300, width=NA, "norm", "row", "euclidean", "complete","bwm", "tanova", 10, "overview", T, T, T, F)
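The selection of the 10 compounds with the lowest raw p-values can be sketched in base R as follows. The data are simulated and the equal-variance t-test is an assumption; this only illustrates the ranking idea and not how `PlotSubHeatMap` selects features internally.

```r
set.seed(7)
n_per_group <- 6
type <- factor(rep(c("Controls", "Patients"), each = n_per_group))

# Simulated scaled data: 12 samples x 15 compounds
values <- matrix(rnorm(2 * n_per_group * 15), ncol = 15,
                 dimnames = list(NULL, paste0("cmpd", 1:15)))

# Raw (uncorrected) two-sample t-test p-value for each compound
p_raw <- apply(values, 2, function(v) {
  t.test(v[type == "Controls"], v[type == "Patients"], var.equal = TRUE)$p.value
})

# Keep the 10 compounds with the smallest raw p-values
top10 <- names(sort(p_raw))[1:10]
sub_values <- values[, top10]
print(round(p_raw[top10], 3))
```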
pdf(file="PC364_PC-O344_boxplots.pdf", width=10, height=5)
p1 <- ggplot2::ggplot(ft, ggplot2::aes(x = Type, y = ft[,109], color = Type)) +
ggplot2::geom_boxplot(outlier.shape = NA) +
ggplot2::geom_jitter(shape = 16, position = position_jitter(0.2)) +
ggplot2::labs(y = "Concentrations in ng/mL") +
ggplot2::labs(title = "PC(36:4)") +
ggplot2::scale_x_discrete(labels = c(Controls = "Controls", Patients = "Patients")) +
ggplot2::scale_colour_manual(values = c(rgb(96, 108, 158, maxColorValue = 255),
rgb(193, 88, 88, maxColorValue = 255))) +
ggplot2::theme_bw() +
ggplot2::theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black", size = 0.75),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20,
margin = margin(t = 0, r = 20, b = 0, l = 0)),
axis.text = element_text(colour = "black", size = 18),
plot.title = element_text(hjust = 0.5, size = 20),
legend.position = "none")
p2 <- ggplot2::ggplot(ft, ggplot2::aes(x = Type, y = ft[,76], color = Type)) +
ggplot2::geom_boxplot(outlier.shape = NA) +
ggplot2::geom_jitter(shape = 16, position = position_jitter(0.2)) +
ggplot2::labs(y = "Concentrations in ng/mL") +
ggplot2::labs(title = "PC-O(34:4)") +
ggplot2::scale_x_discrete(labels = c(Controls = "Controls", Patients = "Patients")) +
ggplot2::scale_colour_manual(values = c(rgb(96, 108, 158, maxColorValue = 255),
rgb(193, 88, 88, maxColorValue = 255))) +
ggplot2::theme_bw() +
ggplot2::theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black", size = 0.75),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20,
margin = margin(t = 0, r = 20, b = 0, l = 0)),
axis.text = element_text(colour = "black", size = 18),
plot.title = element_text(hjust = 0.5, size = 20),
legend.position = "none")
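# grid.arrange() comes from gridExtra, which no library() call above attaches, so it is called with its namespace
gridExtra::grid.arrange(p1, p2, nrow = 1)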
dev.off()
## png
## 2