Here I want to see whether the ecological niche assigned to each species (Forest vs. Savanna) is the same in Gorel et al. 2022 and Aleman et al. 2020, or whether there are any major differences.
First we need to load the two datasets.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.0.5
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.8
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## Warning: package 'ggplot2' was built under R version 4.0.4
## Warning: package 'tibble' was built under R version 4.0.4
## Warning: package 'tidyr' was built under R version 4.0.5
## Warning: package 'readr' was built under R version 4.0.5
## Warning: package 'purrr' was built under R version 4.0.3
## Warning: package 'dplyr' was built under R version 4.0.5
## Warning: package 'stringr' was built under R version 4.0.3
## Warning: package 'forcats' was built under R version 4.0.4
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(readxl)
## Warning: package 'readxl' was built under R version 4.0.3
Biomes.Gorel <- read.csv("../Anais_Bioclim_group.csv", sep = ";")
Biomes.Aleman <- read_excel("../../Papers/Aleman et al. 2020. Sup_Mat/pnas.2011515117.sd03.xlsx") %>% as.data.frame()
str(Biomes.Gorel)
## 'data.frame': 4142 obs. of 5 variables:
## $ Family : chr "Fabaceae" "Fabaceae" "Fabaceae" "Cucurbitaceae" ...
## $ Genus : chr "Abrus" "Abrus" "Abrus" "Acanthosicyos" ...
## $ Species : chr "Abrus canescens" "Abrus fruticulosus" "Abrus precatorius" "Acanthosicyos naudinianus" ...
## $ Biome : chr "Savanna" "Savanna" "Savanna" "Savanna" ...
## $ Subgroups: chr "Cold savanna" "Cold savanna" "Cold savanna" "Cold savanna" ...
str(Biomes.Aleman)
## 'data.frame': 1707 obs. of 5 variables:
## $ species: chr "Acacia_abyssinica" "Acacia_amythethophylla" "Acacia_ataxacantha" "Acacia_brevispica" ...
## $ genus : chr "Acacia" "Acacia" "Acacia" "Acacia" ...
## $ group : chr "Generalist" "Savanna" "Savanna" "Savanna" ...
## $ indval : num 0.00875 0.01342 0.11074 0.02914 0.07047 ...
## $ pvalue : num 1 0.0224 0.0001 0.0368 0.0001 ...
In their paper, Gorel et al. define a “Coastal” group, which isn’t strictly savanna but clusters more closely with savanna species than with forest species. For this comparison I therefore reclassify all “Coastal” species as “Savanna”.
Biomes.Gorel$Biome <- gsub("Coastal", "Savanna", Biomes.Gorel$Biome)
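A quick way to sanity-check a recoding like this is sketched below on a made-up toy data frame (the `toy` object and its values are hypothetical, not taken from Gorel et al.). It assumes the goal is to rewrite only values that are exactly "Coastal", and uses `table()` to confirm that no "Coastal" labels remain afterwards.

```r
library(dplyr)

# Hypothetical toy data standing in for Biomes.Gorel (values are illustrative only)
toy <- data.frame(
  Species = c("sp_a", "sp_b", "sp_c", "sp_d"),
  Biome   = c("Savanna", "Coastal", "Forest", "Coastal"),
  stringsAsFactors = FALSE
)

# Recode only values that are exactly "Coastal"; everything else is left unchanged
toy$Biome <- if_else(toy$Biome == "Coastal", "Savanna", toy$Biome)

# Count species per biome after recoding
biome_counts <- table(toy$Biome)
print(biome_counts)
```

Unlike `gsub()`, which rewrites any value containing the pattern, the exact comparison leaves labels that merely contain "Coastal" as part of a longer string untouched.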
The two datasets differ in length, so we need to match the species present in Gorel et al. 2022 with those present in Aleman et al. 2020.
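# Replace spaces with underscores so species names match the format used by Aleman et al.
Biomes.Gorel$Species <- gsub(pattern = " ", replacement = "_", x = Biomes.Gorel$Species)
# Keep only the Gorel et al. species that also occur in Aleman et al.
Biomes.Gorel.2 <- Biomes.Gorel %>% filter(Species %in% Biomes.Aleman$species)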
So how many species match up?
dim(Biomes.Gorel.2)
## [1] 1426 5
That means 281 of the 1707 species in Aleman et al. 2020 don’t match up. Which ones are these?
Biomes.Aleman %>% filter(!species %in% Biomes.Gorel.2$Species) %>% dplyr::select(species)
I’ve double checked and it seems like they genuinely aren’t present in Gorel et al. 2022.
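The matching step can also be expressed with dplyr's filtering joins, which makes it easy to check that matched and unmatched species add up to the full table. This is only a sketch on hypothetical toy data (`gorel_toy`, `aleman_toy` and the placeholder species names are invented for illustration), not the analysis itself.

```r
library(dplyr)

# Hypothetical toy data; species names are placeholders, not real records
gorel_toy <- data.frame(
  Species = c("Genus one", "Genus two", "Genus three"),
  Biome   = c("Savanna", "Forest", "Savanna"),
  stringsAsFactors = FALSE
)
aleman_toy <- data.frame(
  species = c("Genus_one", "Genus_two", "Genus_four"),
  group   = c("Savanna", "Forest", "Generalist"),
  stringsAsFactors = FALSE
)

# Normalise names to the underscore format before comparing
gorel_toy$Species <- gsub(" ", "_", gorel_toy$Species, fixed = TRUE)

# Rows of the Aleman-style table that have a match in the Gorel-style table
matched <- semi_join(aleman_toy, gorel_toy, by = c("species" = "Species"))
# Rows of the Aleman-style table without a match
unmatched <- anti_join(aleman_toy, gorel_toy, by = c("species" = "Species"))

# Matched and unmatched rows should add up to the full table
stopifnot(nrow(matched) + nrow(unmatched) == nrow(aleman_toy))
print(unmatched$species)
```

With the real data, the same check would correspond to 1426 matched plus 281 unmatched species out of 1707.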
That still leaves us with 1426 species. First we join the two data frames. Aleman et al. 2020 also classify quite a few species as “Generalist”; we remove this category because it has no equivalent in Gorel et al. 2022.
Biomes.Aleman <- plyr::rename(Biomes.Aleman, c("species" = "Species"))
merged <- left_join(Biomes.Gorel.2, Biomes.Aleman, by = "Species") %>%
  dplyr::select(Species, Biome, group) %>%
  filter(group != "Generalist")
str(merged)
## 'data.frame': 1150 obs. of 3 variables:
## $ Species: chr "Acokanthera_schimperi" "Adansonia_digitata" "Afrocanthium_lactescens" "Afrocanthium_mundianum" ...
## $ Biome : chr "Savanna" "Savanna" "Savanna" "Savanna" ...
## $ group : chr "Savanna" "Savanna" "Savanna" "Savanna" ...
Now I want to know how many of the species categorised as “Savanna” in Gorel et al. 2022 are also categorised as “Savanna” in Aleman et al. 2020. The same goes for “Forest”.
merged$sum <- if_else(merged$Biome == merged$group, 1, 0)
sum(merged$sum)
## [1] 1069
So out of 1150 species, 1069 are classified identically in both papers; that’s about 93%.
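The percentage can be reproduced directly from the two counts reported above, which come to just under 93%. The second half of the sketch shows, on hypothetical toy vectors (`gorel` and `aleman` are invented labels), that taking the mean of a logical comparison gives the same kind of agreement proportion without an intermediate 0/1 column.

```r
# Counts reported above: 1069 agreeing species out of 1150
n_agree <- 1069
n_total <- 1150
agreement_pct <- round(100 * n_agree / n_total, 1)
print(agreement_pct)

# Hypothetical toy labels: the mean of a logical vector is the proportion of TRUEs
gorel  <- c("Savanna", "Forest", "Savanna", "Forest")
aleman <- c("Savanna", "Forest", "Forest", "Forest")
toy_agreement <- mean(gorel == aleman)
print(toy_agreement)
```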
Which ones don’t match up?
merged <- merged %>% dplyr::rename(Gorel_Biome = Biome, Aleman_Biome = group)
merged %>% filter(sum == 0)
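Beyond listing the individual mismatches, a cross-tabulation shows in which direction the two classifications disagree. The sketch below uses a hypothetical toy version of `merged` (the `merged_toy` object and its values are invented for illustration) with the renamed columns `Gorel_Biome` and `Aleman_Biome`.

```r
library(dplyr)

# Hypothetical toy version of `merged` after renaming (values are illustrative only)
merged_toy <- data.frame(
  Species      = c("sp_a", "sp_b", "sp_c", "sp_d", "sp_e"),
  Gorel_Biome  = c("Savanna", "Forest", "Savanna", "Forest", "Forest"),
  Aleman_Biome = c("Savanna", "Forest", "Forest", "Savanna", "Forest"),
  stringsAsFactors = FALSE
)

# Cross-tabulate the two classifications; off-diagonal cells are disagreements
confusion <- table(Gorel = merged_toy$Gorel_Biome, Aleman = merged_toy$Aleman_Biome)
print(confusion)

# Count each direction of disagreement separately
mismatch_counts <- merged_toy %>%
  filter(Gorel_Biome != Aleman_Biome) %>%
  count(Gorel_Biome, Aleman_Biome)
print(mismatch_counts)
```

Applied to the real `merged` data frame, the off-diagonal cells would sum to the 81 species (1150 minus 1069) that are classified differently.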