| Title: | Test for Differences in Diversification Rates over Time |
|---|---|
| Description: | Employ time-calibrated phylogenies and trait/range data to test for differences in diversification rates over evolutionary time. Extend the STRAPP test from BAMMtools::traitDependentBAMM() to any time step along phylogenies. See inst/COPYRIGHTS for details on third-party code. |
| Authors: | Maël Doré [aut, cre, cph] (ORCID: <https://orcid.org/0000-0003-3242-9242>), Dan Rabosky [ctb], Mike Grundler [ctb], Huateng Huang [ctb], Pascal Title [ctb], Liam Revell [ctb], Nicholas J. Matzke [ctb] |
| Maintainer: | Maël Doré <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-05-20 09:09:30 UTC |
| Source: | https://github.com/maeldore/deepstrapp |
Template file for BAMM diversification analyses provided as character strings.
Source: bamm-2.5.0 References: http://bamm-project.org/; https://github.com/macroevolution/bamm
data(BAMM_template_diversification)data(BAMM_template_diversification)
A vector of 260 character strings.
A vector of 260 character strings that can be displayed with print(BAMM_template_diversification).
This is the template used to generate the 'config_file.txt' controlling settings used for a BAMM run.
It provides detailed explanations of the additional_BAMM_settings that can be used in prepare_diversification_data()
to customize the BAMM run.
It is called internally by prepare_diversification_data() to produce
the custom 'config_file' used in the subsequent BAMM run.
Authors: Daniel Rabosky (BAMM) & Pascal Title ({BAMMtools}). Modified by Maël Doré for deepSTRAPP.
BAMM software website: http://bamm-project.org/
These functions converts a Biogeographic Stochastic Map (BSM) output from BioGeoBEARS into
a simmap object from R package {phytools} (See phytools::make.simmap()).
They require a model fit with BioGeoBEARS::bears_optim_run() and the output of a Biogeographic Stochastic Mapping
performed with BioGeoBEARS::runBSM() to produce simmap objects as phylogenies with the associated
mapping of range evolution along branches across simulations.
BSM_to_phytools_simmap(): Produce one simmap for the required simulation (index of the simulation provided with sim_index).
BSMs_to_phytools_simmaps(): Produce all simmap objects for all simulations stored in a unique multiSimmap object.
Initial functions in R package BioGeoBEARS by Nicholas J. Matzke:
BioGeoBEARS::BSM_to_phytools_SM()
BioGeoBEARS::BSMs_to_phytools_SMs()
BSM_to_phytools_simmap(model_fit, phylo, BSM_output, sim_index) BSMs_to_phytools_simmaps(model_fit, phylo, BSM_output)BSM_to_phytools_simmap(model_fit, phylo, BSM_output, sim_index) BSMs_to_phytools_simmaps(model_fit, phylo, BSM_output)
model_fit |
A BioGeoBEARS results object, produced by ML inference via |
phylo |
Time-calibrated phylogeny used in the BioGeoBEARS analyses to produce the historical biogeographic inference
and run the Biogeographic Stochastic Mapping. Object of class |
BSM_output |
A list with two objects, a cladogenetic events table and an anagenetic events table, as the result of
Biogeographic Stochastic Mapping conducted with |
sim_index |
Integer. Index of the biogeographic simulation targeted to produce the |
These functions are slight adaptations of original functions from the R Package BioGeoBEARS by N. Matzke.
Initial functions: BioGeoBEARS::BSM_to_phytools_SM() BioGeoBEARS::BSMs_to_phytools_SMs()
Changes:
Solves issue with differences in ranges allowed across time-strata.
Requires directly the output of BioGeoBEARS::runBSM() instead of separated cladogenetic and anagenetic event tables.
Update the documentation.
The BSM_to_phytools_simmap() function returns a list with two elements:
$simmap A unique simmap for a given biogeographic simulation as an object of classes c("simmap", "phylo").
This is a modified {ape} tree with additional elements to report range mapping, model parameters and likelihood.
$maps A list of named numerical vectors. Provides the mapping of ranges along each remaining edge.
Names are the ranges. Values are residence times in each state across segments
$mapped.edge A numerical matrix. Provides the evolutionary time spent across ranges (columns) along the edges (rows).
row.names() are the node ID at the rootward and tipward ends of each edge.
$Q Numerical matrix. The transition rates across ranges calculated from the ML parameter estimates of the model.
$logL Numeric. The log-likelihood of the data under the ML model.
$residence_times Data.frame with two rows. Summarizes the residence time spent in each range along all branches,
in (raw) evolutionary time (i.e., branch lengths), and in percentage (perc).
The BSMs_to_phytools_simmaps() function loop around the BSM_to_phytools_simmap() function to aggregate all simmaps
from all biogeographic simulations in a unique list of classes c("multiSimmap", "multiPhylo").
Each element in the $simmap of a biogeographic simulation obtained with BSM_to_phytools_simmap().
$residence_times summary data.frames are not preserved.
The R package BioGeoBEARS is needed for this function to work with biogeographic data.
Please install it manually from: https://github.com/nmatzke/BioGeoBEARS.
The phytools functions, like phytools::countSimmap(), will only count the anagenetic events
(range transitions occurring along branches) as it was written assuming purely anagenetic models.
It remains possible to extract cladogenetic events (range transitions occurring at speciation)
by comparing the last-state-below-a-node with the descendant-pairs-above-a-node.
However, it is recommended to use the built-in functions from BioGeoBEARS to summarize
the biogeographic history based on the tables of cladogenetic and anagenetic events obtained
from BioGeoBEARS::runBSM(). simmap objects should primarily be considered as a tool for visualization.
Associated functions in R package BioGeoBEARS:
BioGeoBEARS::simulate_source_areas_ana_clado(): To select randomly a unique area source for transition from a multi-area state to a single area.
BioGeoBEARS::get_dmat_times_from_res(): To generate matrices of range expansion from source area to destination area.
BioGeoBEARS::count_ana_clado_events(): To count the number and type of events from BSM tables.
BioGeoBEARS::hist_event_counts(): To plot histograms of event counts across BSM tables.
Please note carefully that area-to-area dispersal events are not identical with the state transitions.
For example, a state can be a geographic range with multipleareas, but under the logic of DEC-type models,
a range-expansion event like ABC->ABCD actually means that a dispersal happened from some specific area (A, B, or C)
to the new area. BSMs track this area-to-area sourcing in its cladogenetic and anagenetic event tables,
at least if BioGeoBEARS::simulate_source_areas_ana_clado() has been run on the output of BioGeoBEARS::runBSM().
Nicholas J. Matzke. Contact: [email protected]
Changes by Maël Doré (see Details)
For BioGeoBEARS: Matzke, Nicholas J. (2018). BioGeoBEARS: BioGeography with Bayesian (and likelihood) Evolutionary Analysis with R Scripts. version 1.1.1, published on GitHub on November 6, 2018. doi:10.5281/zenodo.1478250. Website: http://phylo.wikidot.com/biogeobears.
phytools::countSimmap() phytools::make.simmap()
BioGeoBEARS::simulate_source_areas_ana_clado() BioGeoBEARS::get_dmat_times_from_res()
BioGeoBEARS::count_ana_clado_events() BioGeoBEARS::hist_event_counts()
if (deepSTRAPP::is_dev_version()) { ## Run only if you have R package 'BioGeoBEARS' installed. # Please install it manually from: https://github.com/nmatzke/BioGeoBEARS") ## Load phylogeny and tip data library(phytools) data(eel.tree) # (May take several minutes to run) ## Load directly output of prepare_trait_data() run on biogeographic data data(eel_biogeo_data, package = "deepSTRAPP") ## Convert BSM output into a unique simmap, including residence times simmap_1 <- BSM_to_phytools_simmap(model_fit = eel_biogeo_data$best_model_fit, phylo = eel.tree, BSM_output = eel_biogeo_data$BSM_output, sim_index = 1) # Explore output str(simmap_1, max.level = 1) # Print residence times in each range simmap_1$residence_times # Plot simmap plot(simmap_1$simmap) ## Convert BSM output into all simmaps in a multiSimmap/multiPhylo object all_simmaps <- BSMs_to_phytools_simmaps(model_fit = eel_biogeo_data$best_model_fit, phylo = eel.tree, BSM_output = eel_biogeo_data$BSM_output) # Explore output str(all_simmaps, max.level = 1) # Plot simmap n°1 plot(all_simmaps[[1]]) }if (deepSTRAPP::is_dev_version()) { ## Run only if you have R package 'BioGeoBEARS' installed. # Please install it manually from: https://github.com/nmatzke/BioGeoBEARS") ## Load phylogeny and tip data library(phytools) data(eel.tree) # (May take several minutes to run) ## Load directly output of prepare_trait_data() run on biogeographic data data(eel_biogeo_data, package = "deepSTRAPP") ## Convert BSM output into a unique simmap, including residence times simmap_1 <- BSM_to_phytools_simmap(model_fit = eel_biogeo_data$best_model_fit, phylo = eel.tree, BSM_output = eel_biogeo_data$BSM_output, sim_index = 1) # Explore output str(simmap_1, max.level = 1) # Print residence times in each range simmap_1$residence_times # Plot simmap plot(simmap_1$simmap) ## Convert BSM output into all simmaps in a multiSimmap/multiPhylo object all_simmaps <- BSMs_to_phytools_simmaps(model_fit = eel_biogeo_data$best_model_fit, phylo = eel.tree, BSM_output = eel_biogeo_data$BSM_output) # Explore output str(all_simmaps, max.level = 1) # Plot simmap n°1 plot(all_simmaps[[1]]) }
Carries out the appropriate statistical method to test for a relationship between
diversification rates and trait data for a given point in the past (i.e. the focal_time).
Tests are based on block-permutations: rates data are randomized across tips following blocks
defined by the diversification regimes identified on each tip (typically from a BAMM).
Such tests are called STructured RAte Permutations on Phylogenies (STRAPP) as described in Rabosky, D. L., & Huang, H. (2016). A robust semi-parametric test for detecting trait-dependent diversification. Systematic biology, 65(2), 181-193. doi:10.1093/sysbio/syv066.
The function is an extension of the original BAMMtools::traitDependentBAMM() function used to
carry out STRAPP test on extant time-calibrated phylogenies.
Tests can be carried out on speciation, extinction and net diversification rates.
deepSTRAPP::compute_STRAPP_test_for_focal_time() can handle three types of statistical tests depending on the type of trait data provided:
Tests for correlations between trait and rates carried out with deepSTRAPP::compute_STRAPP_test_for_continuous_data().
The associated test is the Spearman's rank correlation test (See stats::cor.test).
For categorical and biogeographic trait data that have only two states (ex: 'Nearctic' vs. 'Neotropics').
Tests for differences in rates between states are carried out with deepSTRAPP::compute_STRAPP_test_for_binary_data().
The associated test is the Mann-Whitney-Wilcoxon rank-sum test (See stats::wilcox.test).
For categorical and biogeographic trait data with more than two states (ex: 'No leg' vs. 'Two legs' vs. 'Four legs').
Tests for differences in rates between states are carried out with deepSTRAPP::compute_STRAPP_test_for_multinominal_data().
The associated test for all states is the Kruskal-Wallis H test (See stats::kruskal.test).
If posthoc_pairwise_tests = TRUE, post hoc pairwise tests between pairs of states will be carried out too.
The associated test for post hoc pairwise tests is the Dunn's post hoc pairwise rank-sum test (See dunn.test::dunn.test).
compute_STRAPP_test_for_focal_time( BAMM_object, trait_data_list, rate_type = "net_diversification", seed = NULL, nb_permutations = NULL, replace_samples = FALSE, alpha = 0.05, two_tailed = TRUE, one_tailed_hypothesis = NULL, posthoc_pairwise_tests = FALSE, p.adjust_method = "none", return_perm_data = FALSE, nthreads = 1, print_hypothesis = TRUE )compute_STRAPP_test_for_focal_time( BAMM_object, trait_data_list, rate_type = "net_diversification", seed = NULL, nb_permutations = NULL, replace_samples = FALSE, alpha = 0.05, two_tailed = TRUE, one_tailed_hypothesis = NULL, posthoc_pairwise_tests = FALSE, p.adjust_method = "none", return_perm_data = FALSE, nthreads = 1, print_hypothesis = TRUE )
BAMM_object |
Object of class |
trait_data_list |
List obtained from |
rate_type |
A character string specifying the type of diversification rates to use. Must be one of 'speciation', 'extinction' or 'net_diversification' (default). |
seed |
Integer. Set the seed to ensure reproducibility. Default is |
nb_permutations |
Integer. To select the number of random permutations to perform during the tests. If NULL (default), all posterior samples will be used once. |
replace_samples |
Logical. To specify whether to allow 'replacement' (i.e., multiple use) of a posterior sample when drawing samples used to carry out the test. Default is |
alpha |
Numerical. Significance level to use to compute the |
two_tailed |
Logical. To define the type of tests. If
|
one_tailed_hypothesis |
A character string specifying the alternative hypothesis in the one-tailed test. For continuous data, it is either "negative" or "positive" correlation. For binary data, it lists the trait states with states ordered in increasing rates under the alternative hypothesis, separated by a greater-than such as c('A > B'). |
posthoc_pairwise_tests |
Logical. Only for multinominal data (with more than two states). If |
p.adjust_method |
A character string. Only for multinominal data (with more than two states). It specifies the type of correction to apply to the p-values
in the post hoc pairwise tests to account for multiple comparisons. See |
return_perm_data |
Logical. Whether to return the stats data computed from the posterior samples for observed and permuted data in the output.
This is needed to plot the histogram of the null distribution used to assess significance of the test with |
nthreads |
Integer. Number of threads to use for paralleled computing of the tests across the permutations. The R package |
print_hypothesis |
Logical. Whether to print information on what test is carried out, detailing the null and alternative hypotheses,
and what significant level is used to rejected or not the null hypothesis. Default is |
These set of functions carries out the STructured RAte Permutations on Phylogenies (STRAPP) test as defined in Rabosky, D. L., & Huang, H. (2016). A robust semi-parametric test for detecting trait-dependent diversification. Systematic biology, 65(2), 181-193.
It is an extension of the original BAMMtools::traitDependentBAMM() function used to
carry out STRAPP test on extant time-calibrated phylogenies, but allowing here to test for
differences/correlations at any point in the past (i.e. the focal_time).
It takes an object of class "bammdata" (BAMM_object) that was updated such as
its diversification rates ($tipLambda and $tipMu) and regimes ($tipStates) are reflecting
values observed at at a specific time in the past (i.e. the $focal_time).
Similarly, it takes a list (trait_data_list) that provides $trait_data as observed on branches
at the same focal_time than the diversification rates and regimes.
A STRAPP test is carried out by drawing a random set of posterior samples from the BAMM_object, then randomly permuting rates
across blocks of tips defined by the macroevolutionary regimes. Test statistics are then computed across the initial observed data
and the permuted data for each sample.
In a two-tailed test, the p-value is the proportion of posterior samples in which the test stats is as extreme in the permuted than in the observed data.
In a one-tailed test, the p-value is the proportion of posterior samples in which the test stats is higher in the permuted than in the observed data.
———- Major changes compared to BAMMtools::traitDependentBAMM() ———-
Allow to choose if random sampling of posterior configurations must be done with replacement or not with replace_samples.
Add post hoc pairwise tests (Dunn test) for multinominal data. Use posthoc_pairwise_tests = TRUE.
Provide outputs tailored for histogram plots plot_histogram_STRAPP_test_for_focal_time()
and p-value time-series plots plot_STRAPP_pvalues_over_time().
Add prints detailing what test is carried out, what are the null and alternative hypotheses,
and what significant level is used to rejected or not the null hypothesis. (Enabled with print_hypothesis = TRUE).
Split the function in multiple sub-functions according to the type of data ($trait_data_type).
Prevent using Pearson's correlation tests and applying log-transformation for continuous data. The rationale is that there is no reason to assume that tip rates are distributed normally or log-normally. Thus, a Spearman's rank correlation test is favored.
The function returns a list with at least eight elements.
Summary elements for the main test:
$estimate Named numerical. Value of the test statistic used to assess significance of the test
according to the significance level provided (alpha). The test is significant if $estimate is higher than zero.
$stats_median Numerical. Median value of the distribution of test statistics across all selected posterior samples.
$p-value Numerical. P-value of the test. The test is considered significant if $p-value is lower than alpha.
$method Character string. The statistical method used to carry out the test.
$rate_type Character string. The type of diversification rates tested. One of 'speciation', 'extinction' or 'net_diversification'.
$trait_data_type Character string. The type of trait data as found in 'trait_data_list$trait_data_type'. One of 'continuous', 'categorical', or 'biogeographic'.
$trait_data_type_for_stats Character string. The type of trait data used to select statistical method. One of 'continuous', 'binary', or 'multinominal'.
$focal_time The time in the past at which the trait and rates data were tested.
If using continuous or binary data:
$two-tailed Logical. Record the type of test used: two-tailed if TRUE, one-tailed if FALSE.
If one_tailed_hypothesis is provided (only for continuous and binary trait data):
$one_tailed_hypothesis Character string. Record of the alternative hypothesis used for the one-tailed tests.
If posthoc_pairwise_tests = TRUE (only for multinomial trait data):
$posthoc_pairwise_tests List of at least 3 sub-elements:
$summary_df Data.frame of five variables providing the summary results of post hoc pairwise tests
$method Character string. The statistical method used to carry out the test. Here, "Dunn".
$two-tailed Logical. Record the type of post hoc pairwise tests used: two-tailed if TRUE, one-tailed if FALSE.
If return_perm_data = TRUE, the stats data computed from the posterior samples for observed and permuted data are provided.
This is needed to plot the histogram of the null distribution used to assess significance of the test with plot_histogram_STRAPP_test_for_focal_time().
$perm_data_df A data.frame with four variables summarizing the data generated during the STRAPP test:
$posterior_samples_random_ID Integer. ID of the posterior samples randomly drawn and used for the STRAPP test.
$*_obs Numerical. Test stats computed from the observed data in the posterior samples. Name depends on the test used.
$*_perm Numerical. Test stats computed from the permuted data in the posterior samples. Name depends on the test used.
$delta_* OR $abs_delta_* Numerical. Test stats computed for the STRAPP test comparing observed stats and permuted stats.
Name depends on the test used and the type of tests (two-tailed compare absolute values; one-tailed compare raw values).
Combined with posthoc_pairwise_tests = TRUE, the stats data are also provided for the post hoc pairwise tests:
$posthoc_pairwise_tests$perm_data_array A 3D array containing stats data for all post hoc pairwise tests in a similar format that $perm_data_df.
If no STRAPP test was performed in the case of categorical/biogeographic data with a single state/range at focal_time,
only the $trait_data_type, $trait_data_type_for_stats = "none", and $focal_time are returned.
Maël Doré
For STRAPP: Rabosky, D. L., & Huang, H. (2016). A robust semi-parametric test for detecting trait-dependent diversification. Systematic biology, 65(2), 181-193. doi:10.1093/sysbio/syv066.
For STRAPP in deep times: Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B., (2025), Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity, Nature Communications. doi:10.1038/s41467-025-63709-3
Associated functions in deepSTRAPP: extract_most_likely_trait_values_for_focal_time() update_rates_and_regimes_for_focal_time()
Original function in BAMMtools: BAMMtools::traitDependentBAMM()
Statistical tests: stats::cor.test() stats::wilcox.test() stats::kruskal.test() dunn.test::dunn.test()
For a guided tutorial, see this vignette: vignette("explore_STRAPP_test_types", package = "deepSTRAPP")
if (deepSTRAPP::is_dev_version()) { # ------ Prepare data ------ # ## Load the BAMM_object summarizing 1000 posterior samples of BAMM with diversification rates # for ponerine ants extracted for 10My ago. data(Ponerinae_BAMM_object_10My, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Plot the associated phylogeny with mapped rates plot_BAMM_rates(Ponerinae_BAMM_object_10My) ## Load the object containing head width trait data for ponerine ants extracted for 10My ago. data(Ponerinae_trait_cont_tip_data_10My, package = "deepSTRAPP") # Plot the associated contMap (continuous trait stochastic map) plot_contMap(Ponerinae_trait_cont_tip_data_10My$contMap) # Check that objects are ordered in the same fashion identical(names(Ponerinae_BAMM_object_10My$tipStates[[1]]), names(Ponerinae_trait_cont_tip_data_10My$trait_data)) # Save continuous data trait_data_continuous <- Ponerinae_trait_cont_tip_data_10My ## Transform trait data into binary and multinominal data # Binarize data into two states trait_data_binary <- trait_data_continuous trait_data_binary$trait_data[trait_data_continuous$trait_data < 0] <- "state_A" trait_data_binary$trait_data[trait_data_continuous$trait_data >= 0] <- "state_B" trait_data_binary$trait_data_type <- "categorical" table(trait_data_binary$trait_data) # Categorize data into three states trait_data_multinominal <- trait_data_continuous trait_data_multinominal$trait_data[trait_data_continuous$trait_data < 0] <- "state_B" trait_data_multinominal$trait_data[trait_data_continuous$trait_data < -1] <- "state_A" trait_data_multinominal$trait_data[trait_data_continuous$trait_data >= 0] <- "state_C" trait_data_multinominal$trait_data_type <- "categorical" table(trait_data_multinominal$trait_data) # (May take several minutes to run) # ------ Compute STRAPP test for continuous data ------ # plot(x = trait_data_continuous$trait_data, y = Ponerinae_BAMM_object_10My$tipLambda[[1]]) # Compute STRAPP test under the alternative hypothesis of a "negative" correlation # between "net_diversification" rates and trait data STRAPP_results <- compute_STRAPP_test_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_continuous, two_tailed = FALSE, one_tailed_hypothesis = "negative", return_perm_data = TRUE) str(STRAPP_results, max.level = 2) # Data from the posterior samples is available in STRAPP_results$perm_data_df head(STRAPP_results$perm_data_df) # ------ Compute STRAPP test for binary data ------ # # Compute STRAPP test under the alternative hypothesis that "state_A" is associated # with higher "net_diversification" that "state_B" STRAPP_results <- compute_STRAPP_test_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_binary, two_tailed = FALSE, one_tailed_hypothesis = c("state_A > state_B")) str(STRAPP_results, max.level = 1) # Compute STRAPP test under the alternative hypothesis that "state_B" is associated # with higher "net_diversification" that "state_A" STRAPP_results <- compute_STRAPP_test_for_focal_time(BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_binary, two_tailed = FALSE, one_tailed_hypothesis = c("state_B > state_A")) str(STRAPP_results, max.level = 1) # ------ Compute STRAPP test for multinominal data ------ # # Compute STRAPP test between all three states, and compute post hoc tests # for differences in rates between all possible pairs of states # with a p-value adjusted for multiple comparison using Bonferroni's correction STRAPP_results <- compute_STRAPP_test_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_multinominal, posthoc_pairwise_tests = TRUE, two_tailed = TRUE, p.adjust_method = "bonferroni") str(STRAPP_results, max.level = 3) # All post hoc pairwise test summaries are available in $summary_df STRAPP_results$posthoc_pairwise_tests$summary_df }if (deepSTRAPP::is_dev_version()) { # ------ Prepare data ------ # ## Load the BAMM_object summarizing 1000 posterior samples of BAMM with diversification rates # for ponerine ants extracted for 10My ago. data(Ponerinae_BAMM_object_10My, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Plot the associated phylogeny with mapped rates plot_BAMM_rates(Ponerinae_BAMM_object_10My) ## Load the object containing head width trait data for ponerine ants extracted for 10My ago. data(Ponerinae_trait_cont_tip_data_10My, package = "deepSTRAPP") # Plot the associated contMap (continuous trait stochastic map) plot_contMap(Ponerinae_trait_cont_tip_data_10My$contMap) # Check that objects are ordered in the same fashion identical(names(Ponerinae_BAMM_object_10My$tipStates[[1]]), names(Ponerinae_trait_cont_tip_data_10My$trait_data)) # Save continuous data trait_data_continuous <- Ponerinae_trait_cont_tip_data_10My ## Transform trait data into binary and multinominal data # Binarize data into two states trait_data_binary <- trait_data_continuous trait_data_binary$trait_data[trait_data_continuous$trait_data < 0] <- "state_A" trait_data_binary$trait_data[trait_data_continuous$trait_data >= 0] <- "state_B" trait_data_binary$trait_data_type <- "categorical" table(trait_data_binary$trait_data) # Categorize data into three states trait_data_multinominal <- trait_data_continuous trait_data_multinominal$trait_data[trait_data_continuous$trait_data < 0] <- "state_B" trait_data_multinominal$trait_data[trait_data_continuous$trait_data < -1] <- "state_A" trait_data_multinominal$trait_data[trait_data_continuous$trait_data >= 0] <- "state_C" trait_data_multinominal$trait_data_type <- "categorical" table(trait_data_multinominal$trait_data) # (May take several minutes to run) # ------ Compute STRAPP test for continuous data ------ # plot(x = trait_data_continuous$trait_data, y = Ponerinae_BAMM_object_10My$tipLambda[[1]]) # Compute STRAPP test under the alternative hypothesis of a "negative" correlation # between "net_diversification" rates and trait data STRAPP_results <- compute_STRAPP_test_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_continuous, two_tailed = FALSE, one_tailed_hypothesis = "negative", return_perm_data = TRUE) str(STRAPP_results, max.level = 2) # Data from the posterior samples is available in STRAPP_results$perm_data_df head(STRAPP_results$perm_data_df) # ------ Compute STRAPP test for binary data ------ # # Compute STRAPP test under the alternative hypothesis that "state_A" is associated # with higher "net_diversification" that "state_B" STRAPP_results <- compute_STRAPP_test_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_binary, two_tailed = FALSE, one_tailed_hypothesis = c("state_A > state_B")) str(STRAPP_results, max.level = 1) # Compute STRAPP test under the alternative hypothesis that "state_B" is associated # with higher "net_diversification" that "state_A" STRAPP_results <- compute_STRAPP_test_for_focal_time(BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_binary, two_tailed = FALSE, one_tailed_hypothesis = c("state_B > state_A")) str(STRAPP_results, max.level = 1) # ------ Compute STRAPP test for multinominal data ------ # # Compute STRAPP test between all three states, and compute post hoc tests # for differences in rates between all possible pairs of states # with a p-value adjusted for multiple comparison using Bonferroni's correction STRAPP_results <- compute_STRAPP_test_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, trait_data_list = trait_data_multinominal, posthoc_pairwise_tests = TRUE, two_tailed = TRUE, p.adjust_method = "bonferroni") str(STRAPP_results, max.level = 3) # All post hoc pairwise test summaries are available in $summary_df STRAPP_results$posthoc_pairwise_tests$summary_df }
Cuts off all the branches of the phylogeny which are
younger than a specific time in the past (i.e. the focal_time).
Branches overlapping the focal_time are shorten to the focal_time.
Likewise, remove continuous trait mapping for the cut off branches
by updating the $tree$maps and $tree$mapped.edge elements.
cut_contMap_for_focal_time(contMap, focal_time, keep_tip_labels = TRUE)cut_contMap_for_focal_time(contMap, focal_time, keep_tip_labels = TRUE)
contMap |
Object of class |
focal_time |
Numerical. The time, in terms of time distance from the present, for which the tree and mapping must be cut. It must be smaller than the root age of the phylogeny. |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip must retained their initial |
The phylogenetic tree is cut for a specific time in the past (i.e. the focal_time).
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The continuous trait mapping is updated accordingly by removing mapping associated with the cut off branches.
The function returns the cut contMap as an object of class "contMap".
It contains a $tree element of classes "simmap" and "phylo". This function updates and adds multiple useful sub-elements to the $tree element.
$maps An updated list of named numerical vectors. Provides the mapping of trait values along each remaining edge.
$mapped.edge An updated matrix. Provides the evolutionary time spent across trait values (columns) along the remaining edges (rows).
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
Maël Doré
cut_phylo_for_focal_time() extract_most_likely_trait_values_for_focal_time()
extract_most_likely_trait_values_from_contMap_for_focal_time()
For a guided tutorial, see this vignette: vignette("cut_phylogenies", package = "deepSTRAPP")
# ----- Prepare data ----- # # Load mammals phylogeny and data from the R package motmot (data included in deepSTRAPP) # Initial data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data(mammals, package = "deepSTRAPP") mammals_tree <- mammals$mammal.phy mammals_data <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] # Run a stochastic mapping based on a Brownian Motion model # for Ancestral Trait Estimates to obtain a "contMap" object mammals_contMap <- phytools::contMap(mammals_tree, x = mammals_data, res = 100, # Number of time steps plot = FALSE) # Set focal time focal_time <- 80 # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut contMap to 80 Mya while keeping tip.label # on terminal branches with a unique descending tip. updated_contMap <- cut_contMap_for_focal_time(contMap = mammals_contMap, focal_time = focal_time, keep_tip_labels = TRUE) # Plot node labels on initial stochastic map with cut-off plot_contMap(mammals_contMap, lwd = 2, fsize = c(0.5, 1)) ape::nodelabels(cex = 0.5) abline(v = max(phytools::nodeHeights(mammals_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map plot_contMap(updated_contMap, fsize = c(0.8, 1)) ape::nodelabels(cex = 0.8, text = updated_contMap$tree$initial_nodes_ID) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut contMap to 80 Mya while NOT keeping tip.label. updated_contMap <- cut_contMap_for_focal_time(contMap = mammals_contMap, focal_time = focal_time, keep_tip_labels = FALSE) # Plot node labels on initial stochastic map with cut-off plot_contMap(mammals_contMap, fsize = c(0.5, 1)) ape::nodelabels(cex = 0.5) abline(v = max(phytools::nodeHeights(mammals_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map plot_contMap(updated_contMap, fsize = c(0.8, 1)) ape::nodelabels(cex = 0.8, text = updated_contMap$tree$initial_nodes_ID)# ----- Prepare data ----- # # Load mammals phylogeny and data from the R package motmot (data included in deepSTRAPP) # Initial data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data(mammals, package = "deepSTRAPP") mammals_tree <- mammals$mammal.phy mammals_data <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] # Run a stochastic mapping based on a Brownian Motion model # for Ancestral Trait Estimates to obtain a "contMap" object mammals_contMap <- phytools::contMap(mammals_tree, x = mammals_data, res = 100, # Number of time steps plot = FALSE) # Set focal time focal_time <- 80 # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut contMap to 80 Mya while keeping tip.label # on terminal branches with a unique descending tip. updated_contMap <- cut_contMap_for_focal_time(contMap = mammals_contMap, focal_time = focal_time, keep_tip_labels = TRUE) # Plot node labels on initial stochastic map with cut-off plot_contMap(mammals_contMap, lwd = 2, fsize = c(0.5, 1)) ape::nodelabels(cex = 0.5) abline(v = max(phytools::nodeHeights(mammals_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map plot_contMap(updated_contMap, fsize = c(0.8, 1)) ape::nodelabels(cex = 0.8, text = updated_contMap$tree$initial_nodes_ID) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut contMap to 80 Mya while NOT keeping tip.label. updated_contMap <- cut_contMap_for_focal_time(contMap = mammals_contMap, focal_time = focal_time, keep_tip_labels = FALSE) # Plot node labels on initial stochastic map with cut-off plot_contMap(mammals_contMap, fsize = c(0.5, 1)) ape::nodelabels(cex = 0.5) abline(v = max(phytools::nodeHeights(mammals_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map plot_contMap(updated_contMap, fsize = c(0.8, 1)) ape::nodelabels(cex = 0.8, text = updated_contMap$tree$initial_nodes_ID)
Cuts off all the branches of the phylogeny which are
younger than a specific time in the past (i.e. the focal_time).
Branches overlapping the focal_time are shorten to the focal_time.
Likewise, remove posterior probability mapping of the categorical trait for the cut off branches
by updating the $tree$maps and $tree$mapped.edge elements.
cut_densityMap_for_focal_time(densityMap, focal_time, keep_tip_labels = TRUE)cut_densityMap_for_focal_time(densityMap, focal_time, keep_tip_labels = TRUE)
densityMap |
Object of class |
focal_time |
Numerical. The time, in terms of time distance from the present, for which the tree and mapping must be cut. It must be smaller than the root age of the phylogeny. |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip must retained their initial |
The phylogenetic tree is cut for a specific time in the past (i.e. the focal_time).
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The posterior probability mapping of a categorical trait is updated accordingly by removing mapping associated with the cut off branches.
The function returns the cut densityMap as an object of class "densityMap" with three elements.
It contains a $tree element of classes "simmap" and "phylo". This function updates and adds multiple useful sub-elements to the $tree element.
* $maps An updated list of named numerical vectors. Provides the mapping of posterior probability of the state along each remaining edge.
* $mapped.edge An updated matrix. Provides the evolutionary time spent across posterior probabilities (columns) along the remaining edges (rows).
* $root_age Integer. Stores the age of the root of the tree.
* $nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
* $initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
* $edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
* $initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
The $col element describes the colors used to map each possible posterior probability value from 0 to 1000.
The $states element provide the name of the states. Here, the first value is the absence of the state labeled as "Not X" with X being the state.
The second value is the name of the state.
High posterior probability reflects high likelihood to harbor the state. Low probability reflects high likelihood to NOT harbor the state.
Maël Doré
cut_phylo_for_focal_time() extract_most_likely_trait_values_for_focal_time()
extract_most_likely_states_from_densityMaps_for_focal_time()
For a guided tutorial, see this vignette: vignette("cut_phylogenies", package = "deepSTRAPP")
# ----- Prepare data ----- # # Load mammals phylogeny and data from the R package motmot included within deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") # Obtain mammal tree mammals_tree <- mammals$mammal.phy # Convert mass data into categories mammals_mass <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] mammals_data <- mammals_mass mammals_data[seq_along(mammals_data)] <- "small" mammals_data[mammals_mass > 5] <- "medium" mammals_data[mammals_mass > 10] <- "large" table(mammals_data) # (May take several minutes to run) # Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model mammals_cat_data <- prepare_trait_data(tip_data = mammals_data, phylo = mammals_tree, trait_data_type = "categorical", evolutionary_models = "ER", nb_simulations = 100, plot_map = FALSE) # Set focal time focal_time <- 80 # Extract the density map for small mammals (state 3 = "small") mammals_densityMap_small <- mammals_cat_data$densityMaps[[3]] # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut densityMap to 80 Mya while keeping tip.label # on terminal branches with a unique descending tip. updated_mammals_densityMap_small <- cut_densityMap_for_focal_time( densityMap = mammals_densityMap_small, focal_time = focal_time, keep_tip_labels = TRUE) # Plot node labels on initial stochastic map with cut-off phytools::plot.densityMap(mammals_densityMap_small, fsize = 0.7, lwd = 2) ape::nodelabels(cex = 0.7) abline(v = max(phytools::nodeHeights(mammals_densityMap_small$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map phytools::plot.densityMap(updated_mammals_densityMap_small, fsize = 0.8) ape::nodelabels(cex = 0.8, text = updated_mammals_densityMap_small$tree$initial_nodes_ID) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut densityMap to 80 Mya while NOT keeping tip.label updated_mammals_densityMap_small <- cut_densityMap_for_focal_time( densityMap = mammals_densityMap_small, focal_time = focal_time, keep_tip_labels = FALSE) # Plot node labels on initial stochastic map with cut-off phytools::plot.densityMap(mammals_densityMap_small, fsize = 0.7, lwd = 2) ape::nodelabels(cex = 0.7) abline(v = max(phytools::nodeHeights(mammals_densityMap_small$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map phytools::plot.densityMap(updated_mammals_densityMap_small, fsize = 0.8) ape::nodelabels(cex = 0.8, text = updated_mammals_densityMap_small$tree$initial_nodes_ID)# ----- Prepare data ----- # # Load mammals phylogeny and data from the R package motmot included within deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") # Obtain mammal tree mammals_tree <- mammals$mammal.phy # Convert mass data into categories mammals_mass <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] mammals_data <- mammals_mass mammals_data[seq_along(mammals_data)] <- "small" mammals_data[mammals_mass > 5] <- "medium" mammals_data[mammals_mass > 10] <- "large" table(mammals_data) # (May take several minutes to run) # Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model mammals_cat_data <- prepare_trait_data(tip_data = mammals_data, phylo = mammals_tree, trait_data_type = "categorical", evolutionary_models = "ER", nb_simulations = 100, plot_map = FALSE) # Set focal time focal_time <- 80 # Extract the density map for small mammals (state 3 = "small") mammals_densityMap_small <- mammals_cat_data$densityMaps[[3]] # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut densityMap to 80 Mya while keeping tip.label # on terminal branches with a unique descending tip. updated_mammals_densityMap_small <- cut_densityMap_for_focal_time( densityMap = mammals_densityMap_small, focal_time = focal_time, keep_tip_labels = TRUE) # Plot node labels on initial stochastic map with cut-off phytools::plot.densityMap(mammals_densityMap_small, fsize = 0.7, lwd = 2) ape::nodelabels(cex = 0.7) abline(v = max(phytools::nodeHeights(mammals_densityMap_small$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map phytools::plot.densityMap(updated_mammals_densityMap_small, fsize = 0.8) ape::nodelabels(cex = 0.8, text = updated_mammals_densityMap_small$tree$initial_nodes_ID) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut densityMap to 80 Mya while NOT keeping tip.label updated_mammals_densityMap_small <- cut_densityMap_for_focal_time( densityMap = mammals_densityMap_small, focal_time = focal_time, keep_tip_labels = FALSE) # Plot node labels on initial stochastic map with cut-off phytools::plot.densityMap(mammals_densityMap_small, fsize = 0.7, lwd = 2) ape::nodelabels(cex = 0.7) abline(v = max(phytools::nodeHeights(mammals_densityMap_small$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot initial node labels on cut stochastic map phytools::plot.densityMap(updated_mammals_densityMap_small, fsize = 0.8) ape::nodelabels(cex = 0.8, text = updated_mammals_densityMap_small$tree$initial_nodes_ID)
Cuts off all the branches of the phylogeny which are
younger than a specific time in the past (i.e. the focal_time).
Branches overlapping the focal_time are shorten to the focal_time.
Likewise, remove posterior probability mapping of the categorical trait for the cut off branches
by updating the $tree$maps and $tree$mapped.edge elements.
cut_densityMaps_for_focal_time(densityMaps, focal_time, keep_tip_labels = TRUE)cut_densityMaps_for_focal_time(densityMaps, focal_time, keep_tip_labels = TRUE)
densityMaps |
List of objects of class |
focal_time |
Numerical. The time, in terms of time distance from the present, for which the tree and mapping must be cut. It must be smaller than the root age of the phylogeny. |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip must retained their initial |
The phylogenetic tree is cut for a specific time in the past (i.e. the focal_time).
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The continuous trait mapping is updated accordingly by removing mapping associated with the cut off branches.
The function returns an updated list of objects as cut densityMaps of class "densityMap".
Each densityMap object contains three elements:
$tree element of classes "simmap" and "phylo". This function updates and adds multiple useful sub-elements to the $tree element.
$maps An updated list of named numerical vectors. Provides the mapping of posterior probability of the state along each remaining edge.
$mapped.edge An updated matrix. Provides the evolutionary time spent across posterior probabilities (columns) along the remaining edges (rows).
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
$col element describes the colors used to map each possible posterior probability value from 0 to 1000.
$states element provide the name of the states. Here, the first value is the absence of the state labeled as "Not X" with X being the state.
The second value is the name of the state.
High posterior probability reflects high likelihood to harbor the state. Low probability reflects high likelihood to NOT harbor the state.
Maël Doré
cut_phylo_for_focal_time() cut_densityMap_for_focal_time()
extract_most_likely_trait_values_for_focal_time()
extract_most_likely_states_from_densityMaps_for_focal_time()
# ----- Prepare data ----- # # Load mammals phylogeny and data from the R package motmot, and implemented in deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") # Obtain mammal tree mammals_tree <- mammals$mammal.phy # Convert mass data into categories mammals_mass <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] mammals_data <- mammals_mass mammals_data[seq_along(mammals_data)] <- "small" mammals_data[mammals_mass > 5] <- "medium" mammals_data[mammals_mass > 10] <- "large" table(mammals_data) # (May take several minutes to run) # Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model mammals_cat_data <- prepare_trait_data(tip_data = mammals_data, phylo = mammals_tree, trait_data_type = "categorical", evolutionary_models = "ER", nb_simulations = 100, plot_map = FALSE) # Set focal time focal_time <- 80 # Extract the density maps mammals_densityMaps <- mammals_cat_data$densityMaps # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut densityMaps to 80 Mya while keeping tip.label # on terminal branches with a unique descending tip. updated_mammals_densityMaps <- cut_densityMaps_for_focal_time( densityMaps = mammals_densityMaps, focal_time = focal_time, keep_tip_labels = TRUE) ## Plot density maps as overlay of all state posterior probabilities # ?plot_densityMaps_overlay # Plot initial density maps plot_densityMaps_overlay(densityMaps = mammals_densityMaps, fsize = 0.5) abline(v = max(phytools::nodeHeights(mammals_densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated/cut density maps plot_densityMaps_overlay(densityMaps = updated_mammals_densityMaps, fsize = 0.8) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut densityMap to 80 Mya while NOT keeping tip.label updated_mammals_densityMaps <- cut_densityMaps_for_focal_time( densityMaps = mammals_densityMaps, focal_time = focal_time, keep_tip_labels = FALSE) # Plot initial density maps plot_densityMaps_overlay(densityMaps = mammals_densityMaps, fsize = 0.5) abline(v = max(phytools::nodeHeights(mammals_densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated/cut density maps plot_densityMaps_overlay(densityMaps = updated_mammals_densityMaps, fsize = 0.8)# ----- Prepare data ----- # # Load mammals phylogeny and data from the R package motmot, and implemented in deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") # Obtain mammal tree mammals_tree <- mammals$mammal.phy # Convert mass data into categories mammals_mass <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] mammals_data <- mammals_mass mammals_data[seq_along(mammals_data)] <- "small" mammals_data[mammals_mass > 5] <- "medium" mammals_data[mammals_mass > 10] <- "large" table(mammals_data) # (May take several minutes to run) # Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model mammals_cat_data <- prepare_trait_data(tip_data = mammals_data, phylo = mammals_tree, trait_data_type = "categorical", evolutionary_models = "ER", nb_simulations = 100, plot_map = FALSE) # Set focal time focal_time <- 80 # Extract the density maps mammals_densityMaps <- mammals_cat_data$densityMaps # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut densityMaps to 80 Mya while keeping tip.label # on terminal branches with a unique descending tip. updated_mammals_densityMaps <- cut_densityMaps_for_focal_time( densityMaps = mammals_densityMaps, focal_time = focal_time, keep_tip_labels = TRUE) ## Plot density maps as overlay of all state posterior probabilities # ?plot_densityMaps_overlay # Plot initial density maps plot_densityMaps_overlay(densityMaps = mammals_densityMaps, fsize = 0.5) abline(v = max(phytools::nodeHeights(mammals_densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated/cut density maps plot_densityMaps_overlay(densityMaps = updated_mammals_densityMaps, fsize = 0.8) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut densityMap to 80 Mya while NOT keeping tip.label updated_mammals_densityMaps <- cut_densityMaps_for_focal_time( densityMaps = mammals_densityMaps, focal_time = focal_time, keep_tip_labels = FALSE) # Plot initial density maps plot_densityMaps_overlay(densityMaps = mammals_densityMaps, fsize = 0.5) abline(v = max(phytools::nodeHeights(mammals_densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated/cut density maps plot_densityMaps_overlay(densityMaps = updated_mammals_densityMaps, fsize = 0.8)
Cuts off all the branches of the phylogeny which are
younger than a specific time in the past (i.e. the focal_time).
Branches overlapping the focal_time are shorten to the focal_time.
cut_phylo_for_focal_time(tree, focal_time, keep_tip_labels = TRUE)cut_phylo_for_focal_time(tree, focal_time, keep_tip_labels = TRUE)
tree |
Object of class |
focal_time |
Numerical. The time, in terms of time distance from the present, for which the tree must be cut. It must be smaller than the root age of the phylogeny. |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip must retained their initial |
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The function returns the cut phylogeny as an object of class "phylo". It adds multiple useful elements to the object.
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
Maël Doré
cut_contMap_for_focal_time() cut_densityMaps_for_focal_time()
For a guided tutorial, see this vignette: vignette("cut_phylogenies", package = "deepSTRAPP")
# Load eel phylogeny from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut tree to 30 Mya while keeping tip.label on terminal branches with a unique descending tip. cut_eel.tree <- cut_phylo_for_focal_time(tree = eel.tree, focal_time = 30, keep_tip_labels = TRUE) # Plot internal node labels on initial tree with cut-off plot(eel.tree, cex = 0.5) abline(v = max(phytools::nodeHeights(eel.tree)[,2]) - 30, col = "red", lty = 2, lwd = 2) nb_tips <- length(eel.tree$tip.label) nodelabels_in_cut_tree <- (nb_tips + 1):(nb_tips + eel.tree$Nnode) nodelabels_in_cut_tree[!(nodelabels_in_cut_tree %in% cut_eel.tree$initial_nodes_ID)] <- NA ape::nodelabels(text = nodelabels_in_cut_tree) # Plot initial internal node labels on cut tree plot(cut_eel.tree, cex = 0.8) ape::nodelabels(text = cut_eel.tree$initial_nodes_ID) # Plot edge labels on initial tree with cut-off plot(eel.tree, cex = 0.5) abline(v = max(phytools::nodeHeights(eel.tree)[,2]) - 30, col = "red", lty = 2, lwd = 2) edgelabels_in_cut_tree <- 1:nrow(eel.tree$edge) edgelabels_in_cut_tree[!(1:nrow(eel.tree$edge) %in% cut_eel.tree$initial_edges_ID)] <- NA ape::edgelabels(text = edgelabels_in_cut_tree) # Plot initial edge labels on cut tree plot(cut_eel.tree, cex = 0.8) ape::edgelabels(text = cut_eel.tree$initial_edges_ID) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut tree to 30 Mya without keeping tip.label on terminal branches with a unique descending tip. # All tip.labels are converted to their descending/tipward node ID cut_eel.tree <- cut_phylo_for_focal_time(tree = eel.tree, focal_time = 30, keep_tip_labels = FALSE) plot(cut_eel.tree, cex = 0.8)# Load eel phylogeny from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") # ----- Example 1: keep_tip_labels = TRUE ----- # # Cut tree to 30 Mya while keeping tip.label on terminal branches with a unique descending tip. cut_eel.tree <- cut_phylo_for_focal_time(tree = eel.tree, focal_time = 30, keep_tip_labels = TRUE) # Plot internal node labels on initial tree with cut-off plot(eel.tree, cex = 0.5) abline(v = max(phytools::nodeHeights(eel.tree)[,2]) - 30, col = "red", lty = 2, lwd = 2) nb_tips <- length(eel.tree$tip.label) nodelabels_in_cut_tree <- (nb_tips + 1):(nb_tips + eel.tree$Nnode) nodelabels_in_cut_tree[!(nodelabels_in_cut_tree %in% cut_eel.tree$initial_nodes_ID)] <- NA ape::nodelabels(text = nodelabels_in_cut_tree) # Plot initial internal node labels on cut tree plot(cut_eel.tree, cex = 0.8) ape::nodelabels(text = cut_eel.tree$initial_nodes_ID) # Plot edge labels on initial tree with cut-off plot(eel.tree, cex = 0.5) abline(v = max(phytools::nodeHeights(eel.tree)[,2]) - 30, col = "red", lty = 2, lwd = 2) edgelabels_in_cut_tree <- 1:nrow(eel.tree$edge) edgelabels_in_cut_tree[!(1:nrow(eel.tree$edge) %in% cut_eel.tree$initial_edges_ID)] <- NA ape::edgelabels(text = edgelabels_in_cut_tree) # Plot initial edge labels on cut tree plot(cut_eel.tree, cex = 0.8) ape::edgelabels(text = cut_eel.tree$initial_edges_ID) # ----- Example 2: keep_tip_labels = FALSE ----- # # Cut tree to 30 Mya without keeping tip.label on terminal branches with a unique descending tip. # All tip.labels are converted to their descending/tipward node ID cut_eel.tree <- cut_phylo_for_focal_time(tree = eel.tree, focal_time = 30, keep_tip_labels = FALSE) plot(cut_eel.tree, cex = 0.8)
A list containing (fake) geographic ranges data of eels mapped on the phylogeny,
modeled with R package BioGeoBEARS. This object was obtained with prepare_trait_data().
Initial data based on feeding habits was altered to be transformed into range "A" and "B", and then adding arbitrarily multi-area "AB" ranges.
This is NOT real biogeographic data. Please refer to the initial article for real data.
Original data source: Collar, D. C., P. C. Wainwright, M. E. Alfaro, L. J. Revell, and R. S. Mehta (2014) Biting disrupts integration to spur skull evolution in eels. Nature Communications, 5, 5505. doi:10.1038/ncomms6505
data(eel_biogeo_data)data(eel_biogeo_data)
A list with 9 elements.
A list of 9 elements containing information on the evolution of geographic ranges in eels.
This object was obtained with prepare_trait_data().
$densityMaps List of objects of class "densityMap that contains a phylogenetic tree and associated mapping of probability
to harbor a given range along branches. The list contains only a "densityMap per unique areas because split_multi_area_ranges was set to TRUE.
$densityMaps_all_ranges List of objects of class "densityMap that contains a phylogenetic tree and associated mapping of probability
to harbor a given range along branches. The list contains one "densityMap per range found along branches during the simulated biogeographic histories.
$trait_data_type Character string. Record the type of trait data. Here: "biogeographic".
$ace Numerical matrix that record the posterior probabilities of ancestral ranges estimated at internal nodes.
Only unique areas are considered among the ranges. Multi-area ranges have been split among unique ranges.
Rows are internal nodes. Columns are ranges. Values are posterior probabilities of each range per node.
$ace_all_ranges Numerical matrix that record the posterior probabilities of ancestral ranges estimated at internal nodes.
All ranges observed along branches during the simulated biogeographic histories are present.
Rows are internal nodes. Columns are ranges. Values are posterior probabilities of each range per node.
$BSM_output List of two lists that contains summary information of cladogenetic ($RES_caldo_events_tables) and anagenetic ($RES_ana_events_tables) events
recording across the 1000 simulations of biogeographic histories performed during Biogeographic Stochastic Mapping (BSM).
Each element of the list is a data.frame recording events occurring during one simulation.
$simmaps List of 1000 objects of class "simmap".
Each simmap object is a phylogeny with one simulated biogeographic history (i.e., transitions in geographic ranges) mapped along branches.
$best_model_fit List that provides the output of the best fitting model (Here: DEC+J model).
$model_selection_df Data.frame that summarizes model comparisons used to select the best fitting model.
Collar, D. C., P. C. Wainwright, M. E. Alfaro, L. J. Revell, and R. S. Mehta (2014) Biting disrupts integration to spur skull evolution in eels. Nature Communications, 5, 5505. doi:10.1038/ncomms6505
A list containing feeding habits data of eels mapped on the phylogeny,
modeled with geiger::fitDiscrete. This object was obtained with prepare_trait_data().
Initial data was altered arbitrarily to create three categories, adding a "kiss" feeding habit to the initial
"bite" and "suction" data. This is NOT real biological data. Please refer to the initial article for real data.
Original data source: Collar, D. C., P. C. Wainwright, M. E. Alfaro, L. J. Revell, and R. S. Mehta (2014) Biting disrupts integration to spur skull evolution in eels. Nature Communications, 5, 5505. doi:10.1038/ncomms6505
data(eel_cat_3lvl_data)data(eel_cat_3lvl_data)
A list with 6 elements.
A list of five objects containing information on the evolution of feeding habits in eels.
This object was obtained with prepare_trait_data().
$densityMaps List of objects of class "densityMap that contains a phylogenetic tree and associated mapping of probability
to harbor a given state/range along branches. The list contains one "densityMap per state/range found in the tip_data.
$trait_data_type Character string. Record the type of trait data. Here: "categorical".
$simmaps List of 100 stochastic mapping simulations for trait evolution. Each element is a "simmap" object (see phytools::make.simmap)
representing a possible evolutionary history that fits states observed on tips, infered ACE at internal nodes, and transition rates as estimated from the best fit model.
$ace Numerical matrix that record the posterior probabilities of ancestral states/ranges (characters) estimates (ACE) at internal nodes.
Rows are internal nodes. Columns are states/ranges. Values are posterior probabilities of each state per node.
$best_model_fit List that provides the output of the best fitting model (Here: ER model).
$model_selection_df Data.frame that summarizes model comparisons used to select the best fitting model.
Collar, D. C., P. C. Wainwright, M. E. Alfaro, L. J. Revell, and R. S. Mehta (2014) Biting disrupts integration to spur skull evolution in eels. Nature Communications, 5, 5505. doi:10.1038/ncomms6505
Extracts regimes ID and tip rates from a BAMM_object that have been
updated to provide diversification data for a specific time in the past (i.e. the focal_time).
Use update_rates_and_regimes_for_focal_time() to obtain
a BAMM_object updated for a given focal_time.
extract_diversification_data_melted_df_for_focal_time( BAMM_object, verbose = TRUE )extract_diversification_data_melted_df_for_focal_time( BAMM_object, verbose = TRUE )
BAMM_object |
Object of class |
verbose |
Logical. Should progression be displayed? A message will be printed for every batch of
100 BAMM posterior samples extracted. Default is |
Returns a data.frame with six columns.
$focal_time Integer. The time, in terms of time distance from the present, at which the trait data were extracted. Should be equal for all rows
as a unique BAMM_object updated for a unique focal_time is being extracted.
$BAMM_sample_ID Integer. ID of the posterior samples from which the diversification data are extracted.
$tip_ID Character string. Tip labels of the branches cut-off at focal_time.
If keep_tip_labels = TRUE was used in update_rates_and_regimes_for_focal_time(),
cut-off branches with a single descendant tip retain their initial tip.label.
If keep_tip_labels = FALSE was used in update_rates_and_regimes_for_focal_time(),
all cut-off branches are labeled using their tipward node ID.
$regime_ID Integer. The regime ID on tips at focal_time. IDs are integer. The root process equals "1", then they are incremented from oldest to youngest.
Regime IDs are independent across posterior samples.
$rate_type Character string. Type of rates: "lambda" for speciation rates, "mu" for extinction rates,
and "net_diversification" for net diversification rates (lambda - mu).
$rates Numerical. Rates in \[number of events / branch / evolutionary time\].
Maël Doré
if (deepSTRAPP::is_dev_version()) { # Load the BAMM_object summarizing 1000 posterior samples of BAMM updated for a focal_time of 10 My data(Ponerinae_BAMM_object_10My) ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # (May take several minutes to run) # Extract diversification data diversification_data_df <- extract_diversification_data_melted_df_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, verbose = TRUE) # Print output head(diversification_data_df) }if (deepSTRAPP::is_dev_version()) { # Load the BAMM_object summarizing 1000 posterior samples of BAMM updated for a focal_time of 10 My data(Ponerinae_BAMM_object_10My) ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # (May take several minutes to run) # Extract diversification data diversification_data_df <- extract_diversification_data_melted_df_for_focal_time( BAMM_object = Ponerinae_BAMM_object_10My, verbose = TRUE) # Print output head(diversification_data_df) }
Extracts the most likely ranges found along branches
at a specific time in the past (i.e. the focal_time).
Optionally, the function can update the mapped phylogeny (densityMaps)
such as branches overlapping the focal_time are shorten to the focal_time,
and the range mapping for the cut off branches are removed
by updating the $tree$maps and $tree$mapped.edge elements.
extract_most_likely_ranges_from_densityMaps_for_focal_time( densityMaps, ace = NULL, tip_data = NULL, focal_time, update_densityMaps = FALSE, keep_tip_labels = TRUE )extract_most_likely_ranges_from_densityMaps_for_focal_time( densityMaps, ace = NULL, tip_data = NULL, focal_time, update_densityMaps = FALSE, keep_tip_labels = TRUE )
densityMaps |
List of objects of class |
ace |
(Optional) Numerical matrix that record the posterior probabilities of ancestral ranges at internal nodes,
obtained with |
tip_data |
(Optional) Named character string vector of tip ranges. Names are nodes_ID of the internal nodes. Needed to provide accurate tip values. |
focal_time |
Integer. The time, in terms of time distance from the present, at which the tree and mapping must be cut. It must be smaller than the root age of the phylogeny. |
update_densityMaps |
Logical. Specify whether the mapped phylogeny ( |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip
must retained their initial |
The mapped phylogeny (densityMaps) is cut at a specific time in the past
(i.e. the focal_time) and the current trait values of the overlapping edges/branches are extracted.
—– Extract trait_data —–
Most likely ranges are extracted from the posterior probabilities displayed in the densityMaps.
The range with the highest probability is assigned to each tip and cut branches at focal_time.
True ML estimates will be used if tip_data and/or ace are provided as optional inputs.
In practice the discrepancy is negligible.
—– Update the densityMaps —–
To obtain updated densityMaps alongside the trait data, set update_densityMaps = TRUE.
The update consists in cutting off branches and mapping that are younger than the focal_time.
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The ancestral range mapping in densityMap ($tree$maps and $tree$mapped.edge) is updated accordingly by removing mapping associated with the cut off branches.
By default, the function returns a list with three elements.
$trait_data A named character string vector with ML ranges found along branches overlapping the focal_time. Names are the tip.label/tipward node ID.
$focal_time Integer. The time, in terms of time distance from the present, at which the trait data were extracted.
$trait_data_type Character string. Define the type of trait data as "biogeographic". Used in downstream analyses to select appropriate statistical processing.
If update_densityMaps = TRUE, the output is a list with four elements: $trait_data, $focal_time, $trait_data_type, and $densityMaps.
$densityMaps A list of objects of class "densityMap" that contains the updated densityMap of each range/range,
with branches and mapping that are younger than the focal_time cut off.
The function also adds multiple useful sub-elements to the $densityMaps$tree elements.
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
Maël Doré
cut_phylo_for_focal_time() cut_densityMaps_for_focal_time()
Associated main function: extract_most_likely_trait_values_for_focal_time()
Sub-functions for other types of trait data:
extract_most_likely_trait_values_from_contMap_for_focal_time()
extract_most_likely_states_from_densityMaps_for_focal_time()
## Load biogeographic range data mapped on a phylogeny data(eel_biogeo_data, package = "deepSTRAPP") # Explore data str(eel_biogeo_data, 1) eel_biogeo_data$densityMaps # Two density maps: one per unique area: A, B. eel_biogeo_data$densityMaps_all_ranges # Three density maps: one per range: A, B, and AB. # Set focal time to 10 Mya focal_time <- 10 # ----- Example 1: Using only unique areas ----- # # (May take several minutes to run) ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_biogeo_data_10My <- extract_most_likely_ranges_from_densityMaps_for_focal_time( densityMaps = eel_biogeo_data$densityMaps, # ace = eel_biogeo_data$ace, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(eel_biogeo_data_10My, 1) eel_biogeo_data_10My$trait_data ## Plot density maps as overlay of all range posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_biogeo_data$densityMaps, fsize = 0.7) abline(v = max(phytools::nodeHeights(eel_biogeo_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_biogeo_data_10My$densityMaps, fsize = 0.7) # ----- Example 2: Using all ranges ----- # # (May take several minutes to run) ## Extract trait data and update densityMaps_all_ranges for the given focal_time # Extract from the densityMaps eel_biogeo_data_10My <- extract_most_likely_ranges_from_densityMaps_for_focal_time( densityMaps = eel_biogeo_data$densityMaps_all_ranges, # ace = eel_biogeo_data$ace_all_ranges, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(eel_biogeo_data_10My, 1) eel_biogeo_data_10My$trait_data ## Plot density maps as overlay of all range posterior probabilities # Plot initial density maps with ACE pies root_age <- max(phytools::nodeHeights(eel_biogeo_data$densityMaps_all_ranges[[1]]$tree)[,2]) plot_densityMaps_overlay(densityMaps = eel_biogeo_data$densityMaps_all_ranges, fsize = 0.7) abline(v = root_age - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_biogeo_data_10My$densityMaps, fsize = 0.7)## Load biogeographic range data mapped on a phylogeny data(eel_biogeo_data, package = "deepSTRAPP") # Explore data str(eel_biogeo_data, 1) eel_biogeo_data$densityMaps # Two density maps: one per unique area: A, B. eel_biogeo_data$densityMaps_all_ranges # Three density maps: one per range: A, B, and AB. # Set focal time to 10 Mya focal_time <- 10 # ----- Example 1: Using only unique areas ----- # # (May take several minutes to run) ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_biogeo_data_10My <- extract_most_likely_ranges_from_densityMaps_for_focal_time( densityMaps = eel_biogeo_data$densityMaps, # ace = eel_biogeo_data$ace, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(eel_biogeo_data_10My, 1) eel_biogeo_data_10My$trait_data ## Plot density maps as overlay of all range posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_biogeo_data$densityMaps, fsize = 0.7) abline(v = max(phytools::nodeHeights(eel_biogeo_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_biogeo_data_10My$densityMaps, fsize = 0.7) # ----- Example 2: Using all ranges ----- # # (May take several minutes to run) ## Extract trait data and update densityMaps_all_ranges for the given focal_time # Extract from the densityMaps eel_biogeo_data_10My <- extract_most_likely_ranges_from_densityMaps_for_focal_time( densityMaps = eel_biogeo_data$densityMaps_all_ranges, # ace = eel_biogeo_data$ace_all_ranges, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(eel_biogeo_data_10My, 1) eel_biogeo_data_10My$trait_data ## Plot density maps as overlay of all range posterior probabilities # Plot initial density maps with ACE pies root_age <- max(phytools::nodeHeights(eel_biogeo_data$densityMaps_all_ranges[[1]]$tree)[,2]) plot_densityMaps_overlay(densityMaps = eel_biogeo_data$densityMaps_all_ranges, fsize = 0.7) abline(v = root_age - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_biogeo_data_10My$densityMaps, fsize = 0.7)
Extracts the most likely states found along branches
at a specific time in the past (i.e. the focal_time).
Optionally, the function can update the mapped phylogeny (densityMaps)
such as branches overlapping the focal_time are shorten to the focal_time,
and the trait mapping for the cut off branches are removed
by updating the $tree$maps and $tree$mapped.edge elements.
extract_most_likely_states_from_densityMaps_for_focal_time( densityMaps, ace = NULL, tip_data = NULL, focal_time, update_densityMaps = FALSE, keep_tip_labels = TRUE )extract_most_likely_states_from_densityMaps_for_focal_time( densityMaps, ace = NULL, tip_data = NULL, focal_time, update_densityMaps = FALSE, keep_tip_labels = TRUE )
densityMaps |
List of objects of class |
ace |
(Optional) Numerical matrix that record the posterior probabilities of ancestral states at internal nodes,
obtained with |
tip_data |
(Optional) Named character string vector of tip states. Names are nodes_ID of the internal nodes. Needed to provide accurate tip values. |
focal_time |
Integer. The time, in terms of time distance from the present, at which the tree and mapping must be cut. It must be smaller than the root age of the phylogeny. |
update_densityMaps |
Logical. Specify whether the mapped phylogeny ( |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip
must retained their initial |
The mapped phylogeny (densityMaps) is cut at a specific time in the past
(i.e. the focal_time) and the current trait values of the overlapping edges/branches are extracted.
—– Extract trait_data —–
Most likely states are extracted from the posterior probabilities displayed in the densityMaps.
The state with the highest probability is assigned to each tip and cut branches at focal_time.
True ML estimates will be used if tip_data and/or ace are provided as optional inputs.
In practice the discrepancy is negligible.
—– Update the densityMaps —–
To obtain updated densityMaps alongside the trait data, set update_densityMaps = TRUE.
The update consists in cutting off branches and mapping that are younger than the focal_time.
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The categorical trait mapping in densityMap ($tree$maps and $tree$mapped.edge) is updated accordingly by removing mapping associated with the cut off branches.
By default, the function returns a list with three elements.
$trait_data A named character string vector with ML states found along branches overlapping the focal_time. Names are the tip.label/tipward node ID.
$focal_time Integer. The time, in terms of time distance from the present, at which the trait data were extracted.
$trait_data_type Character string. Define the type of trait data as "categorical". Used in downstream analyses to select appropriate statistical processing.
If update_densityMaps = TRUE, the output is a list with four elements: $trait_data, $focal_time, $trait_data_type, and $densityMaps.
$densityMaps A list of objects of class "densityMap" that contains the updated densityMap of each state,
with branches and mapping that are younger than the focal_time cut off.
The function also adds multiple useful sub-elements to the $densityMaps$tree elements.
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
Maël Doré
cut_phylo_for_focal_time() cut_densityMaps_for_focal_time()
Associated main function: extract_most_likely_trait_values_for_focal_time()
Sub-functions for other types of trait data:
extract_most_likely_trait_values_from_contMap_for_focal_time()
extract_most_likely_ranges_from_densityMaps_for_focal_time()
# ----- Example 1: Only extent taxa (Ultrametric tree) ----- # ## Load categorical trait data mapped on a phylogeny data(eel_cat_3lvl_data, package = "deepSTRAPP") # Explore data str(eel_cat_3lvl_data, 1) eel_cat_3lvl_data$densityMaps # Three density maps: one per state # Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_cat_3lvl_data_10My <- extract_most_likely_states_from_densityMaps_for_focal_time( densityMaps = eel_cat_3lvl_data$densityMaps, # ace = eel_cat_3lvl_data$ace, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(eel_cat_3lvl_data_10My, 1) eel_cat_3lvl_data_10My$trait_data ## Plot density maps as overlay of all state posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_cat_3lvl_data$densityMaps, fsize = 0.5) abline(v = max(phytools::nodeHeights(eel_cat_3lvl_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_cat_3lvl_data_10My$densityMaps, fsize = 0.7) # ----- Example 2: Include fossils (Non-ultrametric tree) ----- # ## Test with non-ultrametric trees like mammals in motmot ## Prepare data # Load mammals phylogeny and data from the R package motmot included within deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") # Obtain mammal tree mammals_tree <- mammals$mammal.phy # Convert mass data into categories mammals_mass <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] mammals_data <- mammals_mass mammals_data[seq_along(mammals_data)] <- "small" mammals_data[mammals_mass > 5] <- "medium" mammals_data[mammals_mass > 10] <- "large" table(mammals_data) # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model mammals_cat_data <- prepare_trait_data(tip_data = mammals_data, phylo = mammals_tree, trait_data_type = "categorical", evolutionary_models = "ER", nb_simulations = 100, plot_map = FALSE) # Set focal time focal_time <- 80 ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps mammals_cat_data_80My <- extract_most_likely_states_from_densityMaps_for_focal_time( densityMaps = mammals_cat_data$densityMaps, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(mammals_cat_data_80My, 1) mammals_cat_data_80My$trait_data ## Plot density maps as overlay of all state posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = mammals_cat_data$densityMaps, fsize = 0.7) abline(v = max(phytools::nodeHeights(mammals_cat_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(mammals_cat_data_80My$densityMaps, fsize = 0.8)# ----- Example 1: Only extent taxa (Ultrametric tree) ----- # ## Load categorical trait data mapped on a phylogeny data(eel_cat_3lvl_data, package = "deepSTRAPP") # Explore data str(eel_cat_3lvl_data, 1) eel_cat_3lvl_data$densityMaps # Three density maps: one per state # Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_cat_3lvl_data_10My <- extract_most_likely_states_from_densityMaps_for_focal_time( densityMaps = eel_cat_3lvl_data$densityMaps, # ace = eel_cat_3lvl_data$ace, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(eel_cat_3lvl_data_10My, 1) eel_cat_3lvl_data_10My$trait_data ## Plot density maps as overlay of all state posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_cat_3lvl_data$densityMaps, fsize = 0.5) abline(v = max(phytools::nodeHeights(eel_cat_3lvl_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_cat_3lvl_data_10My$densityMaps, fsize = 0.7) # ----- Example 2: Include fossils (Non-ultrametric tree) ----- # ## Test with non-ultrametric trees like mammals in motmot ## Prepare data # Load mammals phylogeny and data from the R package motmot included within deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") # Obtain mammal tree mammals_tree <- mammals$mammal.phy # Convert mass data into categories mammals_mass <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] mammals_data <- mammals_mass mammals_data[seq_along(mammals_data)] <- "small" mammals_data[mammals_mass > 5] <- "medium" mammals_data[mammals_mass > 10] <- "large" table(mammals_data) # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model mammals_cat_data <- prepare_trait_data(tip_data = mammals_data, phylo = mammals_tree, trait_data_type = "categorical", evolutionary_models = "ER", nb_simulations = 100, plot_map = FALSE) # Set focal time focal_time <- 80 ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps mammals_cat_data_80My <- extract_most_likely_states_from_densityMaps_for_focal_time( densityMaps = mammals_cat_data$densityMaps, focal_time = focal_time, update_densityMaps = TRUE) ## Print trait data str(mammals_cat_data_80My, 1) mammals_cat_data_80My$trait_data ## Plot density maps as overlay of all state posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = mammals_cat_data$densityMaps, fsize = 0.7) abline(v = max(phytools::nodeHeights(mammals_cat_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(mammals_cat_data_80My$densityMaps, fsize = 0.8)
Extracts the most likely trait values found along branches
at a specific time in the past (i.e. the focal_time).
Optionally, the function can update the mapped phylogeny (contMap or densityMaps)
such as branches overlapping the focal_time are shorten to the focal_time,
and the trait mapping for the cut off branches are removed
by updating the $tree$maps and $tree$mapped.edge elements.
extract_most_likely_trait_values_for_focal_time( contMap = NULL, densityMaps = NULL, ace = NULL, tip_data = NULL, trait_data_type, focal_time, update_map = FALSE, keep_tip_labels = TRUE )extract_most_likely_trait_values_for_focal_time( contMap = NULL, densityMaps = NULL, ace = NULL, tip_data = NULL, trait_data_type, focal_time, update_map = FALSE, keep_tip_labels = TRUE )
contMap |
For continuous trait data. Object of class |
densityMaps |
For categorical trait or biogeographic data. List of objects of class |
ace |
(Optional) Ancestral Character Estimates (ACE) at the internal nodes.
Obtained with
|
tip_data |
(Optional) Named vector of tip values of the trait.
|
trait_data_type |
Character string. Specify the type of trait data. Must be one of "continuous", "categorical", "biogeographic". |
focal_time |
Integer. The time, in terms of time distance from the present, at which the tree and mapping must be cut. It must be smaller than the root age of the phylogeny. |
update_map |
Logical. Specify whether the mapped phylogeny ( |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip
must retained their initial |
The mapped phylogeny (contMap or densityMaps) is cut at a specific time in the past
(i.e. the focal_time) and the current trait values of the overlapping edges/branches are extracted.
—– Extract trait_data —–
For continuous trait data:
If providing only the contMap trait values at tips and internal nodes will be extracted from
the mapping of the contMap leading to a slight discrepancy with the actual tip data
and estimated ancestral character values.
True ML trait estimates will be used if tip_data and/or ace are provided as optional inputs.
In practice the discrepancy is negligible.
For categorical trait and biogeographic data:
Most likely states/ranges are extracted from the posterior probabilities displayed in the densityMaps.
The states/ranges with the highest probability is assigned to each tip and cut branches at focal_time.
True ML states/ranges will be used if tip_data and/or ace are provided as optional inputs.
In practice the discrepancy is negligible.
—– Update the contMap/densityMaps —–
To obtain an updated contMap/densityMaps alongside the trait data, set update_map = TRUE.
The update consists in cutting off branches and mapping that are younger than the focal_time.
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The mapping in contMap/densityMaps ($tree$maps and $tree$mapped.edge) is updated accordingly by removing mapping associated with the cut off branches.
A specific sub-function (that can be used independently) is called according to the type of trait data:
For continuous traits: extract_most_likely_trait_values_from_contMap_for_focal_time()
For categorical traits: extract_most_likely_states_from_densityMaps_for_focal_time()
For biogeographic ranges: extract_most_likely_ranges_from_densityMaps_for_focal_time()
By default, the function returns a list with three elements.
$trait_data A named numerical vector with ML trait values found along branches overlapping the focal_time. Names are the tip.label/tipward node ID.
$focal_time Integer. The time, in terms of time distance from the present, at which the trait data were extracted.
$trait_data_type Character string. Define the type of trait data as "continuous", "categorical", or "biogeographic". Used in downstream analyses to select appropriate statistical processing.
If update_map = TRUE, the output is a list with four elements: $trait_data, $focal_time, $trait_data_type, and $contMap or $densityMaps.
For continuous trait data:
$contMap An object of class "contMap" that contains the updated contMap with branches and mapping that are younger than the focal_time cut off.
The function also adds multiple useful sub-elements to the $contMap$tree element.
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
For categorical trait and biogeographic data:
$densityMaps A list of objects of class "densityMap" that contains the updated densityMap of each state/range,
with branches and mapping that are younger than the focal_time cut off.
The function also adds multiple useful sub-elements to the $densityMaps$tree elements.
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
Maël Doré
cut_phylo_for_focal_time() cut_contMap_for_focal_time() cut_densityMaps_for_focal_time()
Associated sub-functions per type of trait data:
extract_most_likely_trait_values_from_contMap_for_focal_time()
extract_most_likely_states_from_densityMaps_for_focal_time()
extract_most_likely_ranges_from_densityMaps_for_focal_time()
# ----- Example 1: Continuous trait ----- # ## Prepare data # Load eel data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 library(phytools) data(eel.tree) data(eel.data) # Extract body size eel_data <- setNames(eel.data$Max_TL_cm, rownames(eel.data)) # (May take several minutes to run) ## Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes eel_ACE <- phytools::fastAnc(tree = eel.tree, x = eel_data) ## Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object eel_contMap <- phytools::contMap(eel.tree, x = eel_data, res = 100, # Number of time steps plot = FALSE) # Set focal time to 50 Mya focal_time <- 50 ## Extract trait data and update contMap for the given focal_time # Extract from the contMap (values are not exact ML estimates) eel_cont_50 <- extract_most_likely_trait_values_for_focal_time( contMap = eel_contMap, trait_data_type = "continuous", focal_time = focal_time, update_map = TRUE) # Extract from tip data and ML estimates of ancestral characters (values are true ML estimates) eel_cont_50 <- extract_most_likely_trait_values_for_focal_time( contMap = eel_contMap, ace = eel_ACE, tip_data = eel_data, trait_data_type = "continuous", focal_time = focal_time, update_map = TRUE) ## Visualize outputs # Print trait data eel_cont_50$trait_data # Plot node labels on initial stochastic map with cut-off plot(eel_contMap, fsize = c(0.5, 1)) ape::nodelabels() abline(v = max(phytools::nodeHeights(eel_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated contMap with initial node labels plot(eel_cont_50$contMap) ape::nodelabels(text = eel_cont_50$contMap$tree$initial_nodes_ID) # ----- Example 2: Categorical trait ----- # # (May take several minutes to run) ## Load categorical trait data mapped on a phylogeny data(eel_cat_3lvl_data, package = "deepSTRAPP") # Explore data str(eel_cat_3lvl_data, 1) eel_cat_3lvl_data$densityMaps # Three density maps: one per state # Set focal time to 10 Mya focal_time <- 10 ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_cat_3lvl_data_10My <- extract_most_likely_trait_values_for_focal_time( densityMaps = eel_cat_3lvl_data$densityMaps, trait_data_type = "categorical", focal_time = focal_time, update_map = TRUE) ## Print trait data str(eel_cat_3lvl_data_10My, 1) eel_cat_3lvl_data_10My$trait_data ## Plot density maps as overlay of all state posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_cat_3lvl_data$densityMaps) abline(v = max(phytools::nodeHeights(eel_cat_3lvl_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_cat_3lvl_data_10My$densityMaps) # ----- Example 3: Biogeographic ranges ----- # # (May take several minutes to run) ## Load biogeographic range data mapped on a phylogeny data(eel_biogeo_data, package = "deepSTRAPP") # Explore data str(eel_biogeo_data, 1) eel_biogeo_data$densityMaps # Two density maps: one per unique area: A, B. eel_biogeo_data$densityMaps_all_ranges # Three density maps: one per range: A, B, and AB. # Set focal time to 10 Mya focal_time <- 10 ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_biogeo_data_10My <- extract_most_likely_trait_values_for_focal_time( densityMaps = eel_biogeo_data$densityMaps, # ace = eel_biogeo_data$ace, trait_data_type = "biogeographic", focal_time = focal_time, update_map = TRUE) ## Print trait data str(eel_biogeo_data_10My, 1) eel_biogeo_data_10My$trait_data ## Plot density maps as overlay of all range posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_biogeo_data$densityMaps) abline(v = max(phytools::nodeHeights(eel_biogeo_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_biogeo_data_10My$densityMaps)# ----- Example 1: Continuous trait ----- # ## Prepare data # Load eel data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 library(phytools) data(eel.tree) data(eel.data) # Extract body size eel_data <- setNames(eel.data$Max_TL_cm, rownames(eel.data)) # (May take several minutes to run) ## Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes eel_ACE <- phytools::fastAnc(tree = eel.tree, x = eel_data) ## Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object eel_contMap <- phytools::contMap(eel.tree, x = eel_data, res = 100, # Number of time steps plot = FALSE) # Set focal time to 50 Mya focal_time <- 50 ## Extract trait data and update contMap for the given focal_time # Extract from the contMap (values are not exact ML estimates) eel_cont_50 <- extract_most_likely_trait_values_for_focal_time( contMap = eel_contMap, trait_data_type = "continuous", focal_time = focal_time, update_map = TRUE) # Extract from tip data and ML estimates of ancestral characters (values are true ML estimates) eel_cont_50 <- extract_most_likely_trait_values_for_focal_time( contMap = eel_contMap, ace = eel_ACE, tip_data = eel_data, trait_data_type = "continuous", focal_time = focal_time, update_map = TRUE) ## Visualize outputs # Print trait data eel_cont_50$trait_data # Plot node labels on initial stochastic map with cut-off plot(eel_contMap, fsize = c(0.5, 1)) ape::nodelabels() abline(v = max(phytools::nodeHeights(eel_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated contMap with initial node labels plot(eel_cont_50$contMap) ape::nodelabels(text = eel_cont_50$contMap$tree$initial_nodes_ID) # ----- Example 2: Categorical trait ----- # # (May take several minutes to run) ## Load categorical trait data mapped on a phylogeny data(eel_cat_3lvl_data, package = "deepSTRAPP") # Explore data str(eel_cat_3lvl_data, 1) eel_cat_3lvl_data$densityMaps # Three density maps: one per state # Set focal time to 10 Mya focal_time <- 10 ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_cat_3lvl_data_10My <- extract_most_likely_trait_values_for_focal_time( densityMaps = eel_cat_3lvl_data$densityMaps, trait_data_type = "categorical", focal_time = focal_time, update_map = TRUE) ## Print trait data str(eel_cat_3lvl_data_10My, 1) eel_cat_3lvl_data_10My$trait_data ## Plot density maps as overlay of all state posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_cat_3lvl_data$densityMaps) abline(v = max(phytools::nodeHeights(eel_cat_3lvl_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_cat_3lvl_data_10My$densityMaps) # ----- Example 3: Biogeographic ranges ----- # # (May take several minutes to run) ## Load biogeographic range data mapped on a phylogeny data(eel_biogeo_data, package = "deepSTRAPP") # Explore data str(eel_biogeo_data, 1) eel_biogeo_data$densityMaps # Two density maps: one per unique area: A, B. eel_biogeo_data$densityMaps_all_ranges # Three density maps: one per range: A, B, and AB. # Set focal time to 10 Mya focal_time <- 10 ## Extract trait data and update densityMaps for the given focal_time # Extract from the densityMaps eel_biogeo_data_10My <- extract_most_likely_trait_values_for_focal_time( densityMaps = eel_biogeo_data$densityMaps, # ace = eel_biogeo_data$ace, trait_data_type = "biogeographic", focal_time = focal_time, update_map = TRUE) ## Print trait data str(eel_biogeo_data_10My, 1) eel_biogeo_data_10My$trait_data ## Plot density maps as overlay of all range posterior probabilities # Plot initial density maps with ACE pies plot_densityMaps_overlay(densityMaps = eel_biogeo_data$densityMaps) abline(v = max(phytools::nodeHeights(eel_biogeo_data$densityMaps[[1]]$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated densityMaps with ACE pies plot_densityMaps_overlay(eel_biogeo_data_10My$densityMaps)
Extracts the most likely trait values found along branches
at a specific time in the past (i.e. the focal_time).
Optionally, the function can update the mapped phylogeny (contMap) such as
branches overlapping the focal_time are shorten to the focal_time, and
the continuous trait mapping for the cut off branches are removed
by updating the $tree$maps and $tree$mapped.edge elements.
extract_most_likely_trait_values_from_contMap_for_focal_time( contMap, ace = NULL, tip_data = NULL, focal_time, update_contMap = FALSE, keep_tip_labels = TRUE )extract_most_likely_trait_values_from_contMap_for_focal_time( contMap, ace = NULL, tip_data = NULL, focal_time, update_contMap = FALSE, keep_tip_labels = TRUE )
contMap |
Object of class |
ace |
Named numeric vector (Optional). Ancestral Character Estimates (ACE) of the internal nodes,
typically generated with |
tip_data |
Named numeric vector (Optional). Tip values of the trait. Names are nodes_ID of the internal nodes. Needed to provide accurate tip values. |
focal_time |
Integer. The time, in terms of time distance from the present, at which the tree and mapping must be cut. It must be smaller than the root age of the phylogeny. |
update_contMap |
Logical. Specify whether the mapped phylogeny ( |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip
must retained their initial |
The mapped phylogeny (contMap) is cut at a specific time in the past
(i.e. the focal_time) and the current trait values of the overlapping edges/branches are extracted.
—– Extract trait_data —–
If providing only the contMap trait values at tips and internal nodes will be extracted from
the mapping of the contMap leading to a slight dependency with the actual tip data
and estimated ancestral character values.
True ML estimates will be used if tip_data and/or ace are provided as optional inputs.
In practice the discrepancy is negligible.
—– Update the contMap —–
To obtain an updated contMap alongside the trait data, set update_contMap = TRUE.
The update consists in cutting off branches and mapping that are younger than the focal_time.
When a branch with a single descendant tip is cut and keep_tip_labels = TRUE,
the leaf left is labeled with the tip.label of the unique descendant tip.
When a branch with a single descendant tip is cut and keep_tip_labels = FALSE,
the leaf left is labeled with the node ID of the unique descendant tip.
In all cases, when a branch with multiple descendant tips (i.e., a clade) is cut, the leaf left is labeled with the node ID of the MRCA of the cut-off clade.
The continuous trait mapping contMap ($tree$maps and $tree$mapped.edge) is updated accordingly by removing mapping associated with the cut off branches.
By default, the function returns a list with three elements.
$trait_data A named numerical vector with ML trait values found along branches overlapping the focal_time. Names are the tip.label/tipward node ID.
$focal_time Integer. The time, in terms of time distance from the present, at which the trait data were extracted.
$trait_data_type Character string. Define the type of trait data as "continuous". Used in downstream analyses to select appropriate statistical processing.
If update_contMap = TRUE, the output is a list with four elements: $trait_data, $focal_time, $trait_data_type, and $contMap.
$contMap An object of class that contains the updated contMap with branches and mapping that are younger than the focal_time cut off.
The function also adds multiple useful sub-elements to the $contMap$tree element.
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
Maël Doré
cut_phylo_for_focal_time() cut_contMap_for_focal_time()
Associated main function: extract_most_likely_trait_values_for_focal_time()
Sub-functions for other types of trait data:
extract_most_likely_states_from_densityMaps_for_focal_time()
extract_most_likely_ranges_from_densityMaps_for_focal_time()
# ----- Example 1: Only extent taxa (Ultrametric tree) ----- # ## Prepare data # Load eel data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 library(phytools) data(eel.tree) data(eel.data) # Extract body size eel_data <- setNames(eel.data$Max_TL_cm, rownames(eel.data)) # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes eel_ACE <- phytools::fastAnc(tree = eel.tree, x = eel_data) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object eel_contMap <- phytools::contMap(eel.tree, x = eel_data, res = 100, # Number of time steps plot = FALSE) # Set focal time to 50 Mya focal_time <- 50 # (May take several minutes to run) ## Extract trait data and update contMap for the given focal_time # Extract from the contMap (values are not exact ML estimates) eel_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = eel_contMap, focal_time = focal_time, update_contMap = TRUE) # Extract from tip data and ML estimates of ancestral characters (values are true ML estimates) eel_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = eel_contMap, ace = eel_ACE, tip_data = eel_data, focal_time = focal_time, update_contMap = TRUE) ## Visualize outputs # Print trait data eel_test$trait_data # Plot node labels on initial stochastic map with cut-off plot(eel_contMap, fsize = c(0.5, 1)) ape::nodelabels() abline(v = max(phytools::nodeHeights(eel_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated contMap with initial node labels plot(eel_test$contMap) ape::nodelabels(text = eel_test$contMap$tree$initial_nodes_ID) # ----- Example 2: Include fossils (Non-ultrametric tree) ----- # ## Test with non-ultrametric trees like mammals in motmot ## Prepare data # Load mammals phylogeny and data from the R package motmot included within deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") mammals_tree <- mammals$mammal.phy mammals_data <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes mammals_ACE <- phytools::fastAnc(tree = mammals_tree, x = mammals_data) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object mammals_contMap <- phytools::contMap(mammals_tree, x = mammals_data, res = 100, # Number of time steps plot = FALSE) # Set focal time to 80 Mya focal_time <- 80 # (May take several minutes to run) ## Extract trait data and update contMap for the given focal_time # Extract from the contMap (values are not exact ML estimates) mammals_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = mammals_contMap, focal_time = focal_time, update_contMap = TRUE) # Extract from tip data and ML estimates of ancestral characters (values are true ML) mammals_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = mammals_contMap, ace = mammals_ACE, tip_data = mammals_data, focal_time = focal_time, update_contMap = TRUE) ## Visualize outputs # Print trait data mammals_test$trait_data # Plot node labels on initial stochastic map with cut-off phytools::plot.contMap(mammals_contMap, fsize = c(0.5, 1)) ape::nodelabels() abline(v = max(phytools::nodeHeights(mammals_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated contMap with initial node labels phytools::plot.contMap(mammals_test$contMap, fsize = c(0.8, 1)) ape::nodelabels(text = mammals_test$contMap$tree$initial_nodes_ID)# ----- Example 1: Only extent taxa (Ultrametric tree) ----- # ## Prepare data # Load eel data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 library(phytools) data(eel.tree) data(eel.data) # Extract body size eel_data <- setNames(eel.data$Max_TL_cm, rownames(eel.data)) # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes eel_ACE <- phytools::fastAnc(tree = eel.tree, x = eel_data) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object eel_contMap <- phytools::contMap(eel.tree, x = eel_data, res = 100, # Number of time steps plot = FALSE) # Set focal time to 50 Mya focal_time <- 50 # (May take several minutes to run) ## Extract trait data and update contMap for the given focal_time # Extract from the contMap (values are not exact ML estimates) eel_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = eel_contMap, focal_time = focal_time, update_contMap = TRUE) # Extract from tip data and ML estimates of ancestral characters (values are true ML estimates) eel_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = eel_contMap, ace = eel_ACE, tip_data = eel_data, focal_time = focal_time, update_contMap = TRUE) ## Visualize outputs # Print trait data eel_test$trait_data # Plot node labels on initial stochastic map with cut-off plot(eel_contMap, fsize = c(0.5, 1)) ape::nodelabels() abline(v = max(phytools::nodeHeights(eel_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated contMap with initial node labels plot(eel_test$contMap) ape::nodelabels(text = eel_test$contMap$tree$initial_nodes_ID) # ----- Example 2: Include fossils (Non-ultrametric tree) ----- # ## Test with non-ultrametric trees like mammals in motmot ## Prepare data # Load mammals phylogeny and data from the R package motmot included within deepSTRAPP # Data source: Slater, 2013; DOI: 10.1111/2041-210X.12084 data("mammals", package = "deepSTRAPP") mammals_tree <- mammals$mammal.phy mammals_data <- setNames(object = mammals$mammal.mass$mean, nm = row.names(mammals$mammal.mass))[mammals_tree$tip.label] # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes mammals_ACE <- phytools::fastAnc(tree = mammals_tree, x = mammals_data) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object mammals_contMap <- phytools::contMap(mammals_tree, x = mammals_data, res = 100, # Number of time steps plot = FALSE) # Set focal time to 80 Mya focal_time <- 80 # (May take several minutes to run) ## Extract trait data and update contMap for the given focal_time # Extract from the contMap (values are not exact ML estimates) mammals_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = mammals_contMap, focal_time = focal_time, update_contMap = TRUE) # Extract from tip data and ML estimates of ancestral characters (values are true ML) mammals_test <- extract_most_likely_trait_values_from_contMap_for_focal_time( contMap = mammals_contMap, ace = mammals_ACE, tip_data = mammals_data, focal_time = focal_time, update_contMap = TRUE) ## Visualize outputs # Print trait data mammals_test$trait_data # Plot node labels on initial stochastic map with cut-off phytools::plot.contMap(mammals_contMap, fsize = c(0.5, 1)) ape::nodelabels() abline(v = max(phytools::nodeHeights(mammals_contMap$tree)[,2]) - focal_time, col = "red", lty = 2, lwd = 2) # Plot updated contMap with initial node labels phytools::plot.contMap(mammals_test$contMap, fsize = c(0.8, 1)) ape::nodelabels(text = mammals_test$contMap$tree$initial_nodes_ID)
Detect whether the current deepSTRAPP installation corresponds to a development version (e.g., version sourced from GitHub), as opposed to a CRAN release.
The development versions include deepSTRAPP outputs as additional internal datasets to help produces examples and vignettes outputs. These additional datasets are removed from the CRAN releases because their size is not compatible with CRAN policies.
This function is only used to check if an example must be ran or a vignette chunk evaluated to produce output from data.
is_dev_version(pkg = "deepSTRAPP")is_dev_version(pkg = "deepSTRAPP")
pkg |
Character string. Name of the R package for which the version type is inspected. Default is "deepSTRAPP". |
Logical. TRUE if running a development version.
Maël Doré
# Check the current deepSTRAPP installation is a development version is_dev_version(pkg = "deepSTRAPP")# Check the current deepSTRAPP installation is a development version is_dev_version(pkg = "deepSTRAPP")
A list containing two elements:
$mammal.mass A data.frame with mean and standard error of mammal body masses in ln(mass in g).
$mammal.phy A time-calibrated phylogeny of extinct and extant mammal families/genera.
Source: Slater, G. J. (2013). Phylogenetic evidence for a shift in the mode of mammalian body size evolution at the Cretaceous-Palaeogene boundary. Methods in Ecology and Evolution, 4(8), 734-744. doi:10.1111/2041-210X.12084
data(mammals)data(mammals)
A list with 2 elements.
$mammal.mass A data.frame of 213 observations and two columns. Values are numerical.
$mammal.phy A time-calibrated phylogeny of 211 tips.
Initial dataset from Slater, G. J. (2013). The R object was imported from R package motmot to include in deepSTRAPP.
Plot on a time-calibrated phylogeny the evolution of diversification rates and
the location of regime shifts estimated from a BAMM (Bayesian Analysis of Macroevolutionary Mixtures).
Each branch is colored accroding to the estimated rates of speciation, extinction, or net diversification
stored in an object of class bammdata. Rates can vary along time, thus colors evolved along individual branches.
This function is a wrapper of original functions from the R package {BAMMtools}:
Step 1: Use BAMMtools::plot.bammdata() to map rates on the phylogeny.
Step 2: Add the location of regime shifts with BAMMtools::addBAMMshifts() (if add_regime_shifts = TRUE).
plot_BAMM_rates( BAMM_object, rate_type = "net_diversification", method = "phylogram", add_regime_shifts = TRUE, configuration_type = "MAP", sample_index = 1, adjust_size_to_prob = TRUE, regimes_fill = "grey", regimes_size = 1, regimes_pch = 21, regimes_border_col = "black", regimes_border_width = 1, ..., display_plot = TRUE, PDF_file_path = NULL )plot_BAMM_rates( BAMM_object, rate_type = "net_diversification", method = "phylogram", add_regime_shifts = TRUE, configuration_type = "MAP", sample_index = 1, adjust_size_to_prob = TRUE, regimes_fill = "grey", regimes_size = 1, regimes_pch = 21, regimes_border_col = "black", regimes_border_width = 1, ..., display_plot = TRUE, PDF_file_path = NULL )
BAMM_object |
Object of class |
rate_type |
A character string specifying the type of diversification rates to plot. Must be one of 'speciation', 'extinction' or 'net_diversification' (default). |
method |
A character string indicating the method for plotting the phylogenetic tree.
|
add_regime_shifts |
Logical. Whether to add the location of regime shifts on the phylogeny (Step 2). Default is |
configuration_type |
A character string specifying how to select the location of regime shifts across posterior samples.
|
sample_index |
Integer. Index of the posterior samples to use to plot the location of regime shifts.
Used only if |
adjust_size_to_prob |
Logical. Whether to scale the size of the symbols showing the location of regime shifts according to
the marginal shift probability of the shift happening on each location/branch. This will only works if there is an |
regimes_fill |
Character string. Set the color of the background of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_size |
Numerical. Set the size of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_pch |
Integer. Set the shape of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_border_col |
Character string. Set the color of the border of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_border_width |
Numerical. Set the width of the border of the symbols showing the location of regime shifts.
Equivalent to the |
... |
Additional graphical arguments to pass down to |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
The main input BAMM_object is the typical output of prepare_diversification_data().
It provides information on rates and regimes shifts across the posterior samples of a BAMM.
$MAP_BAMM_object and $MSC_BAMM_object elements are required to plot regime shift locations following the
"MAP" or "MSC" configuration_type respectively.
A $MSP_tree element is required to scale the size of the symbols showing the location of regime shifts according marginal shift probabilities.
(If adjust_size_to_prob = TRUE).
The default option to display regime shift is to use the average locations from the posterior samples with the Maximum A Posteriori probability (MAP) configuration.
However, sometimes, multiple configurations have similarly high frequency in the posterior samples (See BAMMtools::credibleShiftSet().
An alternative is to use the average locations from posterior samples with the Maximum Shift Credibility (MSC) configuration instead.
This regime shift configuration has the highest product of marginal probabilities across branches where a shift is estimated.
It may differ from the MAP configuration. (See BAMMtools::maximumShiftCredibility()).
The function returns (invisibly) a list with three three elements similarly to BAMMtools::plot.bammdata().
$coords: A matrix of plot coordinates. Rows correspond to branches. Columns 1-2 are starting (x,y) coordinates of each branch and columns 3-4 are ending (x,y) coordinates of each branch. If method = "polar" a fifth column gives the angle(in radians) of each branch.
$colorbreaks: A vector of percentiles used to group macroevolutionary rates into color bins.
$colordens: A matrix of the kernel density estimates (column 2) of evolutionary rates (column 1) and the color (column 3) corresponding to each rate value.
Maël Doré
Original functions by Mike Grundler & Pascal Title in R package {BAMMtools}.
Initial functions in BAMMtools: BAMMtools::plot.bammdata() BAMMtools::addBAMMshifts()
Associated functions in deepSTRAPP: prepare_diversification_data() update_rates_and_regimes_for_focal_time() run_deepSTRAPP_for_focal_time() run_deepSTRAPP_over_time()
# Load BAMM output data(whale_BAMM_object, package = "deepSTRAPP") ## Plot overall mean rates with MAP configuration for regime shifts # (rates are averaged only all posterior samples) plot_BAMM_rates(whale_BAMM_object, add_regime_shifts = TRUE, configuration_type = "MAP", bg = "black", regimes_size = 3) ## Plot overall mean rates with MSC configuration for regime shifts # (rates are averaged only all posterior samples) plot_BAMM_rates(whale_BAMM_object, add_regime_shifts = TRUE, configuration_type = "MSC", bg = "black", regimes_size = 3) ## Plot mean MAP rates with regime shifts # (rates are averaged only across MAP samples) plot_BAMM_rates(whale_BAMM_object$MAP_BAMM_object, add_regime_shifts = TRUE, configuration_type = "index", # Set to index to use the regime shift data from the '$MAP_BAMM_object' regimes_size = 3, bg = "black") ## Plot mean MSC rates (rates averaged only across MSC samples) with regime shifts # (rates averaged only across MSC samples) plot_BAMM_rates(whale_BAMM_object$MSC_BAMM_object, add_regime_shifts = TRUE, configuration_type = "index", # Set to index to use the regime shift data from the '$MSC_BAMM_object' regimes_size = 3, bg = "black")# Load BAMM output data(whale_BAMM_object, package = "deepSTRAPP") ## Plot overall mean rates with MAP configuration for regime shifts # (rates are averaged only all posterior samples) plot_BAMM_rates(whale_BAMM_object, add_regime_shifts = TRUE, configuration_type = "MAP", bg = "black", regimes_size = 3) ## Plot overall mean rates with MSC configuration for regime shifts # (rates are averaged only all posterior samples) plot_BAMM_rates(whale_BAMM_object, add_regime_shifts = TRUE, configuration_type = "MSC", bg = "black", regimes_size = 3) ## Plot mean MAP rates with regime shifts # (rates are averaged only across MAP samples) plot_BAMM_rates(whale_BAMM_object$MAP_BAMM_object, add_regime_shifts = TRUE, configuration_type = "index", # Set to index to use the regime shift data from the '$MAP_BAMM_object' regimes_size = 3, bg = "black") ## Plot mean MSC rates (rates averaged only across MSC samples) with regime shifts # (rates averaged only across MSC samples) plot_BAMM_rates(whale_BAMM_object$MSC_BAMM_object, add_regime_shifts = TRUE, configuration_type = "index", # Set to index to use the regime shift data from the '$MSC_BAMM_object' regimes_size = 3, bg = "black")
Plot on a time-calibrated phylogeny the evolution of a continuous trait as
summarized in a contMap object typically generated with prepare_trait_data().
This function is a wrapper of original functions from the R package {phytools}:
Step 1: Use phytools::setMap() to update the color scale if requested.
Step 2: Use phytools::plot.contMap() to plot the mapped phylogeny.
plot_contMap( contMap, color_scale = NULL, ..., display_plot = TRUE, PDF_file_path = NULL )plot_contMap( contMap, color_scale = NULL, ..., display_plot = TRUE, PDF_file_path = NULL )
contMap |
List of class |
color_scale |
Vector of character string. List of colors to use to build the color scale with |
... |
Additional arguments to pass down to |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
This function is a wrapper of original functions phytools::setMap() and phytools::plot.contMap(). Additions are listed below:
The color scale can be controlled directly with the argument color_scale.
The plot can be exported in PDF using PDF_file_path to define the output file.
If display_plot = TRUE, the function plots the time-calibrated phylogeny displaying the evolution of a continuous trait.
If PDF_file_path is provided, the function exports the plot into a PDF file.
An object of class "contMap" with an (optionally) updated color scale ($cols) is returned invisibly.
Maël Doré
Original functions by Liam Revell in R package {phytools}. Contact: [email protected]
phytools::plot.contMap() plot_densityMaps_overlay()
# Load phylogeny data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load trait df data(Ponerinae_tree, package = "deepSTRAPP") ## Prepare trait data # (May take several minutes to run) # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree, x = Ponerinae_cont_tip_data) # Infer Ancestral Character Estimates based on a Brownian Motion model # and run a Stochastic Mapping to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap with the phytools method plot(x = Ponerinae_contMap, fsize = c(0.5, 1)) # Plot contMap with an updated color scale plot_contMap(contMap = Ponerinae_contMap, fsize = c(0.5, 1), color_scale = c("darkgreen", "limegreen", "orange", "red")) # PDF_file_path = "Ponerinae_contMap.pdf")# Load phylogeny data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load trait df data(Ponerinae_tree, package = "deepSTRAPP") ## Prepare trait data # (May take several minutes to run) # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree, x = Ponerinae_cont_tip_data) # Infer Ancestral Character Estimates based on a Brownian Motion model # and run a Stochastic Mapping to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap with the phytools method plot(x = Ponerinae_contMap, fsize = c(0.5, 1)) # Plot contMap with an updated color scale plot_contMap(contMap = Ponerinae_contMap, fsize = c(0.5, 1), color_scale = c("darkgreen", "limegreen", "orange", "red")) # PDF_file_path = "Ponerinae_contMap.pdf")
Plot on a time-calibrated phylogeny the evolution of a categorical trait/biogeographic ranges
summarized from densityMaps typically generated with prepare_trait_data().
Each branch is colored according to the posterior probability of being in a given state/range.
Color for each state/range are overlaid using transparency to produce a single plot for all states/ranges.
plot_densityMaps_overlay( densityMaps, colors_per_levels = NULL, add_ACE_pies = TRUE, cex_pies = 0.5, ace = NULL, ..., display_plot = TRUE, PDF_file_path = NULL )plot_densityMaps_overlay( densityMaps, colors_per_levels = NULL, add_ACE_pies = TRUE, cex_pies = 0.5, ace = NULL, ..., display_plot = TRUE, PDF_file_path = NULL )
densityMaps |
List of objects of class |
colors_per_levels |
Named character string. To set the colors to use to map each state/range posterior probabilities. Names = states/ranges; values = colors.
If |
add_ACE_pies |
Logical. Whether to add pies of posterior probabilities of states/ranges at internal nodes on the mapped phylogeny. Default = |
cex_pies |
Numerical. To adjust the size of the ACE pies. Default = |
ace |
Numerical matrix. To provide the posterior probabilities of ancestral states/ranges (characters) estimates (ACE) at internal nodes
used to plot the ACE pies. Rows are internal nodes. Columns are states/ranges. Values are posterior probabilities of each state per node.
Typically generated with |
... |
Additional arguments to pass down to |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
If display_plot = TRUE, the function plots a time-calibrated phylogeny displaying the evolution of a categorical trait/biogeographic ranges.
If PDF_file_path is provided, the function exports the plot into a PDF file.
Maël Doré
Original functions by Liam Revell in R package {phytools}. Contact: [email protected]
phytools::plot.densityMap() phytools::plotSimmap()
# Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Transform feeding mode data into a 3-level factor eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[c(1, 5, 6, 7, 10, 11, 15, 16, 17, 24, 25, 28, 30, 51, 52, 53, 55, 58, 60)] <- "kiss" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) # Manually define a Q_matrix for rate classes of state transition to use in the 'matrix' model # Does not allow transitions from state 1 ("bite") to state 2 ("kiss") or state 3 ("suction") # Does not allow transitions from state 3 ("suction") to state 1 ("bite") # Set symmetrical rates between state 2 ("kiss") and state 3 ("suction") Q_matrix = rbind(c(NA, 0, 0), c(1, NA, 2), c(0, 2, NA)) # Set colors per states colors_per_levels <- c("limegreen", "orange", "dodgerblue") names(colors_per_levels) <- c("bite", "kiss", "suction") # (May take several minutes to run) # Run evolutionary models to prepare trait data eel_cat_3lvl_data <- prepare_trait_data(tip_data = eel_data, phylo = eel.tree, trait_data_type = "categorical", colors_per_levels = colors_per_levels, evolutionary_models = c("ER", "SYM", "ARD", "meristic", "matrix"), Q_matrix = Q_matrix, nb_simulations = 1000, plot_map = TRUE, plot_overlay = TRUE, return_best_model_fit = TRUE, return_model_selection_df = TRUE) # Load directly output data(eel_cat_3lvl_data, package = "deepSTRAPP") # Plot densityMaps one by one plot(eel_cat_3lvl_data$densityMaps[[1]]) # densityMap for state n°1 ("bite") plot(eel_cat_3lvl_data$densityMaps[[2]]) # densityMap for state n°1 ("kiss") plot(eel_cat_3lvl_data$densityMaps[[3]]) # densityMap for state n°1 ("suction") # Plot overlay of all densityMaps plot_densityMaps_overlay(densityMaps = eel_cat_3lvl_data$densityMaps)# Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Transform feeding mode data into a 3-level factor eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[c(1, 5, 6, 7, 10, 11, 15, 16, 17, 24, 25, 28, 30, 51, 52, 53, 55, 58, 60)] <- "kiss" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) # Manually define a Q_matrix for rate classes of state transition to use in the 'matrix' model # Does not allow transitions from state 1 ("bite") to state 2 ("kiss") or state 3 ("suction") # Does not allow transitions from state 3 ("suction") to state 1 ("bite") # Set symmetrical rates between state 2 ("kiss") and state 3 ("suction") Q_matrix = rbind(c(NA, 0, 0), c(1, NA, 2), c(0, 2, NA)) # Set colors per states colors_per_levels <- c("limegreen", "orange", "dodgerblue") names(colors_per_levels) <- c("bite", "kiss", "suction") # (May take several minutes to run) # Run evolutionary models to prepare trait data eel_cat_3lvl_data <- prepare_trait_data(tip_data = eel_data, phylo = eel.tree, trait_data_type = "categorical", colors_per_levels = colors_per_levels, evolutionary_models = c("ER", "SYM", "ARD", "meristic", "matrix"), Q_matrix = Q_matrix, nb_simulations = 1000, plot_map = TRUE, plot_overlay = TRUE, return_best_model_fit = TRUE, return_model_selection_df = TRUE) # Load directly output data(eel_cat_3lvl_data, package = "deepSTRAPP") # Plot densityMaps one by one plot(eel_cat_3lvl_data$densityMaps[[1]]) # densityMap for state n°1 ("bite") plot(eel_cat_3lvl_data$densityMaps[[2]]) # densityMap for state n°1 ("kiss") plot(eel_cat_3lvl_data$densityMaps[[3]]) # densityMap for state n°1 ("suction") # Plot overlay of all densityMaps plot_densityMaps_overlay(densityMaps = eel_cat_3lvl_data$densityMaps)
Plot an histogram of the distribution of the test statistics
obtained from a STRAPP test carried out for a unique focal_time.
(See compute_STRAPP_test_for_focal_time() and
run_deepSTRAPP_for_focal_time()).
Returns a single histogram for overall tests.
If plot_posthoc_tests = TRUE, it will return a faceted plot with an histogram per post hoc tests.
If a PDF file path is provided in PDF_file_path, the plot will be saved directly in a PDF file.
plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs, focal_time = NULL, display_plot = TRUE, plot_posthoc_tests = FALSE, PDF_file_path = NULL )plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs, focal_time = NULL, display_plot = TRUE, plot_posthoc_tests = FALSE, PDF_file_path = NULL )
deepSTRAPP_outputs |
List of elements generated with |
focal_time |
Numerical. (Optional) If |
display_plot |
Logical. Whether to display the histogram(s) generated in the R console. Default is |
plot_posthoc_tests |
Logical. For multinominal data only. Whether to plot the histogram for the overall Kruskal-Wallis test across all states ( |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
The main input deepSTRAPP_outputs is the typical output of run_deepSTRAPP_for_focal_time().
It provides information on results of a STRAPP test performed at a given focal_time.
Histograms are built based on the distribution of the test statistics.
Such distributions are recorded in the outputs of a deepSTRAPP run carried out with run_deepSTRAPP_for_focal_time()
when return_perm_data = TRUE so that the distributions of test stats computed across posterior samples are returned
among the outputs under $STRAPP_results$perm_data_df.
For multinominal data (categorical or biogeographic data with more than 2 states), it is possible to plot the histograms of post hoc pairwise tests.
Set plot_posthoc_tests = TRUE to generate histograms for all the pairwise post hoc Dunn's test across pairs of states.
To achieve this, the deepSTRAPP_outputs input object must contain a $STRAPP_results$posthoc_pairwise_tests$perm_data_array element
that summarizes test statistics computed across posterior samples for all pairwise post hoc tests.
This is obtained from run_deepSTRAPP_for_focal_time() when setting both posthoc_pairwise_tests = TRUE to carry out post hoc tests,
and return_perm_data = TRUE to record distributions of test statistics.
Alternatively, the main input deepSTRAPP_outputs can be the output of run_deepSTRAPP_over_time(),
providing results of STRAPP tests over multiple time-steps. In this case, you must provide a focal_time to select the
unique time-step used for plotting.
return_perm_data must be set to TRUE so that the permutation data used to compute the tests are returned among the outputs
under $STRAPP_results_over_time[[i]]$perm_data_df.
posthoc_pairwise_tests must be set to TRUE so that the permutation data used to performed the post hoc tests are also
returned among the outputs under $STRAPP_results_over_time[[i]]$posthoc_pairwise_tests$perm_data_array.
For plotting all time-steps at once, see plot_histograms_STRAPP_tests_over_time().
By default, the function returns a list of classes gg and ggplot.
This object is a ggplot that can be displayed on the console with print(output).
It corresponds to the histogram being displayed on the console when the function is run, if display_plot = TRUE, and can be further
modify for aesthetics using the ggplot2 grammar.
If using multinominal data and set plot_posthoc_tests = TRUE, the function will return a list of objects.
Each object is the ggplot associated with a pairwise post hoc test.
To plot each histogram i individually, use print(output_list[[i]]).
To plot all histograms at once in a multifaceted plot, as displayed on the console if display_plot = TRUE, use cowplot::plot_grid(plotlist = output_list).
Each plot also displays summary statistics for the STRAPP test associated with the data displayed.
The quantile of null statistic distribution at the significant threshold used to define test significance. This is the value found on the red dashed line. The test will be considered significant (i.e., the null hypothesis is rejected) if this value is higher than zero (the black dashed line).
The p-value of the STRAPP test which correspond the proportion of cases in which the statistics was lower than expected under the null hypothesis (i.e., the proportion of the histogram found below / on the left-side of the black dashed line).
If a PDF_file_path is provided, the function will also generate a PDF file of the plot. For post hoc tests, this will save the multifaceted plot.
Maël Doré
Associated functions in deepSTRAPP: run_deepSTRAPP_for_focal_time() plot_histograms_STRAPP_tests_over_time()
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cont_old_calib_10My <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_10My, max.level = 1) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_for_focal_time() ----- # histogram_ggplot <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_10My, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_10My.pdf", plot_posthoc_tests = FALSE) # Adjust aesthetics a posteriori histogram_ggplot_adj <- histogram_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_over_time() ----- # ## Load directly outputs from run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Select focal_time = 10My focal_time <- 10 ## Plot histogram for overall test histogram_ggplot <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = focal_time, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_10My.pdf", plot_posthoc_tests = FALSE) # Adjust aesthetics a posteriori histogram_ggplot_adj <- histogram_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ----- Example 2: Categorical trait ----- # ## Load data # Load phylogeny data(Ponerinae_tree, package = "deepSTRAPP") # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_states = colors_per_states, evolutionary_models = "ARD", # Use default ARD model nb_simulations = 100, # Reduce number of simulations to save time seed = 1234, # Seet seed for reproducibility return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, max.level = 1) # ------ Plot histogram of STRAPP overall test results ------ # histogram_ggplot <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_overall_test.pdf", plot_posthoc_tests = FALSE) # Adjust aesthetics a posteriori histogram_ggplot_adj <- histogram_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ------ Plot histograms of STRAPP post hoc test results ------ # histograms_ggplot_list <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histograms_posthoc_tests.pdf", plot_posthoc_tests = TRUE) # Plot all histograms one by one print(histograms_ggplot_list) # Plot all histograms on one faceted plot cowplot::plot_grid(plotlist = histograms_ggplot_list) }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cont_old_calib_10My <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_10My, max.level = 1) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_for_focal_time() ----- # histogram_ggplot <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_10My, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_10My.pdf", plot_posthoc_tests = FALSE) # Adjust aesthetics a posteriori histogram_ggplot_adj <- histogram_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_over_time() ----- # ## Load directly outputs from run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Select focal_time = 10My focal_time <- 10 ## Plot histogram for overall test histogram_ggplot <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = focal_time, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_10My.pdf", plot_posthoc_tests = FALSE) # Adjust aesthetics a posteriori histogram_ggplot_adj <- histogram_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ----- Example 2: Categorical trait ----- # ## Load data # Load phylogeny data(Ponerinae_tree, package = "deepSTRAPP") # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_states = colors_per_states, evolutionary_models = "ARD", # Use default ARD model nb_simulations = 100, # Reduce number of simulations to save time seed = 1234, # Seet seed for reproducibility return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, max.level = 1) # ------ Plot histogram of STRAPP overall test results ------ # histogram_ggplot <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_overall_test.pdf", plot_posthoc_tests = FALSE) # Adjust aesthetics a posteriori histogram_ggplot_adj <- histogram_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ------ Plot histograms of STRAPP post hoc test results ------ # histograms_ggplot_list <- plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histograms_posthoc_tests.pdf", plot_posthoc_tests = TRUE) # Plot all histograms one by one print(histograms_ggplot_list) # Plot all histograms on one faceted plot cowplot::plot_grid(plotlist = histograms_ggplot_list) }
Plot an histogram of the distribution of the test statistics
obtained from a deepSTRAPP workflow carried out for each focal time in $time_steps.
Main input = output of a deepSTRAPP run over time using run_deepSTRAPP_over_time()).
Returns one histogram for overall tests for each focal time in $time_steps.
If plot_posthoc_tests = TRUE, it will return one faceted plot with an histogram
per post hoc tests for each focal time in $time_steps.
If a PDF file path is provided in PDF_file_path, the plots will be saved directly in a PDF file,
with one page per focal time in $time_steps.
plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs, display_plots = TRUE, plot_posthoc_tests = FALSE, PDF_file_path = NULL )plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs, display_plots = TRUE, plot_posthoc_tests = FALSE, PDF_file_path = NULL )
deepSTRAPP_outputs |
List of elements generated with |
display_plots |
Logical. Whether to display the histograms generated in the R console. Default is |
plot_posthoc_tests |
Logical. For multinominal data only. Whether to plot the histograms for the overall Kruskal-Wallis test across all states ( |
PDF_file_path |
Character string. If provided, the plots will be saved in a unique PDF file following the path provided here. The path must end with ".pdf".
Each page of the PDF corresponds to a focal time in |
The main input deepSTRAPP_outputs is the typical output of run_deepSTRAPP_over_time().
It provides information on results of a STRAPP tests performed over multiple time-steps.
Histograms are built based on the distribution of the test statistics.
Such distributions are recorded in the outputs of a deepSTRAPP run carried out with run_deepSTRAPP_over_time()
when return_STRAPP_results = TRUE AND return_perm_data = TRUE. The $STRAPP_results_over_time objects provided within the input are lists that must contain
a $perm_data_df element that summarizes test statistics computed across posterior samples.
For multinominal data (categorical or biogeographic data with more than 2 states), it is possible to plot the histograms of post hoc pairwise tests.
Set plot_posthoc_tests = TRUE to generate histograms for all the pairwise post hoc Dunn's test across pairs of states.
To achieve this, the $STRAPP_results_over_time objects must contain a $posthoc_pairwise_tests$perm_data_array element that summarizes test statistics
computed across posterior samples for all pairwise post hoc tests. This is obtained from run_deepSTRAPP_over_time() when setting
return_STRAPP_results = TRUE to return the STRAPP results, posthoc_pairwise_tests = TRUE to carry out post hoc tests,
and return_perm_data = TRUE to record distributions of test statistics.
Time-steps for which the data do not yield more than two states/ranges will show a warning and generate no plot.
By default, the function returns a list of sub-lists of classes gg and ggplot ordered as in $time_steps.
Each sub-list corresponds to a ggplot for a given focal_time i that can be displayed on the console with print(output[[i]]).
If display_plots = TRUE, the histograms are being displayed on the console one by one while generated.
If using multinominal data and set plot_posthoc_tests = TRUE, the function will return a list of sub-lists of objects ordered as in $time_steps.
Each sub-list is a list of the ggplots associated with pairwise post hoc tests carried out for this a given focal_time.
For a given focal_time i, to plot each histogram j individually, use print(output_list[[i]][[j]]).
To plot all histograms of a given focal_time i at once in a multifaceted plot, as displayed sequentially on the console if display_plots = TRUE,
use cowplot::plot_grid(plotlist = output_list[[i]]).
Each plot also displays summary statistics for the STRAPP test associated with the data displayed.
The quantile of null statistic distribution at the significant threshold used to define test significance. This is the value found on the red dashed line. The test will be considered significant (i.e., the null hypothesis is rejected) if this value is higher than zero (the black dashed line).
The p-value of the STRAPP test which correspond the proportion of cases in which the statistics was lower than expected under the null hypothesis (i.e., the proportion of the histogram found below / on the left-side of the black dashed line).
If a PDF_file_path is provided, the function will also generate a PDF file of the plots with one page per $time_steps.
For post hoc tests, this will save the multifaceted plots.
Maël Doré
Associated functions in deepSTRAPP: run_deepSTRAPP_over_time() plot_histogram_STRAPP_test_for_focal_time()
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Plot histograms of STRAPP overall test results # Tests are Spearman's rank correlation tests # Plot all histograms histogram_ggplots <- plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_overall_test.pdf", plot_posthoc_tests = FALSE) # Print histogram for time step 1 = 0 My print(histogram_ggplots[[1]]) # Adjust aesthetics of plot for time step 1 a posteriori histogram_ggplot_adj <- histogram_ggplots[[1]] + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ----- Example 2: Categorical data ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", nb_simulations = 100, return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly trait data output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates across time-steps. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, rate_type = "net_diversification", seed = 1234, # Set for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly deepSTRAPP output data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, max.level = 1) ## Plot histograms of STRAPP overall test results # # Tests are Kruskall-Wallis H tests when more than two states/ranges are present. # Tests are Mann–Whitney–Wilcoxon rank-sum tests when only two states/ranges are present. histogram_ggplots <- plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histograms_overall_tests.pdf", plot_posthoc_tests = FALSE) # Print histogram for time step 1 = 0 My print(histogram_ggplots[[1]]) # Adjust aesthetics of plot for time step 1 a posteriori histogram_ggplot_adj <- histogram_ggplots[[1]] + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) ## Plot histograms of STRAPP post hoc test results ------ # # Tests are Dunn's multiple comparison pairwise post hoc tests possible # only when more than two states/ranges are present. histograms_ggplots_list <- plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histograms_posthoc_tests.pdf", plot_posthoc_tests = TRUE) # Print all histograms for time step 1 (= 0 My) one by one print(histograms_ggplots_list[[1]]) # Plot all histograms for time step 1 (= 0 My) on one faceted plot cowplot::plot_grid(plotlist = histograms_ggplots_list[[1]]) }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Plot histograms of STRAPP overall test results # Tests are Spearman's rank correlation tests # Plot all histograms histogram_ggplots <- plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histogram_overall_test.pdf", plot_posthoc_tests = FALSE) # Print histogram for time step 1 = 0 My print(histogram_ggplots[[1]]) # Adjust aesthetics of plot for time step 1 a posteriori histogram_ggplot_adj <- histogram_ggplots[[1]] + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) # ----- Example 2: Categorical data ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", nb_simulations = 100, return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly trait data output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates across time-steps. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, rate_type = "net_diversification", seed = 1234, # Set for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly deepSTRAPP output data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, max.level = 1) ## Plot histograms of STRAPP overall test results # # Tests are Kruskall-Wallis H tests when more than two states/ranges are present. # Tests are Mann–Whitney–Wilcoxon rank-sum tests when only two states/ranges are present. histogram_ggplots <- plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histograms_overall_tests.pdf", plot_posthoc_tests = FALSE) # Print histogram for time step 1 = 0 My print(histogram_ggplots[[1]]) # Adjust aesthetics of plot for time step 1 a posteriori histogram_ggplot_adj <- histogram_ggplots[[1]] + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(histogram_ggplot_adj) ## Plot histograms of STRAPP post hoc test results ------ # # Tests are Dunn's multiple comparison pairwise post hoc tests possible # only when more than two states/ranges are present. histograms_ggplots_list <- plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, display_plot = TRUE, # PDF_file_path = "./plot_STRAPP_histograms_posthoc_tests.pdf", plot_posthoc_tests = TRUE) # Print all histograms for time step 1 (= 0 My) one by one print(histograms_ggplots_list[[1]]) # Plot all histograms for time step 1 (= 0 My) on one faceted plot cowplot::plot_grid(plotlist = histograms_ggplots_list[[1]]) }
Plot the evolution of diversification rates in relation to trait values
extracted for multiple time_steps with run_deepSTRAPP_over_time().
Rates are averaged across branches at each time step (i.e., focal_time).
For continuous data, branches are grouped by ranges of trait values defined by quantile_ranges.
For categorical data, branches are grouped by trait states.
For biogeographic data, branches are grouped by ranges.
plot_rates_through_time( deepSTRAPP_outputs, rate_type = "net_diversification", quantile_ranges = c(0, 0.25, 0.5, 0.75, 1), select_trait_levels = "all", time_range = NULL, color_scale = NULL, colors_per_levels = NULL, plot_CI = FALSE, CI_type = "fuzzy", CI_quantiles = 0.95, display_plot = TRUE, PDF_file_path = NULL, return_mean_data_per_samples_df = FALSE, return_median_data_across_samples_df = FALSE )plot_rates_through_time( deepSTRAPP_outputs, rate_type = "net_diversification", quantile_ranges = c(0, 0.25, 0.5, 0.75, 1), select_trait_levels = "all", time_range = NULL, color_scale = NULL, colors_per_levels = NULL, plot_CI = FALSE, CI_type = "fuzzy", CI_quantiles = 0.95, display_plot = TRUE, PDF_file_path = NULL, return_mean_data_per_samples_df = FALSE, return_median_data_across_samples_df = FALSE )
deepSTRAPP_outputs |
List of elements generated with |
rate_type |
A character string specifying the type of diversification rates to use.
Must be one of 'speciation', 'extinction' or 'net_diversification' (default).
Even if the |
quantile_ranges |
Vector of numerical. Only for continuous trait data. Quantiles used as thresholds to group branches
by trait values. It must start with 0 and finish with 1. Default is |
select_trait_levels |
(Vector of) character string. Only for categorical and biogeographic trait data.
To provide a list of a subset of states/ranges to plot. Names must match the ones found in
|
time_range |
Vector of two numerical values. Time boundaries used for the plot.
If |
color_scale |
Vector of character string. List of colors to use to build the color scale with |
colors_per_levels |
Named character string. To set the colors to use to plot rates of each state/range. Names = states/ranges; values = colors.
If |
plot_CI |
Logical. Whether to plot a confidence interval (CI) based on the distribution of rates found in posterior samples. Default is |
CI_type |
Character string. To select the type of confidence interval (CI) to plot.
|
CI_quantiles |
Numerical. Proportion of rate values across posterior samples encompassed by the confidence interval. Only if |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
return_mean_data_per_samples_df |
Logical. Whether to include in the output the data.frame of mean rates per trait values computed for
each posterior sample at each time-step (aggregated across groups of branches based on trait data). This is used to draw the confidence interval. Default is |
return_median_data_across_samples_df |
Logical. Whether to include in the output the data.frame of median rates per trait values
across posterior samples computed for at each time-step (aggregated across groups of branches based on trait data AND posterior samples).
This is used to draw the lines on the plot. Default is |
The function returns a list with at least one element.
rates_TT_ggplot An object of classes gg and ggplot. This is a ggplot that can be displayed
on the console with print(output$rates_TT_ggplot). It corresponds to the plot being displayed on the console
when the function is run, if display_plot = TRUE, and can be further modify for aesthetics using the ggplot2 grammar.
Optional summary data frames:
mean_data_per_samples_df A data.frame with four columns providing the $mean_rates observed along branches
with a similar $trait_value (if categorical or biogeographic) or falling into the same $quantile_ranges.
Data are extracted for each posterior sample ($BAMM_sample_ID) at each time-step (i.e., $focal_time).
This is used to draw the confidence interval. Included if return_mean_data_per_samples_df = TRUE.
$median_data_across_samples_df A data.frame with three columns providing the $median_rates
observed across all posterior samples in $mean_data_per_samples_df. This is used to draw the lines on the plot.
Included if return_median_data_across_samples_df = TRUE.
If a PDF_file_path is provided, the function will also generate a PDF file of the plot.
Maël Doré
For a guided tutorial, see this vignette: vignette("plot_rates_through_time", package = "deepSTRAPP")
# ------ Example 1: Plot rates through time for continuous data ------ # if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Visualize trait data hist(Ponerinae_deepSTRAPP_cont_old_calib_0_40$trait_data_df_over_time$trait_value, xlab = "Trait values", main = NULL) # Generate plot plotTT_continuous <- plot_rates_through_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, quantile_ranges = c(0, 0.25, 0.5, 0.75, 1.0), time_range = c(0, 50), # Control range of the X-axis # color_scale = c("limegreen", "red"), plot_CI = TRUE, CI_type = "quantiles_rect", CI_quantiles = 0.9, display_plot = FALSE, # PDF_file_path = "./plotTT_continuous.pdf", return_mean_data_per_samples_df = TRUE, return_median_data_across_samples_df = TRUE) # Explore output # str(plotTT_continuous, max.level = 1) # Plot print(plotTT_continuous$rates_TT_ggplot) # Adjust aesthetics of plot a posteriori plotTT_continuous_adj <- plotTT_continuous$rates_TT_ggplot + ggplot2::theme( plot.title = ggplot2::element_text(color = "red", size = 15), axis.title = ggplot2::element_text(size = 14), axis.text = ggplot2::element_text(size = 12)) # Plot again print(plotTT_continuous_adj) } # ------ Example 2: Plot rates through time for categorical data ------ # if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore trait data table(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40$trait_data_df_over_time$trait_value) # Set colors to use colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # Generate plot only for "arboreal" and "terricolous" plotTT_categorical <- plot_rates_through_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, select_trait_levels = c("arboreal", "terricolous"), time_range = c(0, 50), colors_per_levels = colors_per_states, plot_CI = TRUE, CI_type = "quantiles_rect", CI_quantiles = 0.9, display_plot = FALSE, # PDF_file_path = "./plotTT_categorical.pdf", return_mean_data_per_samples_df = TRUE, return_median_data_across_samples_df = TRUE) # Explore output # str(plotTT_categorical, max.level = 1) # Adjust aesthetics of plot a posteriori plotTT_categorical_adj <- plotTT_categorical$rates_TT_ggplot + ggplot2::theme( plot.title = ggplot2::element_text(size = 15), axis.title = ggplot2::element_text(size = 14), axis.text = ggplot2::element_text(size = 12)) print(plotTT_categorical_adj) } # ------ Example 3: Plot rates through time for biogeographic data ------ # if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore range data table(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40$trait_data_df_over_time$trait_value) # Set colors to use colors_per_ranges <- c("mediumpurple2", "peachpuff2") names(colors_per_ranges) <- c("N", "O") plotTT_biogeographic <- plot_rates_through_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, select_trait_levels = "all", time_range = c(0, 50), colors_per_levels = colors_per_ranges, plot_CI = TRUE, CI_type = "quantiles_rect", CI_quantiles = 0.9, display_plot = FALSE, # PDF_file_path = "./plotTT_biogeographic.pdf", return_mean_data_per_samples_df = TRUE, return_median_data_across_samples_df = TRUE) # Explore output # str(plotTT_biogeographic, max.level = 1) # Adjust aesthetics of plot a posteriori plotTT_biogeographic_adj <- plotTT_biogeographic$rates_TT_ggplot + ggplot2::theme( plot.title = ggplot2::element_text(size = 15), axis.title = ggplot2::element_text(size = 14), axis.text = ggplot2::element_text(size = 12)) print(plotTT_biogeographic_adj) }# ------ Example 1: Plot rates through time for continuous data ------ # if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Visualize trait data hist(Ponerinae_deepSTRAPP_cont_old_calib_0_40$trait_data_df_over_time$trait_value, xlab = "Trait values", main = NULL) # Generate plot plotTT_continuous <- plot_rates_through_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, quantile_ranges = c(0, 0.25, 0.5, 0.75, 1.0), time_range = c(0, 50), # Control range of the X-axis # color_scale = c("limegreen", "red"), plot_CI = TRUE, CI_type = "quantiles_rect", CI_quantiles = 0.9, display_plot = FALSE, # PDF_file_path = "./plotTT_continuous.pdf", return_mean_data_per_samples_df = TRUE, return_median_data_across_samples_df = TRUE) # Explore output # str(plotTT_continuous, max.level = 1) # Plot print(plotTT_continuous$rates_TT_ggplot) # Adjust aesthetics of plot a posteriori plotTT_continuous_adj <- plotTT_continuous$rates_TT_ggplot + ggplot2::theme( plot.title = ggplot2::element_text(color = "red", size = 15), axis.title = ggplot2::element_text(size = 14), axis.text = ggplot2::element_text(size = 12)) # Plot again print(plotTT_continuous_adj) } # ------ Example 2: Plot rates through time for categorical data ------ # if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore trait data table(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40$trait_data_df_over_time$trait_value) # Set colors to use colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # Generate plot only for "arboreal" and "terricolous" plotTT_categorical <- plot_rates_through_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, select_trait_levels = c("arboreal", "terricolous"), time_range = c(0, 50), colors_per_levels = colors_per_states, plot_CI = TRUE, CI_type = "quantiles_rect", CI_quantiles = 0.9, display_plot = FALSE, # PDF_file_path = "./plotTT_categorical.pdf", return_mean_data_per_samples_df = TRUE, return_median_data_across_samples_df = TRUE) # Explore output # str(plotTT_categorical, max.level = 1) # Adjust aesthetics of plot a posteriori plotTT_categorical_adj <- plotTT_categorical$rates_TT_ggplot + ggplot2::theme( plot.title = ggplot2::element_text(size = 15), axis.title = ggplot2::element_text(size = 14), axis.text = ggplot2::element_text(size = 12)) print(plotTT_categorical_adj) } # ------ Example 3: Plot rates through time for biogeographic data ------ # if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore range data table(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40$trait_data_df_over_time$trait_value) # Set colors to use colors_per_ranges <- c("mediumpurple2", "peachpuff2") names(colors_per_ranges) <- c("N", "O") plotTT_biogeographic <- plot_rates_through_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, select_trait_levels = "all", time_range = c(0, 50), colors_per_levels = colors_per_ranges, plot_CI = TRUE, CI_type = "quantiles_rect", CI_quantiles = 0.9, display_plot = FALSE, # PDF_file_path = "./plotTT_biogeographic.pdf", return_mean_data_per_samples_df = TRUE, return_median_data_across_samples_df = TRUE) # Explore output # str(plotTT_biogeographic, max.level = 1) # Adjust aesthetics of plot a posteriori plotTT_biogeographic_adj <- plotTT_biogeographic$rates_TT_ggplot + ggplot2::theme( plot.title = ggplot2::element_text(size = 15), axis.title = ggplot2::element_text(size = 14), axis.text = ggplot2::element_text(size = 12)) print(plotTT_biogeographic_adj) }
Plot rates vs. trait data as extracted for a given focal time.
Data are extracted from the output of a deepSTRAPP run carried out with
run_deepSTRAPP_for_focal_time() or
run_deepSTRAPP_over_time()).
Returns a single plot showing rates vs. trait data for a given focal time. If the trait data are 'continuous', the plot is a scatter plot. If the trait data are 'categorical' or 'biogeographic', the plot is a boxplot.
If a PDF file path is provided in PDF_file_path, the plot will be saved directly in a PDF file.
plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs, focal_time = NULL, rate_type = "net_diversification", select_trait_levels = "all", color_scale = NULL, colors_per_levels = NULL, display_plot = TRUE, PDF_file_path = NULL, return_mean_rates_vs_trait_data_df = FALSE )plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs, focal_time = NULL, rate_type = "net_diversification", select_trait_levels = "all", color_scale = NULL, colors_per_levels = NULL, display_plot = TRUE, PDF_file_path = NULL, return_mean_rates_vs_trait_data_df = FALSE )
deepSTRAPP_outputs |
List of elements generated with |
focal_time |
Numerical. (Optional) If |
rate_type |
A character string specifying the type of diversification rates to plot.
Must be one of 'speciation', 'extinction' or 'net_diversification' (default).
Even if the |
select_trait_levels |
(Vector of) character string. Only for categorical and biogeographic trait data.
To provide a list of a subset of states/ranges to plot. Names must match the ones found in the |
color_scale |
Vector of character string. List of colors to use to build the color scale with |
colors_per_levels |
Named character string. To set the colors to use to plot data points and box for each state/range. Names = states/ranges; values = colors.
If |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
return_mean_rates_vs_trait_data_df |
Logical. Whether to include in the output the data.frame of mean rates per trait values/states/ranges computed for
each posterior sample at the focal time. Default is |
The main input deepSTRAPP_outputs is the typical output of run_deepSTRAPP_for_focal_time().
It provides information on results of a STRAPP test performed at a given focal_time.
Plots are built based on both trait data and diversification data as extracted for the given focal_time.
Such data are recorded in the outputs of a deepSTRAPP run carried out with run_deepSTRAPP_for_focal_time()
when return_updated_trait_data_with_Map = TRUE for trait data, and extract_diversification_data_melted_df = TRUE for diversification data.
Please ensure to select those arguments when running deepSTRAPP.
Alternatively, the main input deepSTRAPP_outputs can be the output of run_deepSTRAPP_over_time(),
providing results of STRAPP tests over multiple time-steps. In this case, you must provide a focal_time to select the
unique time-step used for plotting.
return_updated_trait_data_with_Map must be set to TRUE so that the trait data used to compute the tests are returned among the outputs
under $updated_trait_data_with_Map_over_time. Alternatively, and more efficiently, extract_trait_data_melted_df can be set to TRUE
so that trait data are already returned in a melted data.frame among the outputs under $trait_data_df_over_time.
extract_diversification_data_melted_df must be set to TRUE so that the diversification rates are returned
among the outputs under $diversification_data_df_over_time.
For plotting all time-steps at once, see plot_rates_vs_trait_data_over_time().
The function returns a list with at least one element.
rates_vs_trait_ggplot An object of classes gg and ggplot. This is a ggplot that can be displayed
on the console with print(output$rates_vs_trait_ggplot). It corresponds to the plot being displayed on the console
when the function is run, if display_plot = TRUE, and can be further modify for aesthetics using the ggplot2 grammar.
If the trait data are 'continuous', the plot is a scatter plot showing how diversification rates varies with trait values. If the trait data are 'categorical' or 'biogeographic', the plot is a boxplot showing diversification rates per states/ranges.
Each plot also displays summary statistics for the STRAPP test associated with the data displayed:
An observed statistic computed across the mean traits/ranges and rates values shown on the plot. This is not the statistic of the STRAPP test itself, which is conducted across all BAMM posterior samples.
The quantile of null statistic distribution at the significant threshold used to define test significance. The test will be considered significant (i.e., the null hypothesis is rejected) if this value is higher than zero.
The p-value of the associated STRAPP test.
Optional summary data.frame:
mean_rates_vs_trait_data_df A data.frame with three columns providing the $mean_rates and $trait_value
observed along branches at focal_time. Rates are averaged across all BAMM posterior samples.
This is the raw data used to draw the plot. Included if return_mean_rates_vs_trait_data_df = TRUE.
If a PDF_file_path is provided, the function will also generate a PDF file of the plot.
Maël Doré
Associated functions in deepSTRAPP: run_deepSTRAPP_for_focal_time() plot_rates_vs_trait_data_over_time()
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cont_old_calib_10My <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, # Need to be set to TRUE to save trait data return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_10My, max.level = 1) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_for_focal_time() ----- # # Get plot rates_vs_trait_output <- plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepPonerinae_deepSTRAPP_cont_old_calib_10My, color_scale = c("grey80", "orange"), display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_10My.pdf" return_mean_rates_vs_trait_data_df = TRUE) # Adjust aesthetics a posteriori rates_vs_trait_ggplot_adj <- rates_vs_trait_output$rates_vs_trait_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(rates_vs_trait_ggplot_adj) # Explore melted data.frame of mean rates and trait values extracted for the given focal time. head(rates_vs_trait_output$mean_rates_vs_trait_data_df) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_over_time() ----- # ## Load directly outputs from run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Select focal_time = 10My focal_time <- 10 # Get plot rates_vs_trait_output <- plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = focal_time, color_scale = c("grey80", "purple"), display_plot = TRUE) # PDF_file_path = "./plot_rates_vs_trait_10My.pdf" # Adjust aesthetics a posteriori rates_vs_trait_ggplot_adj <- rates_vs_trait_output$rates_vs_trait_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(rates_vs_trait_ggplot_adj) # ----- Example 2: Categorical trait ----- # ## Load data # Load phylogeny data(Ponerinae_tree, package = "deepSTRAPP") # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_states = colors_per_states, evolutionary_models = "ARD", # Use default ARD model nb_simulations = 100, # Reduce number of simulations to save time seed = 1234, # Seet seed for reproducibility return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, # Need to be set to TRUE to save trait data return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, max.level = 1) ## Plot rates vs. states rates_vs_trait_output <- plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, focal_time = 10, select_trait_levels = c("arboreal", "terricolous"), # Select only two levels colors_per_levels = colors_per_states[c("arboreal", "terricolous")], # Adjust colors display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_10My.pdf", return_mean_rates_vs_trait_data_df = TRUE) # Adjust aesthetics a posteriori rates_vs_trait_ggplot_adj <- rates_vs_trait_output$rates_vs_trait_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(rates_vs_trait_ggplot_adj) # Explore melted data.frame of mean rates and states extracted for the given focal time. head(rates_vs_trait_output$mean_rates_vs_trait_data_df) }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cont_old_calib_10My <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, # Need to be set to TRUE to save trait data return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_10My, max.level = 1) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_for_focal_time() ----- # # Get plot rates_vs_trait_output <- plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepPonerinae_deepSTRAPP_cont_old_calib_10My, color_scale = c("grey80", "orange"), display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_10My.pdf" return_mean_rates_vs_trait_data_df = TRUE) # Adjust aesthetics a posteriori rates_vs_trait_ggplot_adj <- rates_vs_trait_output$rates_vs_trait_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(rates_vs_trait_ggplot_adj) # Explore melted data.frame of mean rates and trait values extracted for the given focal time. head(rates_vs_trait_output$mean_rates_vs_trait_data_df) # ----- Plot histogram of STRAPP overall test results from run_deepSTRAPP_over_time() ----- # ## Load directly outputs from run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Select focal_time = 10My focal_time <- 10 # Get plot rates_vs_trait_output <- plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = focal_time, color_scale = c("grey80", "purple"), display_plot = TRUE) # PDF_file_path = "./plot_rates_vs_trait_10My.pdf" # Adjust aesthetics a posteriori rates_vs_trait_ggplot_adj <- rates_vs_trait_output$rates_vs_trait_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(rates_vs_trait_ggplot_adj) # ----- Example 2: Categorical trait ----- # ## Load data # Load phylogeny data(Ponerinae_tree, package = "deepSTRAPP") # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_states = colors_per_states, evolutionary_models = "ARD", # Use default ARD model nb_simulations = 100, # Reduce number of simulations to save time seed = 1234, # Seet seed for reproducibility return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, # Need to be set to TRUE to save trait data return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, max.level = 1) ## Plot rates vs. states rates_vs_trait_output <- plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_10My, focal_time = 10, select_trait_levels = c("arboreal", "terricolous"), # Select only two levels colors_per_levels = colors_per_states[c("arboreal", "terricolous")], # Adjust colors display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_10My.pdf", return_mean_rates_vs_trait_data_df = TRUE) # Adjust aesthetics a posteriori rates_vs_trait_ggplot_adj <- rates_vs_trait_output$rates_vs_trait_ggplot + ggplot2::theme(plot.title = ggplot2::element_text(color = "red", size = 15)) print(rates_vs_trait_ggplot_adj) # Explore melted data.frame of mean rates and states extracted for the given focal time. head(rates_vs_trait_output$mean_rates_vs_trait_data_df) }
Plot rates vs. trait data as extracted for all trait_data_dftime-steps.
Data are extracted from the output of a deepSTRAPP run carried out with
run_deepSTRAPP_over_time()) over multiple time-steps.
For each time-step, returns a plot showing rates vs. trait data. If the trait data are 'continuous', the plot is a scatter plot. If the trait data are 'categorical' or 'biogeographic', the plot is a boxplot.
If a PDF file path is provided in PDF_file_path, all plots will be saved directly in a unique PDF file,
with one page per plot/time-step.
plot_rates_vs_trait_data_over_time( deepSTRAPP_outputs, rate_type = "net_diversification", select_trait_levels = "all", color_scale = NULL, colors_per_levels = NULL, display_plot = TRUE, PDF_file_path = NULL, return_mean_rates_vs_trait_data_df = FALSE )plot_rates_vs_trait_data_over_time( deepSTRAPP_outputs, rate_type = "net_diversification", select_trait_levels = "all", color_scale = NULL, colors_per_levels = NULL, display_plot = TRUE, PDF_file_path = NULL, return_mean_rates_vs_trait_data_df = FALSE )
deepSTRAPP_outputs |
List of elements generated with |
rate_type |
A character string specifying the type of diversification rates to plot.
Must be one of 'speciation', 'extinction' or 'net_diversification' (default).
Even if the |
select_trait_levels |
(Vector of) character string. Only for categorical and biogeographic trait data.
To provide a list of a subset of states/ranges to plot. Names must match the ones found in the |
color_scale |
Vector of character string. List of colors to use to build the color scale with |
colors_per_levels |
Named character string. To set the colors to use to plot data points and box for each state/range. Names = states/ranges; values = colors.
If |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
return_mean_rates_vs_trait_data_df |
Logical. Whether to include in the output the data.frame of mean rates per trait values/states/ranges computed for
each posterior sample over all time-steps. Default is |
The main input deepSTRAPP_outputs is the typical output of run_deepSTRAPP_over_time().
It provides information on results of a STRAPP test performed over multiple time-steps.
Plots are built based on both trait data and diversification data as extracted for the time-steps.
Such data are recorded in the outputs of a deepSTRAPP run carried out with run_deepSTRAPP_over_time().
return_updated_trait_data_with_Map must be set to TRUE so that the trait data used to compute the tests are returned among the outputs
under $updated_trait_data_with_Map_over_time. Alternatively, and more efficiently, extract_trait_data_melted_df can be set to TRUE
so that trait data are already returned in a melted data.frame among the outputs under $trait_data_df_over_time.
extract_diversification_data_melted_df must be set to TRUE so that the diversification rates are returned
among the outputs under $diversification_data_df_over_time.
For plotting a single focal_time, see plot_rates_vs_trait_data_for_focal_time().
The function returns a list with at least one element.
rates_vs_trait_ggplots A list of objects of classes gg and ggplot ordered as in $time_steps.
Each element corresponds to a ggplot for a given focal_time. They can be displayed on the console with print(output$rates_vs_trait_ggplots[[i]]).
They correspond to the plots being displayed on the console one by one when the function is run, if display_plot = TRUE,
and can be further modify for aesthetics using the ggplot2 grammar.
If the trait data are 'continuous', the plots are scatter plots showing how diversification rates varies with trait values. If the trait data are 'categorical' or 'biogeographic', the plots are boxplots showing diversification rates per states/ranges.
Each plot also displays summary statistics for the STRAPP test associated with the data displayed:
An observed statistic computed across the mean traits/ranges and rates values shown on the plot. This is not the statistic of the STRAPP test itself, which is conducted across all BAMM posterior samples.
The quantile of null statistic distribution at the significant threshold used to define test significance. The test will be considered significant (i.e., the null hypothesis is rejected) if this value is higher than zero.
The p-value of the associated STRAPP test.
Optional summary data.frame:
mean_rates_vs_trait_data_df A data.frame with three columns providing the $mean_rates and $trait_value
observed along branches at the different focal_time. Rates are averaged across all BAMM posterior samples.
This is the raw data used to draw each plot for each focal_time. Included if return_mean_rates_vs_trait_data_df = TRUE.
If a PDF_file_path is provided, the function will also generate a unique PDF file with one plot/page per $time_steps.
Maël Doré
Associated functions in deepSTRAPP: run_deepSTRAPP_over_time() plot_rates_vs_trait_data_for_focal_time()
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, # Need to be set to TRUE to save trait data extract_trait_data_melted_df = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_0_40, max.level = 1) ## Plot for all time-steps rates_vs_trait_outputs <- plot_rates_vs_trait_data_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, color_scale = c("grey80", "purple"), # Adjust color scale display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_0_40My.pdf", return_mean_rates_vs_trait_data_df = TRUE) ## Print plot for time step 3 = 10 My print(rates_vs_trait_outputs$rates_vs_trait_ggplots[[3]]) ## Explore melted data.frame of rates and trait data head(rates_vs_trait_outputs$mean_rates_vs_trait_data_df) # ----- Example 2: Categorical data ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", nb_simulations = 100, return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly trait data output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates across time-steps. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, rate_type = "net_diversification", seed = 1234, # Set for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, # Need to be set to TRUE to save trait data extract_trait_data_melted_df = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly deepSTRAPP output data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, max.level = 1) # Adjust color scheme colors_per_states <- c("orange", "dodgerblue", "red") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") ## Plot for all time-steps rates_vs_trait_outputs <- plot_rates_vs_trait_data_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, colors_per_levels = colors_per_states, # Adjust color scheme display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_0_40My.pdf", return_mean_rates_vs_trait_data_df = TRUE) ## Print plot for time step 3 = 10 My print(rates_vs_trait_outputs$rates_vs_trait_ggplots[[3]]) ## Explore melted data.frame of rates and states head(rates_vs_trait_outputs$mean_rates_vs_trait_data_df) }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # # Load fake trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, # Need to be set to TRUE to save trait data extract_trait_data_melted_df = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_0_40, max.level = 1) ## Plot for all time-steps rates_vs_trait_outputs <- plot_rates_vs_trait_data_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, color_scale = c("grey80", "purple"), # Adjust color scale display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_0_40My.pdf", return_mean_rates_vs_trait_data_df = TRUE) ## Print plot for time step 3 = 10 My print(rates_vs_trait_outputs$rates_vs_trait_ggplots[[3]]) ## Explore melted data.frame of rates and trait data head(rates_vs_trait_outputs$mean_rates_vs_trait_data_df) # ----- Example 2: Categorical data ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", nb_simulations = 100, return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly trait data output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates across time-steps. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, rate_type = "net_diversification", seed = 1234, # Set for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, # Need to be set to TRUE to save trait data extract_trait_data_melted_df = TRUE, # Need to be set to TRUE to save diversification data extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly deepSTRAPP output data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore output str(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, max.level = 1) # Adjust color scheme colors_per_states <- c("orange", "dodgerblue", "red") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") ## Plot for all time-steps rates_vs_trait_outputs <- plot_rates_vs_trait_data_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, colors_per_levels = colors_per_states, # Adjust color scheme display_plot = TRUE, # PDF_file_path = "./plot_rates_vs_trait_0_40My.pdf", return_mean_rates_vs_trait_data_df = TRUE) ## Print plot for time step 3 = 10 My print(rates_vs_trait_outputs$rates_vs_trait_ggplots[[3]]) ## Explore melted data.frame of rates and states head(rates_vs_trait_outputs$mean_rates_vs_trait_data_df) }
Plot the evolution of the p-values of STRAPP tests
carried out for across multiple time_steps, obtained from
run_deepSTRAPP_over_time().
By default, return a plot with a single line for p-values of overall tests.
If plot_posthoc_tests = TRUE, it will return a plot with multiple lines, one per pair in post hoc tests
(only for multinominal data, with more than two states).
If a PDF file path is provided in PDF_file_path, the plot will be saved directly in a PDF file.
plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs, time_range = NULL, pvalues_max = NULL, alpha = 0.05, display_plot = TRUE, plot_significant_time_frame = TRUE, plot_posthoc_tests = FALSE, select_posthoc_pairs = "all", plot_adjusted_pvalues = FALSE, PDF_file_path = NULL )plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs, time_range = NULL, pvalues_max = NULL, alpha = 0.05, display_plot = TRUE, plot_significant_time_frame = TRUE, plot_posthoc_tests = FALSE, select_posthoc_pairs = "all", plot_adjusted_pvalues = FALSE, PDF_file_path = NULL )
deepSTRAPP_outputs |
List of elements generated with |
time_range |
Vector of two numerical values. Time boundaries used for X-axis the plot.
If |
pvalues_max |
Numerical. Set the max boundary used for the Y-axis of the plot.
If |
alpha |
Numerical. Significance level to display as a red dashed line on the plot. If set to |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
plot_significant_time_frame |
Logical. Whether to display a green band over the time frame that yields significant results according to the chosen alpha level. Default is |
plot_posthoc_tests |
Logical. For multinominal data only. Whether to plot the p-values for the overall Kruskal-Wallis test across all states ( |
select_posthoc_pairs |
Vector of character strings used to specify the pairs to include in the plot. Names of pairs must match the pairs found in
|
plot_adjusted_pvalues |
Logical. Whether to display the p-values adjusted for multiple testing rather than the raw p-values.
See argument 'p.adjust_method' in |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
Plots are build based on the p-values recorded in summary_df provided by run_deepSTRAPP_over_time().
For overall tests, those p-values are found in $pvalues_summary_df.
For multinominal data (categorical or biogeographic data with more than 2 states), it is possible to plot p-values of post hoc pairwise tests.
Set plot_posthoc_tests = TRUE to generate plots for the pairwise post hoc Dunn's test across pairs of states.
To achieve this, the deepSTRAPP_outputs input object must contain a $pvalues_summary_df_for_posthoc_pairwise_tests element that summarizes p-values
computed across pairs of states for all post hoc tests. This is obtained from run_deepSTRAPP_over_time() when setting
posthoc_pairwise_tests = TRUE to carry out post hoc tests.
The function returns a list of classes gg and ggplot.
This object is a ggplot that can be displayed on the console with print(output).
It corresponds to the plot being displayed on the console when the function is run, if display_plot = TRUE,
and can be further modify for aesthetics using the ggplot2 grammar.
If a PDF_file_path is provided, the function will also generate a PDF file of the plot.
Maël Doré
if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() for categorical data with 3-levels data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Plot results of overall Kruskal-Wallis / Mann-Whitney-Wilcoxon tests across all time-steps plot_overall <- plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, alpha = 0.10, time_range = c(0, 30), # Adjust time range if needed display_plot = FALSE) # Adjust aesthetics a posteriori plot_overall <- plot_overall + ggplot2::theme( plot.title = ggplot2::element_text(size = 16)) print(plot_overall) ## Plot results of post hoc pairwise Dunn's tests between selected pairs of states plot_posthoc <- plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, alpha = 0.10, plot_posthoc_tests = TRUE, # PDF_file_path = "./pvalues_over_time.pdf", select_posthoc_pairs = c("arboreal != subterranean", "arboreal != terricolous"), display_plot = FALSE) # Adjust aesthetics a posteriori plot_posthoc <- plot_posthoc + ggplot2::theme( plot.title = ggplot2::element_text(size = 16), legend.title = ggplot2::element_text(size = 14), legend.position.inside = c(0.25, 0.25)) print(plot_posthoc) }if (deepSTRAPP::is_dev_version()) { ## Load results of run_deepSTRAPP_over_time() for categorical data with 3-levels data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Plot results of overall Kruskal-Wallis / Mann-Whitney-Wilcoxon tests across all time-steps plot_overall <- plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, alpha = 0.10, time_range = c(0, 30), # Adjust time range if needed display_plot = FALSE) # Adjust aesthetics a posteriori plot_overall <- plot_overall + ggplot2::theme( plot.title = ggplot2::element_text(size = 16)) print(plot_overall) ## Plot results of post hoc pairwise Dunn's tests between selected pairs of states plot_posthoc <- plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, alpha = 0.10, plot_posthoc_tests = TRUE, # PDF_file_path = "./pvalues_over_time.pdf", select_posthoc_pairs = c("arboreal != subterranean", "arboreal != terricolous"), display_plot = FALSE) # Adjust aesthetics a posteriori plot_posthoc <- plot_posthoc + ggplot2::theme( plot.title = ggplot2::element_text(size = 16), legend.title = ggplot2::element_text(size = 14), legend.position.inside = c(0.25, 0.25)) print(plot_posthoc) }
Plot two mapped phylogenies with evolutionary data with branches cut off at focal_time.
Left facet: plot the evolution of trait data/geographic ranges on the left time-calibrated phylogeny.
For continuous data: Each branch is colored according to the estimates value of the traits.
For categorical and biogeographic data: Each branch is colored according to the posterior probability of being in a given state/range. Color for each state/range are overlaid using transparency to produce a single plot for all states/ranges.
Right facet: plot the evolution of diversification rates and location of regime shits estimated from a BAMM
(Bayesian Analysis of Macroevolutionary Mixtures).
Each branch is colored according to the estimated rates of speciation, extinction, or net diversification
stored in an object of class bammdata. Rates can vary along time, thus colors evolved along individual branches.
This function is a wrapper multiple plotting functions:
For continuous traits: plot_contMap()
For categorical and biogeographic data: plot_densityMaps_overlay()
For BAMM rates and regime shifts: plot_BAMM_rates()
plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs, focal_time = NULL, color_scale = NULL, colors_per_levels = NULL, add_ACE_pies = TRUE, cex_pies = 0.5, rate_type = "net_diversification", keep_initial_colorbreaks = FALSE, add_regime_shifts = TRUE, configuration_type = "MAP", sample_index = 1, adjust_size_to_prob = TRUE, regimes_fill = "grey", regimes_size = 1, regimes_pch = 21, regimes_border_col = "black", regimes_border_width = 1, ..., display_plot = TRUE, PDF_file_path = NULL )plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs, focal_time = NULL, color_scale = NULL, colors_per_levels = NULL, add_ACE_pies = TRUE, cex_pies = 0.5, rate_type = "net_diversification", keep_initial_colorbreaks = FALSE, add_regime_shifts = TRUE, configuration_type = "MAP", sample_index = 1, adjust_size_to_prob = TRUE, regimes_fill = "grey", regimes_size = 1, regimes_pch = 21, regimes_border_col = "black", regimes_border_width = 1, ..., display_plot = TRUE, PDF_file_path = NULL )
deepSTRAPP_outputs |
List of elements generated with |
focal_time |
Numerical. (Optional) If |
color_scale |
Vector of character string. List of colors to use to build the color scale with |
colors_per_levels |
Named character string. To set the colors to use to map each state/range posterior probabilities. Names = states/ranges; values = colors.
If |
add_ACE_pies |
Logical. Whether to add pies of posterior probabilities of states/ranges at internal nodes on the mapped phylogeny. Default = |
cex_pies |
Numerical. To adjust the size of the ACE pies. Default = |
rate_type |
A character string specifying the type of diversification rates to plot. Must be one of 'speciation', 'extinction' or 'net_diversification' (default). |
keep_initial_colorbreaks |
Logical. Whether to keep the same color breaks as used for the most recent focal time. Typically, the current time (t = 0).
This will only works if you provide the output of |
add_regime_shifts |
Logical. Whether to add the location of regime shifts on the phylogeny (Step 2). Default is |
configuration_type |
A character string specifying how to select the location of regime shifts across posterior samples.
|
sample_index |
Integer. Index of the posterior samples to use to plot the location of regime shifts.
Used only if |
adjust_size_to_prob |
Logical. Whether to scale the size of the symbols showing the location of regime shifts according to
the marginal shift probability of the shift happening on each location/branch. This will only works if there is an |
regimes_fill |
Character string. Set the color of the background of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_size |
Numerical. Set the size of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_pch |
Integer. Set the shape of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_border_col |
Character string. Set the color of the border of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_border_width |
Numerical. Set the width of the border of the symbols showing the location of regime shifts.
Equivalent to the |
... |
Additional graphical arguments to pass down to |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
The main input deepSTRAPP_outputs is the typical output of run_deepSTRAPP_for_focal_time().
It provides information on results of a STRAPP test performed at a given focal_time, and can also encompass
updated phylogenies with mapped trait evolution and diversification rates and regimes shifts if appropriate arguments are set.
return_updated_trait_data_with_Map must be set to TRUE so that the trait data extracted for the given focal_time
and the updated version of mapped phylogeny (contMap/densityMaps) are returned among the outputs under $updated_trait_data_with_Map.
The updated contMap/densityMaps consists in cutting off branches and mappings that are younger than the focal_time.
return_updated_BAMM_object must be set to TRUE so that the updated_BAMM_object with phylogeny and mapped diversification rates
cut-off at the focal_time are returned among the outputs under $updated_BAMM_object.
$MAP_BAMM_object and $MSC_BAMM_object elements are required in $updated_BAMM_object to plot regime shift locations following the
"MAP" or "MSC" configuration_type respectively.
A $MSP_tree element is required to scale the size of the symbols showing the location of regime shifts according marginal shift probabilities.
(If adjust_size_to_prob = TRUE).
Alternatively, the main input deepSTRAPP_outputs can be the output of run_deepSTRAPP_over_time(),
providing results of STRAPP tests over multiple time-steps. In this case, you must provide a focal_time to select the
unique time-step used for plotting.
return_updated_trait_data_with_Map must be set to TRUE so that the trait data extracted and
the updated version of mapped phylogenies (contMap/densityMaps) are returned among the outputs under $updated_trait_data_with_Map_over_time.
return_updated_BAMM_object must be set to TRUE so that the BAMM_objects with phylogeny and mapped diversification rates
cut-off at the specified time-steps are returned among the outputs under $updated_BAMM_objects_over_time.
For plotting all time-steps at once, see plot_traits_vs_rates_on_phylogeny_over_time().
If display_plot = TRUE, the function displays a plot with two facets in the R console:
(Left) A time-calibrated phylogeny displaying the evolution of trait/biogeographic data.
(Right) A time-calibrated phylogeny displaying diversification rates and regime shifts.
If PDF_file_path is provided, the plot will be exported in a PDF file.
Maël Doré
phytools::plot.densityMap() plot_densityMaps_overlay() plot_BAMM_rates()
Functions in deepSTRAPP needed to produce the deepSTRAPP_outputs as input: run_deepSTRAPP_for_focal_time() run_deepSTRAPP_over_time()
Function in deepSTRAPP to plot all time-steps at once: plot_traits_vs_rates_on_phylogeny_over_time()
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cont_old_calib_10My <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_10My, max.level = 1) ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_10My, keep_initial_colorbreaks = TRUE, # To use the same color breaks as for t = 0 My color_scale = c("limegreen", "orange", "red"), # Adjust color scale on contMap legend = TRUE, labels = TRUE, # Show legend and label on BAMM plot cex = 0.7) # Adjust label size on contMap # PDF_file_path = "Updated_maps_cont_10My.pdf") # ----- Example 2: Biogeographic data ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_levels <- c("mediumpurple2", "peachpuff2") names(colors_per_levels) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to display the two outputs nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_levels, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Explore output str(Ponerinae_biogeo_data_old_calib, 1) ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_biogeo_old_calib_10My <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_biogeo_old_calib_10My, max.level = 1) ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_10My, # Adjust colors on densityMaps colors_per_levels = c("N" = "dodgerblue2", "O" = "orange"), # Show legend and label on BAMM plot legend = TRUE, labels = TRUE, cex_pies = 0.2, # Adjust size of ACE pies on densityMaps cex = 0.7) # Adjust size of tip labels on BAMM plot # PDF_file_path = "Updated_maps_biogeo_old_calib_10My.pdf") # ----- Example 3: With outputs over_time ----- # ## Load directly outputs from run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, max.level = 1) ## Plot updated contMap vs. updated diversification rates for focal_time = 40My plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, focal_time = 40, # Select focal_time = 40My # Adjust colors on densityMaps colors_per_levels = c("N" = "dodgerblue2", "O" = "orange"), # Show legend and label on BAMM plot legend = TRUE, labels = TRUE, cex_pies = 0.2, # Adjust size of ACE pies on densityMaps cex = 0.7) # Adjust size of tip labels on BAMM plot # PDF_file_path = "Updated_maps_biogeo_old_calib_40My.pdf") }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_cont_old_calib_10My <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_10My, max.level = 1) ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_10My, keep_initial_colorbreaks = TRUE, # To use the same color breaks as for t = 0 My color_scale = c("limegreen", "orange", "red"), # Adjust color scale on contMap legend = TRUE, labels = TRUE, # Show legend and label on BAMM plot cex = 0.7) # Adjust label size on contMap # PDF_file_path = "Updated_maps_cont_10My.pdf") # ----- Example 2: Biogeographic data ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_levels <- c("mediumpurple2", "peachpuff2") names(colors_per_levels) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to display the two outputs nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_levels, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Explore output str(Ponerinae_biogeo_data_old_calib, 1) ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. Ponerinae_deepSTRAPP_biogeo_old_calib_10My <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(Ponerinae_deepSTRAPP_biogeo_old_calib_10My, max.level = 1) ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_10My, # Adjust colors on densityMaps colors_per_levels = c("N" = "dodgerblue2", "O" = "orange"), # Show legend and label on BAMM plot legend = TRUE, labels = TRUE, cex_pies = 0.2, # Adjust size of ACE pies on densityMaps cex = 0.7) # Adjust size of tip labels on BAMM plot # PDF_file_path = "Updated_maps_biogeo_old_calib_10My.pdf") # ----- Example 3: With outputs over_time ----- # ## Load directly outputs from run_deepSTRAPP_over_time() data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, max.level = 1) ## Plot updated contMap vs. updated diversification rates for focal_time = 40My plot_traits_vs_rates_on_phylogeny_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, focal_time = 40, # Select focal_time = 40My # Adjust colors on densityMaps colors_per_levels = c("N" = "dodgerblue2", "O" = "orange"), # Show legend and label on BAMM plot legend = TRUE, labels = TRUE, cex_pies = 0.2, # Adjust size of ACE pies on densityMaps cex = 0.7) # Adjust size of tip labels on BAMM plot # PDF_file_path = "Updated_maps_biogeo_old_calib_40My.pdf") }
Plot two mapped phylogenies with evolutionary data with branches cut off for each time-step.
Main input = output of a deepSTRAPP run over time using run_deepSTRAPP_over_time()).
Left facet: plot the evolution of trait data/geographic ranges on the left time-calibrated phylogeny.
For continuous data: Each branch is colored according to the estimates value of the traits.
For categorical and biogeographic data: Each branch is colored according to the posterior probability of being in a given state/range. Color for each state/range are overlaid using transparency to produce a single plot for all states/ranges.
Right facet: plot the evolution of diversification rates and location of regime shits estimated from a BAMM
(Bayesian Analysis of Macroevolutionary Mixtures).
Each branch is colored according to the estimated rates of speciation, extinction, or net diversification
stored in an object of class bammdata. Rates can vary along time, thus colors evolved along individual branches.
This function is a wrapper of run_deepSTRAPP_for_focal_time() used to plot mapped phylogenies for
a unique given focal_time.
Returns one plot per focal time in $time_steps.
If a PDF file path is provided in PDF_file_path, the plots will be saved directly in a PDF file,
with one page per focal time in $time_steps.
plot_traits_vs_rates_on_phylogeny_over_time( deepSTRAPP_outputs, color_scale = NULL, colors_per_levels = NULL, add_ACE_pies = TRUE, cex_pies = 0.5, rate_type = "net_diversification", keep_initial_colorbreaks = FALSE, add_regime_shifts = TRUE, configuration_type = "MAP", sample_index = 1, adjust_size_to_prob = TRUE, regimes_fill = "grey", regimes_size = 1, regimes_pch = 21, regimes_border_col = "black", regimes_border_width = 1, ..., display_plot = TRUE, PDF_file_path = NULL )plot_traits_vs_rates_on_phylogeny_over_time( deepSTRAPP_outputs, color_scale = NULL, colors_per_levels = NULL, add_ACE_pies = TRUE, cex_pies = 0.5, rate_type = "net_diversification", keep_initial_colorbreaks = FALSE, add_regime_shifts = TRUE, configuration_type = "MAP", sample_index = 1, adjust_size_to_prob = TRUE, regimes_fill = "grey", regimes_size = 1, regimes_pch = 21, regimes_border_col = "black", regimes_border_width = 1, ..., display_plot = TRUE, PDF_file_path = NULL )
deepSTRAPP_outputs |
List of elements generated with |
color_scale |
Vector of character string. List of colors to use to build the color scale with |
colors_per_levels |
Named character string. To set the colors to use to map each state/range posterior probabilities. Names = states/ranges; values = colors.
If |
add_ACE_pies |
Logical. Whether to add pies of posterior probabilities of states/ranges at internal nodes on the mapped phylogeny. Default = |
cex_pies |
Numerical. To adjust the size of the ACE pies. Default = |
rate_type |
A character string specifying the type of diversification rates to plot. Must be one of 'speciation', 'extinction' or 'net_diversification' (default). |
keep_initial_colorbreaks |
Logical. Whether to keep the same color breaks as used for the most recent focal time. Typically, the current time (t = 0). Default = |
add_regime_shifts |
Logical. Whether to add the location of regime shifts on the phylogeny (Step 2). Default is |
configuration_type |
A character string specifying how to select the location of regime shifts across posterior samples.
|
sample_index |
Integer. Index of the posterior samples to use to plot the location of regime shifts.
Used only if |
adjust_size_to_prob |
Logical. Whether to scale the size of the symbols showing the location of regime shifts according to
the marginal shift probability of the shift happening on each location/branch. This will only works if there is an |
regimes_fill |
Character string. Set the color of the background of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_size |
Numerical. Set the size of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_pch |
Integer. Set the shape of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_border_col |
Character string. Set the color of the border of the symbols showing the location of regime shifts.
Equivalent to the |
regimes_border_width |
Numerical. Set the width of the border of the symbols showing the location of regime shifts.
Equivalent to the |
... |
Additional graphical arguments to pass down to |
display_plot |
Logical. Whether to display the plot generated in the R console. Default is |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
The main input deepSTRAPP_outputs is the typical output of run_deepSTRAPP_over_time().
It provides information on results of a STRAPP run performed over multiple time-steps, and can also encompass
updated phylogenies with mapped trait evolution and diversification rates and regimes shifts if appropriate arguments are set.
return_updated_trait_data_with_Map must be set to TRUE so that the trait data extracted for the given $time_steps
and the list of updated versions of mapped phylogenies (contMap/densityMaps) is returned among the outputs under $updated_trait_data_with_Map_over_time.
The updated contMap/densityMaps consist in cutting off branches and mappings that are younger than the successive $time_steps.
return_updated_BAMM_object must be set to TRUE so that the list of updated_BAMM_object with phylogenies and mapped diversification rates
cut-off at the successive $time_steps are returned among the outputs under $updated_BAMM_objects_over_time.
$MAP_BAMM_object and $MSC_BAMM_object elements are required in the elements of $updated_BAMM_objects_over_time to plot regime shift locations
following the "MAP" or "MSC" configuration_type respectively.
A $MSP_tree element is required to scale the size of the symbols showing the location of regime shifts according marginal shift probabilities.
(If adjust_size_to_prob = TRUE).
Alternatively, the main input deepSTRAPP_outputs can be the output of run_deepSTRAPP_over_time(),
providing results of STRAPP tests over multiple time-steps. In this case, you must provide a focal_time to select the
unique time-step used for plotting.
return_updated_trait_data_with_Map must be set to TRUE so that the trait data extracted and
the updated version of mapped phylogenies (contMap/densityMaps) are returned among the outputs under $updated_trait_data_with_Map_over_time.
return_updated_BAMM_object must be set to TRUE so that the BAMM_objects with phylogeny and mapped diversification rates
cut-off at the specified time-steps are returned among the outputs under $updated_BAMM_objects_over_time.
For plotting a single time-step (i.e., $focal_time) at once, see plot_traits_vs_rates_on_phylogeny_for_focal_time().
If display_plot = TRUE, the function displays the successive plots with two facets in the R console:
(Left) A time-calibrated phylogeny displaying the evolution of trait/biogeographic data.
(Right) A time-calibrated phylogeny displaying diversification rates and regime shifts. A plot per time-steps is generated and displayed successively in the console.
If PDF_file_path is provided, the plots will be exported in a unique PDF file with one page per time-step.
Maël Doré
plot_contMap() phytools::plot.densityMap() plot_densityMaps_overlay() plot_BAMM_rates()
Function in deepSTRAPP needed to produce the deepSTRAPP_outputs as input: run_deepSTRAPP_over_time()
Function in deepSTRAPP to a single time-step at once: plot_traits_vs_rates_on_phylogeny_for_focal_time()
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, # To use the same color breaks as for t = 0 My in all BAMM plots keep_initial_colorbreaks = TRUE, color_scale = c("limegreen", "orange", "red"), # Adjust color scale on contMap legend = TRUE, labels = TRUE, # Show legend and label on BAMM plot cex = 0.7) # Adjust label size on contMap # PDF_file_path = "Updated_maps_cont_old_calib_0_40My.pdf") # ----- Example 2: Biogeographic data ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_levels <- c("mediumpurple2", "peachpuff2") names(colors_per_levels) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to use only unique areas "A" and "B" nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_levels, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows from 0 to 40 Mya. time_range <- c(0, 40) time_step_duration <- 5 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for time-steps = 0 to 40 Mya. Ponerinae_deepSTRAPP_biogeo_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_ON_tip_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, time_range = time_range, time_step_duration = time_step_duration, seed = 1234, # Set seed for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example rate_type = "net_diversification", return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly output data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, max.level = 1) ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, # Adjust colors on densityMaps colors_per_levels = c("N" = "dodgerblue2", "O" = "orange"), legend = TRUE, labels = TRUE, # Show legend and label on BAMM plot cex_pies = 0.2, # Adjust size of ACE pies on densityMaps cex = 0.7) # Adjust label size on contMap # PDF_file_path = "Updated_maps_biogeo_old_calib_0_40My.pdf") }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, # To use the same color breaks as for t = 0 My in all BAMM plots keep_initial_colorbreaks = TRUE, color_scale = c("limegreen", "orange", "red"), # Adjust color scale on contMap legend = TRUE, labels = TRUE, # Show legend and label on BAMM plot cex = 0.7) # Adjust label size on contMap # PDF_file_path = "Updated_maps_cont_old_calib_0_40My.pdf") # ----- Example 2: Biogeographic data ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_levels <- c("mediumpurple2", "peachpuff2") names(colors_per_levels) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to use only unique areas "A" and "B" nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_levels, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows from 0 to 40 Mya. time_range <- c(0, 40) time_step_duration <- 5 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for time-steps = 0 to 40 Mya. Ponerinae_deepSTRAPP_biogeo_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_ON_tip_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, time_range = time_range, time_step_duration = time_step_duration, seed = 1234, # Set seed for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example rate_type = "net_diversification", return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly output data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, max.level = 1) ## Plot updated contMap vs. updated diversification rates plot_traits_vs_rates_on_phylogeny_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, # Adjust colors on densityMaps colors_per_levels = c("N" = "dodgerblue2", "O" = "orange"), legend = TRUE, labels = TRUE, # Show legend and label on BAMM plot cex_pies = 0.2, # Adjust size of ACE pies on densityMaps cex = 0.7) # Adjust label size on contMap # PDF_file_path = "Updated_maps_biogeo_old_calib_0_40My.pdf") }
A data.frame of range location for the 1534 extant ponerine ant taxa (Ponerinae subfamily).
Source: Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
data(Ponerinae_binary_range_table)data(Ponerinae_binary_range_table)
A data.frame with 1534 rows and 10 columns.
A data.frame of range locations covering the 1534 extant ponerine ant taxa (Ponerinae subfamily).
$Taxa Character string. Names of ponerinae ant taxa.
$Afrotropics Logical. Whether the range of the taxa extends to Afrotropics.
$Australasia Logical. Whether the range of the taxa extends to Australasia.
$Indomalaya Logical. Whether the range of the taxa extends to Indomalaya.
$Nearctic Logical. Whether the range of the taxa extends to Nearctic.
$Neotropics Logical. Whether the range of the taxa extends to Neotropics.
$Eastern_Palearctic Logical. Whether the range of the taxa extends to Eastern Palearctic.
$Western_Palearctic Logical. Whether the range of the taxa extends to Western Palearctic.
$Old_World Logical. Whether the range of the taxa extends to the Old World: encompassing any bioregion among
Afrotropics, Australasia, Indomalaya, Eastern Palearctic, or Western Palearctic.
$New_World Logical. Whether the range of the taxa extends to the New World: encompassing any bioregion among
Nearctic, or Neotropics.
Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
A list containing geographic ranges data of Ponerinae mapped on the old ill-calibrated phylogeny,
modeled with R package BioGeoBEARS. Ranges are labeled between "Old World" (O) and New World (N).
This object was obtained with prepare_trait_data().
The phylogeny used is also NOT a properly time-calibrated phylogeny. It uses an ill-designed old calibration for illustrative purposes.
Source: Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
data(Ponerinae_biogeo_data_old_calib)data(Ponerinae_biogeo_data_old_calib)
A list with 6 elements.
A list of five objects containing information on the evolution of feeding habits in eels.
This object was obtained with prepare_trait_data().
$densityMaps List of objects of class "densityMap that contains a phylogenetic tree and associated mapping of probability
to harbor a given range along branches. The list contains only a "densityMap per unique areas because split_multi_area_ranges was set to TRUE.
In this case, unique areas are "N" (= "New World") and "O" (= "Old World)
$densityMaps_all_ranges List of objects of class "densityMap that contains a phylogenetic tree and associated mapping of probability
to harbor a given range along branches. The list contains one "densityMap per range found along branches during the simulated biogeographic histories.
Here those ranges are "N" (= "New World"), "O" (= "Old World), and "NO" for multi-area ranges encompassing both regions.
$trait_data_type Character string. Record the type of trait data. Here: "biogeographic".
$ace Numerical matrix that record the posterior probabilities of ancestral ranges estimated at internal nodes.
Only unique areas (i.e., "N" and "O") are considered among the ranges. Multi-area ranges (i.e., "NO") have been split among unique ranges.
Rows are internal nodes. Columns are ranges. Values are posterior probabilities of each range per node.
$ace_all_ranges Numerical matrix that record the posterior probabilities of ancestral ranges estimated at internal nodes.
All ranges observed along branches during the simulated biogeographic histories are present (i.e., "N", "O", and "NO").
Rows are internal nodes. Columns are ranges. Values are posterior probabilities of each range per node.
$model_selection_df Data.frame that summarizes model comparisons used to select the best fitting model.
Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
A list containing fake size data of Ponerinae ants mapped on the phylogeny,
modeled with geiger::fitDiscrete. This object was obtained with prepare_trait_data().
This is NOT real biological/ecological data. They were designed for illustrative purposes only.
The phylogeny used is also NOT a properly time-calibrated phylogeny. It uses an ill-designed old calibration for illustrative purposes.
data(Ponerinae_cat_2lvl_data_old_calib)data(Ponerinae_cat_2lvl_data_old_calib)
A list with 5 elements.
A list of five objects containing information on the evolution of fake size data in Ponerinae ants.
This object was obtained with prepare_trait_data().
$densityMaps List of two objects of class "densityMap that contains a phylogenetic tree and associated mapping of probability
to harbor a given state along branches. The list contains one "densityMap per state found in the tip_data (i.e., "large" and "small").
$trait_data_type Character string. Record the type of trait data. Here: "categorical".
$ace Numerical matrix that record the posterior probabilities of ancestral states (characters) estimates (ACE) at internal nodes.
Rows are internal nodes. Columns are states (i.e., "large" and "small"). Values are posterior probabilities of each state per node.
$best_model_fit List that provides the output of the best fitting model (Here: ARD model).
$model_selection_df Data.frame that summarizes model comparisons used to select the best fitting model.
A list containing fake habitat data of Ponerinae ants mapped on the phylogeny,
modeled with geiger::fitDiscrete. This object was obtained with prepare_trait_data().
This is NOT real biological/ecological data. They were designed for illustrative purposes only.
The phylogeny used is also NOT a properly time-calibrated phylogeny. It uses an ill-designed old calibration for illustrative purposes.
data(Ponerinae_cat_3lvl_data_old_calib)data(Ponerinae_cat_3lvl_data_old_calib)
A list with 5 elements.
A list of five objects containing information on the evolution of fake habitat data in Ponerinae ants.
This object was obtained with prepare_trait_data().
$densityMaps List of three objects of class "densityMap that contains a phylogenetic tree and associated mapping of probability
to harbor a given state along branches. The list contains one "densityMap per state found in the tip_data (i.e., "arboreal", "subterranean", and "terricolous").
$trait_data_type Character string. Record the type of trait data. Here: "categorical".
$ace Numerical matrix that record the posterior probabilities of ancestral states (characters) estimates (ACE) at internal nodes.
Rows are internal nodes. Columns are states (i.e., "large" and "small"). Values are posterior probabilities of each state per node.
$best_model_fit List that provides the output of the best fitting model (Here: ARD model).
$model_selection_df Data.frame that summarizes model comparisons used to select the best fitting model.
A list containing estimated values for a fake continuous trait mapped on the Ponerinae ant phylogeny,
modeled with geiger::fitContinuous. This object was obtained with extract_most_likely_trait_values_for_focal_time()
applied on trait evolution data obtained with prepare_trait_data().
The phylogeny used is NOT a properly time-calibrated phylogeny. It uses an ill-designed old calibration for illustrative purposes.
data(Ponerinae_trait_cont_tip_data_10My)data(Ponerinae_trait_cont_tip_data_10My)
A list with 4 elements.
A list of four elements containing information on the evolution of a continuous trait in Ponerinae ants extracted for 10 Mya.
$trait_data Named vector of numerical values. Names are the taxa or internal tipward node ID associated with the values.
Values are the continuous trait data estimated along branches for 10 Mya.
$focal_time Numerical. Time in the past at which the trait data where extracted.
$trait_data_type Character string. Record the type of trait data. Here: "continuous".
$contMap A phylogenetic tree and associated mapping of estimated trait values. It was updated such as the tips correspond to
lineages found 10 Mya (i.e., at the focal time in the past).
A data.frame of fake trait data covering the 1534 extant ponerine ant taxa (Ponerinae subfamily). This is NOT real biological/ecological data. They were designed for illustrative purposes only.
data(Ponerinae_trait_tip_data)data(Ponerinae_trait_tip_data)
A data.frame with 1534 rows and 4 columns.
A data.frame of fake trait data covering the 1534 extant ponerine ant taxa (Ponerinae subfamily).
$Taxa Character string. Names of ponerinae ant taxa.
$fake_cont_tip_data Numeric. Fake continuous trait data.
$fake_cat_2lvl_tip_data Vector of character strings. Fake categorical size data with two levels: "large" and "small".
$fake_cat_3lvl_tip_data Vector of character strings. Fake categorical habitat data with three levels: "arboreal", "subterranean", and "terricolous".
A phylo object describing the time-calibrated phylogeny of the 1534 extant ponerine ants (Ponerinae subfamily).
Source: Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
data(Ponerinae_tree)data(Ponerinae_tree)
A phylo object with 4 elements.
A time-calibrated phylogeny as a phylo object with 4 elements.
$edge Matrix of integers. Defines the tree topology by providing rootward and tipward node ID of each edge.
$edge.length Vector of numerical. Length of edges/branches.
$Nnode Integer. Number of internal nodes.
$tip.label Vector of character strings. Labels of all tips.
Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
A phylo object describing the time-calibrated phylogeny of the 1534 extant ponerine ants (Ponerinae subfamily).
This is NOT a properly time-calibrated phylogeny. It uses an ill-designed old calibration for illustrative purposes.
For a proper time-calibrated phylogeny of ponerine ants, see the Ponerinae_tree object in deepSTRAPP.
Source: Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
data(Ponerinae_tree_old_calib)data(Ponerinae_tree_old_calib)
A phylo object with 4 elements.
A time-calibrated phylogeny as a phylo object with 4 elements.
$edge Matrix of integers. Defines the tree topology by providing rootward and tipward node ID of each edge.
$edge.length Vector of numerical. Length of edges/branches.
$Nnode Integer. Number of internal nodes.
$tip.label Vector of character strings. Labels of all tips.
Doré, M., Borowiec, M. L., Branstetter, M. G., Camacho, G. P., Fisher, B. L., Longino, J. T., Ward, P. S., Blaimer, B. B. (2025). Evolutionary history of ponerine ants highlights how the timing of dispersal events shapes modern biodiversity. Nature Communications, 16, 8297. doi:10.1038/s41467-025-63709-3
Run a full BAMM (Bayesian Analysis of Macroevolutionary Mixtures) workflow
to produce a BAMM_object that contains a phylogenetic tree and associated diversification rates
mapped along branches, across selected posterior samples:
Step 1: Set BAMM - Record BAMM settings and generate all input files needed for BAMM.
Step 2: Run BAMM - Run BAMM and move output files in dedicated directory.
Step 3: Evaluate BAMM - Produce evaluation plots and ESS data.
Step 4: Import BAMM outputs - Load BAMM_object in R and subset posterior samples.
Step 5: Clean BAMM files - Remove files generated during the BAMM run.
The BAMM_object output is typically used as input to run deepSTRAPP with run_deepSTRAPP_for_focal_time()
or run_deepSTRAPP_over_time(). Diversification rates and regimes shift can be visualized with plot_BAMM_rates().
BAMM is a model of diversification for time-calibrated phylogenies that explores complex diversification dynamics by allowing multiple regime shifts across clades without a priori hypotheses on the location of such shifts. It uses reversible jump Markov chain Monte Carlo (rjMCMC) to automatically explore a vast range of models with different speciation and extinction rates, and different number and location of regime shits.
This function will work only if you have the BAMM C++ program installed in your machine.
See the BAMM website: http://bamm-project.org/ and the companion R package {BAMMtools}.
prepare_diversification_data( BAMM_install_directory_path, phylo, prefix_for_files = NULL, seed = NULL, numberOfGenerations = 10^7, globalSamplingFraction = 1, sampleProbsFilename = NULL, expectedNumberOfShifts = NULL, eventDataWriteFreq = NULL, burn_in = 0.25, nb_posterior_samples = 1000, additional_BAMM_settings = list(), BAMM_output_directory_path = NULL, keep_BAMM_outputs = TRUE, MAP_odd_ratio_threshold = 5, skip_evaluations = FALSE, plot_evaluations = TRUE, save_evaluations = TRUE )prepare_diversification_data( BAMM_install_directory_path, phylo, prefix_for_files = NULL, seed = NULL, numberOfGenerations = 10^7, globalSamplingFraction = 1, sampleProbsFilename = NULL, expectedNumberOfShifts = NULL, eventDataWriteFreq = NULL, burn_in = 0.25, nb_posterior_samples = 1000, additional_BAMM_settings = list(), BAMM_output_directory_path = NULL, keep_BAMM_outputs = TRUE, MAP_odd_ratio_threshold = 5, skip_evaluations = FALSE, plot_evaluations = TRUE, save_evaluations = TRUE )
BAMM_install_directory_path |
Character string. The path to the directory where BAMM is. Use '/' to separate directory and sub-directories. The path must end with '/'. |
phylo |
Time-calibrated phylogeny. Object of class |
prefix_for_files |
Character string. Prefix to add to all BAMM files stored in the |
seed |
Integer. Set the seed to ensure reproducibility. Default is |
numberOfGenerations |
Integer. Number of steps in the MCMC run. It should be set high enough to reach the equilibrium distribution
and allows posterior samples to be uncorrelated. Check the Effective Sample Size of parameters with coda::effectiveSize() in the Evaluation step.
Default value is |
globalSamplingFraction |
Numerical. Global sampling fraction representing the overall proportion of terminals in the phylogeny compared to
the estimated overall richness in the clade. It acts as a multiplier on the rates needed to achieve such extant diversity.
Default is |
sampleProbsFilename |
Character string. The path to the |
expectedNumberOfShifts |
Integer. Set the expected number of regime shifts. It acts as an hyperparameter controlling the exponential prior distribution
used to modulate reversible jumps across model configurations in the rjMCMC run.
If set to |
eventDataWriteFreq |
Integer. Set the frequency in which to write the event data to the output file = the sampling frequency of posterior samples.
If set to |
burn_in |
Numerical. Proportion of posterior samples removed from the BAMM output to ensure that the remaining samples where drawn once the equilibrium distribution was reached.
This can be evaluated looking at the MCMC trace (see Evaluation step). Default is |
nb_posterior_samples |
Numerical. Number of posterior samples to extract, after removing the burn-in, in the final |
additional_BAMM_settings |
List of named elements. Additional settings options for BAMM provided as a list of named arguments.
Ex: |
BAMM_output_directory_path |
Character string. The path to the directory used to store input/output files generated. Use '/' to separate directory and subdirectories. It must end with '/'. |
keep_BAMM_outputs |
Logical. Whether the |
MAP_odd_ratio_threshold |
Numerical. Controls the definition of 'core-shifts' used to distinguish across configurations when fetching the MAP samples.
Shifts that have an odd-ratio of marginal posterior probability / prior lower than |
skip_evaluations |
Logical. Whether to skip the Evaluation step including MCMC trace, ESS, and prior/posterior comparisons for expected number of shifts. Default = |
plot_evaluations |
Logical. Whether to display the plots generated during the Evaluation step: MCMC trace, and prior/posterior comparisons for expected number of shifts. Default = |
save_evaluations |
Logical. Whether to save the outputs of evaluations in a table (ESS), and PDFs (MCMC trace, and prior/posterior comparisons for expected number of shifts)
in the |
This function runs a full BAMM (Bayesian Analysis of Macroevolutionary Mixtures) workflow
to produce a BAMM_object that contains a phylogenetic tree and associated diversification rates
mapped along branches, across selected posterior samples.
Step 1: Set BAMM
Produces a tree file for the phylogeny. Default file: 'phylogeny.tree'.
Save configuration settings used for the BAMM run. Default file: 'config_file.txt'.
Save default priors generated by BAMMtools::setBAMMpriors based on the phylogeny. Default file: 'priors.txt'.
Step 2: Run BAMM
Run BAMM using the system console
Move output files in dedicated BAMM_output_directory. Default directory is ./BAMM_outputs/.
'run_info.txt' containing a summary of your parameters/settings.
'mcmc_log.txt' containing raw MCMC information useful in diagnosing convergence.
'event_data.txt' containing all evolutionary rate parameters and their topological mappings.
'chain_swap.txt' containing data about each chain swap proposal (when a proposal occurred, which chains might be swapped, and whether the swap was accepted).
'acceptance_info.txt' containing the history of acceptance/proposal of MCMC steps (If additional setting outputAcceptanceInfo is set to 1).
Step 3: Evaluate BAMM
Plot the MCMC trace = evolution of logLik across MCMC generations. Output file = 'MCMC_trace_logLik.pdf'.
Compute the Effective Sample Size (ESS) across posterior samples (after removing burn-in) using coda::effectiveSize().
This is a way to evaluate if your MCMC runs has enough generations to produce robust estimates. Ideally, ESS should be higher than 200.
Output file = 'ESS_df.csv'.
Plot the comparison of prior and posterior distributions of the number of regime shifts with BAMMtools::plotPrior.
Output file = 'PP_nb_shifts_plot.pdf'.
A good value for expectedNumberOfShifts is one with high similarities between the distributions
hinting that the information in the data coincides with your expectations for the number of regime shifts.
The best practice consists in trying several values to control if it affects or not the final output.
Step 4: Import BAMM outputs
Load BAMM outputs with BAMMtools::getEventData.
Subset posterior samples to the requested nb_posterior_samples with BAMMtools::subsetEventData.
Record the $expectedNumberOfShifts used to set the prior. This is useful for downstream analyses involving comparison of prior vs. posterior probabilities
(See BAMMtools::distinctShiftConfigurations()).
Record the marginal posterior probability of regime shift along branches based on the proportion of samples harboring a regime shift along each branch.
(See BAMMtools::marginalShiftProbsTree()). Result is stored in $MSP_tree as phylogenetic tree with $edge.length scaled to the marginal posterior probability.
Extract the Maximum A Posteriori probability (MAP) configuration = the configuration of regime shift location found the most frequently among the posterior samples.
(See BAMMtools::getBestShiftConfiguration()). This ignores shifts that have an odd-ratio of marginal posterior probability / prior lower than MAP_odd_ratio_threshold
to avoid noise from non-core shifts. MAP sample indices are stored in $MAP_indices. Diversification rates and shift locations on branches are then averaged across all MAP samples and
recorded as an object of class "bammdata" in $MAP_BAMM_object with a single $eventData table used to plot regime shifts on the phylogeny with plot_BAMM_rates().
Extract the Maximum Shift Credibility (MSC) configuration = the configuration of regime shift location with the highest product of marginal probabilities across branches.
(See BAMMtools::maximumShiftCredibility()). MSC sample indices are stored in $MSC_indices. Diversification rates and shift locations on branches are then averaged across all MSC samples and
recorded as an object of class "bammdata" in $MSC_BAMM_object with a single $eventData table used to plot regime shifts on the phylogeny with plot_BAMM_rates().
Step 5: Clean BAMM files
Remove files generated in Steps 1 & 2 if keep_BAMM_outputs = FALSE.
Delete the BAMM_output_directory if empty after cleaning files.
The BAMM_object output:
is typically used as input to run deepSTRAPP with run_deepSTRAPP_for_focal_time() or run_deepSTRAPP_over_time().
can be used to extract rates and regimes for any focal_time in the past with update_rates_and_regimes_for_focal_time().
can be used to map diversification rates and regime shifts on the phylogeny with plot_BAMM_rates().
The function returns a BAMM_object of class "bammdata" which is a list with at least 22 elements.
Phylogeny-related elements used to plot a phylogeny with ape::plot.phylo():
$edge Matrix of integers. Defines the tree topology by providing rootward and tipward node ID of each edge.
$Nnode Integer. Number of internal nodes.
$tip.label Vector of character strings. Labels of all tips.
$edge.length Vector of numerical. Length of edges/branches.
$node.label Vector of character strings. Labels of all internal nodes. (Present only if present in the initial BAMM_object)
BAMM internal elements used for tree exploration:
$begin Vector of numerical. Absolute time since root of edge/branch start (rootward).
$end Vector of numerical. Absolute time since root of edge/branch end (tipward).
$downseq Vector of integers. Order of node visits when using a pre-order tree traversal.
$lastvisit ID of the last node visited when starting from the node in the corresponding position in $downseq.
BAMM elements summarizing diversification data:
$numberEvents Vector of integer. Number of events/macroevolutionary regimes (k+1) recorded in each posterior configuration. k = number of shifts.
$eventData List of data.frames. One per posterior sample. Records shift events and macroevolutionary regimes parameters. 1st line = Background root regime.
$eventVectors List of integer vectors. One per posterior sample. Record regime ID per branches.
$tipStates List of named integer vectors. One per posterior sample. Record regime ID per tips.
$tipLambda List of named numerical vectors. One per posterior sample. Record speciation rates per tips.
$tipMu List of named numerical vectors. One per posterior sample. Record extinction rates per tips.
$eventBranchSegs List of matrix of numerical. One per posterior sample. Record regime ID per segments of branches.
$meanTipLambda Vector of named numerical. Mean tip speciation rates across all posterior configurations of tips.
$meanTipMu Vector of named numerical. Mean tip extinction rates across all posterior configurations of tips.
$type Character string. Set the type of data modeled with BAMM. Should be "diversification".
Additional elements providing key information for downstream analyses:
$expectedNumberOfShifts Integer. The expected number of regime shifts used to set the prior in BAMM.
$MSP_tree Object of class phylo. List of 4 elements duplicating information from the Phylogeny-related elements above,
except $MSP_tree$edge.length is recording the Marginal Shift Probability of each branch (i.e., the probability of a regime shift to occur along each branch)
$MAP_indices Vector of integers. The indices of the Maximum A Posteriori probability (MAP) configurations among the posterior samples.
$MAP_BAMM_object. List of 18 elements of class '"bammdata" recording the mean rates and regime shift locations found across
the Maximum A Posteriori probability (MAP) configurations. All BAMM elements summarizing diversification data holds a single entry describing
this mean diversification history.
$MSC_indices Vector of integers. The indices of the Maximum Shift Credibility (MSC) configurations among the posterior samples.
$MSC_BAMM_object List of 18 elements of class '"bammdata" recording the mean rates and regime shift locations found across
the Maximum Shift Credibility (MSC) configurations. All BAMM elements summarizing diversification data holds a single entry describing
this mean diversification history.
The function also produces files listed in the Details section and stored in the the BAMM_output_directory.
This function relies on BAMM to provide a reliable solution to map diversification rates and regime shifts on a time-calibrated phylogeny
and obtain the BAMM_object object needed to run the deepSTRAPP workflow (run_deepSTRAPP_for_focal_time, run_deepSTRAPP_over_time).
However, it is one option among others for modeling diversification on phylogenies.
You may wish to explore alternatives models such as LSBDS model in RevBayes (Höhna et al., 2016), the MTBD model (Barido-Sottani et al., 2020),
or the ClaDS2 model (Maliet et al., 2019) for your own data.
However, you will need Bayesian models that infer regime shifts to be able to perform STRAPP tests (Rabosky & Huang, 2016).
Additionally, you need to format the model output such as in BAMM_object, so it can be used in a deepSTRAPP workflow.
This function perform a single BAMM run to infer diversification rates and regime shifts. Due to the stochastic nature of the exploration of the parameter space with MCMC process, best practice recommend to ran multiple runs and check for convergence of the MCMC traces, ensuring that the region of high probability has been reached by your MCMC runs.
Maël Doré
For BAMM: Rabosky, D. L. (2014). Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PloS one, 9(2), e89543. doi:10.1371/journal.pone.0089543. Website: http://bamm-project.org/.
For {BAMMtools}: Rabosky, D. L., Grundler, M., Anderson, C., Title, P., Shi, J. J., Brown, J. W., ... & Larson, J. G. (2014).
BAMM tools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods in Ecology and Evolution, 5(7), 701-707.
doi:10.1111/2041-210X.12199
run_deepSTRAPP_for_focal_time() run_deepSTRAPP_over_time() update_rates_and_regimes_for_focal_time() prepare_trait_data() plot_BAMM_rates()
For a guided tutorial, see this vignette: vignette("model_diversification_dynamics", package = "deepSTRAPP")
# ----- Example 1: Whale phylogeny ----- # library(phytools) data(whale.tree) ## Not run: ## You need to install the BAMM C++ software locally prior to run this function # Visit the official BAMM website (\url{http://bamm-project.org/}) for information. # Run BAMM workflow with deepSTRAPP whale_BAMM_object <- prepare_diversification_data( BAMM_install_directory_path = "./software/bamm-2.5.0/", phylo = whale.tree, prefix_for_files = "whale", numberOfGenerations = 100000, # Set low for the example BAMM_output_directory_path = tempdir(), # Can be adjusted such as "./BAMM_outputs/" keep_BAMM_outputs = FALSE, # Adjust if needed ) ## End(Not run) # Load directly the result data(whale_BAMM_object) # Explore output str(whale_BAMM_object, 1) # Plot mean net diversification rates and regime shifts on the phylogeny plot_BAMM_rates(whale_BAMM_object, cex = 0.5, labels = TRUE, legend = TRUE) # ----- Example 2: Ponerinae phylogeny ----- # # Load phylogeny data("Ponerinae_tree", package = "deepSTRAPP") plot(Ponerinae_tree, show.tip.label = FALSE) ## Not run: ## You need to install the BAMM C++ software locally prior to run this function # Visit the official BAMM website (http://bamm-project.org/) for information. # Run BAMM workflow with deepSTRAPP Ponerinae_BAMM_object <- prepare_diversification_data( BAMM_install_directory_path = "./software/bamm-2.5.0/", phylo = Ponerinae_tree, prefix_for_files = "Ponerinae", numberOfGenerations = 10^7, # Set high for optimal run, but will take ages BAMM_output_directory_path = tempdir(), # Can be adjusted such as "./BAMM_outputs/" keep_BAMM_outputs = FALSE, # Adjust if needed ) ## End(Not run) if (deepSTRAPP::is_dev_version()) { # Load directly the result data(Ponerinae_BAMM_object) ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore output str(Ponerinae_BAMM_object, 1) # Plot mean net diversification rates and regime shifts on the phylogeny plot_BAMM_rates(Ponerinae_BAMM_object, labels = FALSE, legend = TRUE) }# ----- Example 1: Whale phylogeny ----- # library(phytools) data(whale.tree) ## Not run: ## You need to install the BAMM C++ software locally prior to run this function # Visit the official BAMM website (\url{http://bamm-project.org/}) for information. # Run BAMM workflow with deepSTRAPP whale_BAMM_object <- prepare_diversification_data( BAMM_install_directory_path = "./software/bamm-2.5.0/", phylo = whale.tree, prefix_for_files = "whale", numberOfGenerations = 100000, # Set low for the example BAMM_output_directory_path = tempdir(), # Can be adjusted such as "./BAMM_outputs/" keep_BAMM_outputs = FALSE, # Adjust if needed ) ## End(Not run) # Load directly the result data(whale_BAMM_object) # Explore output str(whale_BAMM_object, 1) # Plot mean net diversification rates and regime shifts on the phylogeny plot_BAMM_rates(whale_BAMM_object, cex = 0.5, labels = TRUE, legend = TRUE) # ----- Example 2: Ponerinae phylogeny ----- # # Load phylogeny data("Ponerinae_tree", package = "deepSTRAPP") plot(Ponerinae_tree, show.tip.label = FALSE) ## Not run: ## You need to install the BAMM C++ software locally prior to run this function # Visit the official BAMM website (http://bamm-project.org/) for information. # Run BAMM workflow with deepSTRAPP Ponerinae_BAMM_object <- prepare_diversification_data( BAMM_install_directory_path = "./software/bamm-2.5.0/", phylo = Ponerinae_tree, prefix_for_files = "Ponerinae", numberOfGenerations = 10^7, # Set high for optimal run, but will take ages BAMM_output_directory_path = tempdir(), # Can be adjusted such as "./BAMM_outputs/" keep_BAMM_outputs = FALSE, # Adjust if needed ) ## End(Not run) if (deepSTRAPP::is_dev_version()) { # Load directly the result data(Ponerinae_BAMM_object) ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. # Explore output str(Ponerinae_BAMM_object, 1) # Plot mean net diversification rates and regime shifts on the phylogeny plot_BAMM_rates(Ponerinae_BAMM_object, labels = FALSE, legend = TRUE) }
Map trait evolution on a time-calibrated phylogeny in several steps:
Step 1: Fit evolutionary models to trait data using Maximum Likelihood.
Step 2: Select the best fitting model comparing AICc.
Step 3: Infer ancestral characters estimates (ACE) at nodes.
Step 4: Run stochastic mapping simulations to generate evolutionary histories compatible with the best model and inferred ACE. (Only for categorical and biogeographic data)
Step 5: Infer ancestral states along branches.
For continuous traits: use interpolation to produce a contMap.
For categorical and biogeographic data: compute posterior frequencies of each state/range
to produce a densityMap for each state/range.
prepare_trait_data( tip_data, trait_data_type, phylo, seed = NULL, evolutionary_models = NULL, Q_matrix = NULL, BioGeoBEARS_directory_path = NULL, keep_BioGeoBEARS_files = TRUE, prefix_for_files = NULL, nb_cores = 1, max_range_size = 2, split_multi_area_ranges = FALSE, ..., res = 100, nb_simulations = 1000, color_scale = NULL, colors_per_levels = NULL, plot_map = TRUE, plot_overlay = TRUE, add_ACE_pies = TRUE, PDF_file_path = NULL, return_ace = TRUE, return_BSM = FALSE, return_simmaps = FALSE, return_best_model_fit = FALSE, return_model_selection_df = FALSE, verbose = TRUE )prepare_trait_data( tip_data, trait_data_type, phylo, seed = NULL, evolutionary_models = NULL, Q_matrix = NULL, BioGeoBEARS_directory_path = NULL, keep_BioGeoBEARS_files = TRUE, prefix_for_files = NULL, nb_cores = 1, max_range_size = 2, split_multi_area_ranges = FALSE, ..., res = 100, nb_simulations = 1000, color_scale = NULL, colors_per_levels = NULL, plot_map = TRUE, plot_overlay = TRUE, add_ACE_pies = TRUE, PDF_file_path = NULL, return_ace = TRUE, return_BSM = FALSE, return_simmaps = FALSE, return_best_model_fit = FALSE, return_model_selection_df = FALSE, verbose = TRUE )
tip_data |
Named numerical or character string vector of trait values/states/ranges at tips.
Names should be ordered as the tip labels in the phylogeny found in |
trait_data_type |
Character string. Type of trait data. Either: "continuous", "categorical" or "biogeographic". |
phylo |
Time-calibrated phylogeny. Object of class |
seed |
Integer. Set the seed to ensure reproducibility. Default is |
evolutionary_models |
(Vector of) character string(s). To provide the set of evolutionary models to fit on the data.
|
Q_matrix |
Custom Q-matrix for categorical data representing transition classes between states.
Transitions with similar integers are estimated with a shared rate parameter.
Transitions with |
BioGeoBEARS_directory_path |
Character string. The path to the directory used to store input/output files generated for/by BioGeoBEARS during biogeographic historical inferences. Use '/' to separate directory and sub-directories. It must end with '/'. Only for biogeographic data. |
keep_BioGeoBEARS_files |
Logical. Whether the |
prefix_for_files |
Character string. Prefix to add to all BioGeoBEARS files stored in the |
nb_cores |
Interger. Number of cores to use for parallel computation during BioGeoBEARS runs. Default = |
max_range_size |
Integer. Maximum number of unique areas encompassed by multi-range areas. Default = |
split_multi_area_ranges |
Logical. Whether to split multi-area ranges across unique areas when mapping ranges.
Ex: For range EW, posterior probabilities will be split equally between Eastern Palearctic (E) and Western Palearctic (W).
Default = |
... |
Additional arguments to be passed down to the functions used to fit models (See |
res |
Integer. Define the number of time steps used to interpolate/estimate trait value/state/range in |
nb_simulations |
Integer. Define the number of simulations generated for stochastic mapping. Default = |
color_scale |
Vector of character string. List of colors to use to build the color scale with |
colors_per_levels |
Named character string. To set the colors to use to map each state/range posterior probabilities. Names = states/ranges; values = colors.
If |
plot_map |
Logical. Whether to plot or not the phylogeny with mapped trait evolution. Default = |
plot_overlay |
Logical. If |
add_ACE_pies |
Logical. Whether to add pies of posterior probabilities of states/ranges at internal nodes on the mapped phylogeny. Default = |
PDF_file_path |
Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf". |
return_ace |
Logical. Whether the named vector of ancestral characters estimates (ACE) at internal nodes should be returned in the output. Default = |
return_BSM |
Logical. (Only for Biogeographic data) Whether the summary tables of anagenetic and cladogenetic events generated during the Biogeographic Stochastic Mapping (BSM)
process should be returned in the output. Default = |
return_simmaps |
Logical. Whether the evolutionary histories simulated during stochastic mapping (i.e., |
return_best_model_fit |
Logical. Whether to include the output of the best fitting model in the function output. Default = |
return_model_selection_df |
Logical. Whether to include the data.frame summarizing model comparisons used to select the best fitting model should be returned in the output. Default = |
verbose |
Logical. Should progression be displayed? A message will be printed for every steps in the process. Default is |
Map trait evolution on a time-calibrated phylogeny in several steps:
Step 1: Models are fit using Maximum Likelihood approach:
For "continuous" data models are fit with geiger::fitContinuous(): "BM", "OU", "EB", "rate_trend", "lambda", "kappa", "delta". Default is "BM".
For "categorical" data models are fit with geiger::fitDiscrete(): "ER", "SYM", "ARD", "meristic", "matrix". Default is "ARD".
For "biogeographic" data models are fit with R package BioGeoBEARS: "BAYAREALIKE", "DIVALIKE", "DEC", "BAYAREALIKE+J", "DIVALIKE+J", "DEC+J". Default is "DEC".
Step 2: Best model is identified among the list of evolutionary_models by comparing the corrected AIC (AICc)
and selecting the model with lowest AICc.
Step 3: For continuous traits: Ancestral characters estimates (ACE) are inferred with phytools::fastAnc on a tree
with modified branch lengths scaled to reflect the evolutionary rates estimated from the best model using phytools::rescale().
Step 4: Stochastic Mapping.
For categorical and biogeographic data, stochastic mapping simulations are performed to generate evolutionary histories compatible with the best model and inferred ACE. Node states/ranges are drawn from the scaled marginal likelihoods of ACE, and states/ranges shifts along branches are simulated according to the transition matrix Q estimated from the best fitting model.
Step 5: Infer ancestral states along branches.
For continuous traits: ancestral trait values along branches are interpolated with phytools::contMap().
This provides quick estimates of trait value at any point in time, but it does not provide accurate ML estimates in
case of models that are time or trait-value dependent (such as "EB" or "OU") as the interpolation used to built the contMap is assuming
a constant rate along each branch. However, ancestral trait values at nodes remain accurate
For categorical and biogeographic data: compute posterior frequencies of each state/range among the simulated evolutionary histories (simmaps)
to produce a densityMap for each state/range that reflects the changes along branches in probability of harboring a given state/range.
The function returns a list with at least two elements.
$contMap (For "continuous" data) Object of class "contMap", typically generated with phytools::contMap(),
that contains a phylogenetic tree and associated continuous trait mapping.
$densityMaps (For "categorical" and "biogeographic" data) List of objects of class "densityMap,
typically generated with phytools::densityMap(), that contains a phylogenetic tree and associated mapping of probability
to harbor a given state/range along branches. The list contains one "densityMap per state/range found in the tip_data.
$densityMaps_all_ranges (For "biogeographic" data only, if split_multi_area_ranges = TRUE) Same as $densityMaps,
but for all ranges including the multi-areas ranges (e.g., AB) while $densityMaps will display posterior probabilities for
unique areas only (e.g., A and B), with multi-areas ranges split across the unique areas they encompass.
$trait_data_type Character string. Record the type of trait data. Either: "continuous", "categorical" or "biogeographic".
If return_ace = TRUE,
$ace For continuous traits: Named vector that record the ancestral characters estimates (ACE) at internal nodes.
For categorical and biogeographic data: Matrix that record the posterior probabilities of ancestral states/ranges (characters) estimates (ACE) at internal nodes.
Rows are internal nodes. Columns are states/ranges. Values are posterior probabilities of each state per node.
$ace_all_ranges For biogeographic data, if split_multi_area_ranges = TRUE: Named vector that record the ancestral characters estimates (ACE) at internal nodes,
but including all ranges observed in the simmaps, including the multi-areas ranges (e.g., AB), while $ace will display posterior probabilities
for unique areas only (e.g., A and B), with multi-areas ranges split across the unique areas they encompass.
If return_BSM = TRUE, (Only for biogeographic data)
$BSM_output List of two lists that contains summary information of cladogenetic ($RES_caldo_events_tables) and anagenetic ($RES_ana_events_tables) events
recording across the N simulations of biogeographic histories performed during Biogeographic Stochastic Mapping (BSM).
Each element of the list is a data.frame recording events occurring during one simulation.
If return_simmaps = TRUE, (Only for categorical and biogeographic data)
$simmaps List that contains as many objects of class "simmap" that nb_simulations were requested.
Each simmap object is a phylogeny with one simulated geographic history (i.e., transitions in geographic ranges) mapped along branches.
If return_best_model_fit = TRUE,
$best_model_fit List that provides the output of the best fitting model.
If model_selection_df = TRUE,
$model_selection_df Data.frame that summarizes model comparisons used to select the best fitting model.
For biogeographic data, the function also produce input and output files associated with BioGeoBEARS and stored in the directory
specified in BioGeoBEARS_directory_path. The directory and its content are kept if keep_BioGeoBEARS_files = TRUE
This function provides an easy solution to map trait evolution on a time-calibrated phylogeny
and obtain the contMap/densityMaps objects needed to run the deepSTRAPP workflow (run_deepSTRAPP_for_focal_time, run_deepSTRAPP_over_time).
However, it does not explore the most complex options for trait evolution. You may need to explore more complex models to capture the dynamics of trait evolution.
such as trait-dependent multi-rate models (phytools::brownie.lite(), OUwie::OUwie), Bayesian MCMC implementations allowing a thorough exploration
of location and number of regime shifts (Ex: BayesTraits, RevBayes), or RRphylo for a penalized phylogenetic ridge regression approach that allows regime shifts across all branches.
This function provides an easy solution to infer ancestral geographic ranges using the BioGeoBEARS framework. It allows to directly compare model fits across 6 models: DEC, DEC+J, DIVALIKE, DIVALIKE+J, BAYAREALIKE, BAYAREALIKE+J. It uses a step-wise approach by using MLE estimates of previous runs as staring parameter values when increasing complexity (adding +J parameters). However, it does not explore the most complex options for historical biogeography. You may need to explore more complex models to capture the dynamics of range evolution such as time-stratification with adjacency matrices, dispersal multipliers (+W), distance-based dispersal probabilities (+X), or other features. See for instance, http://phylo.wikidot.com/biogeobears).
The R package BioGeoBEARS is needed for this function to work with biogeographic data.
Please install it manually from: https://github.com/nmatzke/BioGeoBEARS.
Maël Doré
For macroevolutionary models in geiger: Pennell, M. W., Eastman, J. M., Slater, G. J., Brown, J. W., Uyeda, J. C., FitzJohn, R. G., ... & Harmon, L. J. (2014). geiger v2. 0: an expanded suite of methods for fitting macroevolutionary models to phylogenetic trees. Bioinformatics, 30(15), 2216-2218. doi:10.1093/bioinformatics/btu181.
For BioGeoBEARS: Matzke, Nicholas J. (2018). BioGeoBEARS: BioGeography with Bayesian (and likelihood) Evolutionary Analysis with R Scripts. version 1.1.1, published on GitHub on November 6, 2018. doi:10.5281/zenodo.1478250. Website: http://phylo.wikidot.com/biogeobears.
geiger::fitContinuous() geiger::fitDiscrete() phytools::contMap() phytools::densityMap()
For a guided tutorial, see the associated vignettes:
For continuous trait data: vignette("model_continuous_trait_evolution", package = "deepSTRAPP")
For categorical trait data: vignette("model_categorical_trait_evolution", package = "deepSTRAPP")
For biogeographic range data: vignette("model_biogeographic_range_evolution", package = "deepSTRAPP")
# ----- Example 1: Continuous data ----- # ## Load phylogeny and tip data # Load eel phylogeny and tip data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") data("eel.data", package = "phytools") # Extract body size eel_data <- stats::setNames(eel.data$Max_TL_cm, rownames(eel.data)) # (May take several minutes to run) ## Map trait evolution on the phylogeny mapped_cont_traits <- prepare_trait_data( tip_data = eel_data, trait_data_type = "continuous", phylo = eel.tree, evolutionary_models = c("BM", "OU", "lambda", "kappa"), # Example of an additional argument ('control') that can be provided to geiger::fitContinuous() control = list(niter = 200), color_scale = c("darkgreen", "limegreen", "orange", "red"), plot_map = FALSE, return_best_model_fit = TRUE, return_model_selection_df = TRUE, verbose = TRUE) ## Explore output plot_contMap(mapped_cont_traits$contMap) # contMap with interpolated trait values mapped_cont_traits$model_selection_df # Summary of model selection # Parameter estimates and optimization summary of the best model # (Here, the best model is Pagel's lambda) mapped_cont_traits$best_model_fit$opt mapped_cont_traits$ace # Ancestral character estimates at internal nodes # ----- Example 2: Categorical data ----- # ## Load phylogeny and tip data # Load eel phylogeny and tip data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") data("eel.data", package = "phytools") # Transform feeding mode data into a 3-level factor eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[c(1, 5, 6, 7, 10, 11, 15, 16, 17, 24, 25, 28, 30, 51, 52, 53, 55, 58, 60)] <- "kiss" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) # Manually define a Q_matrix for rate classes of state transition to use in the 'matrix' model # Does not allow transitions from state 1 ("bite") to state 2 ("kiss") or state 3 ("suction") # Does not allow transitions from state 3 ("suction") to state 1 ("bite") # Set symmetrical rates between state 2 ("kiss") and state 3 ("suction") Q_matrix = rbind(c(NA, 0, 0), c(1, NA, 2), c(0, 2, NA)) # Set colors per states colors_per_states <- c("limegreen", "orange", "dodgerblue") names(colors_per_states) <- c("bite", "kiss", "suction") # (May take several minutes to run) ## Run evolutionary models eel_cat_3lvl_data <- prepare_trait_data(tip_data = eel_data, phylo = eel.tree, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = c("ER", "SYM", "ARD", "meristic", "matrix"), Q_matrix = Q_matrix, nb_simulations = 1000, plot_map = TRUE, plot_overlay = TRUE, return_best_model_fit = TRUE, return_model_selection_df = TRUE) # Load directly output data(eel_cat_3lvl_data, package = "deepSTRAPP") ## Explore output plot(eel_cat_3lvl_data$densityMaps[[1]]) # densityMap for state n°1 ("bite") eel_cat_3lvl_data$model_selection_df # Summary of model selection # Parameter estimates and optimization summary of the best model # (Here, the best model is ER) print(eel_cat_3lvl_data$best_model_fit)$ # Summary of the best evolutionary model eel_cat_3lvl_data$ace # Posterior probabilities of each state (= ACE) at internal nodes # ----- Example 3: Biogeographic data ----- # if (deepSTRAPP::is_dev_version()) { ## The R package 'BioGeoBEARS' is needed for this function to work with biogeographic data. # Please install it manually from: https://github.com/nmatzke/BioGeoBEARS. ## Load phylogeny and tip data # Load eel phylogeny and tip data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") data("eel.data", package = "phytools") # Transform feeding mode data into biogeographic data with ranges A, B, and AB. eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[eel_data == "bite"] <- "A" eel_data[eel_data == "suction"] <- "B" eel_data[c(5, 6, 7, 15, 25, 32, 33, 34, 50, 52, 57, 58, 59)] <- "AB" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) colors_per_ranges <- c("dodgerblue3", "gold") names(colors_per_ranges) <- c("A", "B") # (May take several minutes to run) ## Run evolutionary models eel_biogeo_data <- prepare_trait_data( tip_data = eel_data, trait_data_type = "biogeographic", phylo = eel.tree, # Default = "DEC" for biogeographic evolutionary_models = c("BAYAREALIKE", "DIVALIKE", "DEC", "BAYAREALIKE+J", "DIVALIKE+J", "DEC+J"), BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "eel", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to display the two outputs # Reduce the number of Stochastic Mapping simulations to save time (Default = '1000') nb_simulations = 100, colors_per_levels = colors_per_ranges, return_simmaps = TRUE, return_best_model_fit = TRUE, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(eel_biogeo_data, package = "deepSTRAPP") ## Explore output str(eel_biogeo_data, 1) eel_biogeo_data$model_selection_df # Summary of model selection # Parameter estimates and optimization summary of the best model # (Here, the best model is DEC+J) eel_biogeo_data$best_model_fit$optim_result # Posterior probabilities of each state (= ACE) at internal nodes eel_biogeo_data$ace # Only with unique areas eel_biogeo_data$ace_all_ranges # Including multi-area ranges (Here, AB) ## Plot densityMaps # densityMap for range n°1 ("A") plot(eel_biogeo_data$densityMaps[[1]]) # densityMaps with all unique areas overlaid plot_densityMaps_overlay(eel_biogeo_data$densityMaps) # densityMaps with all ranges (including multi-area ranges) overlaid plot_densityMaps_overlay(eel_biogeo_data$densityMaps_all_ranges) }# ----- Example 1: Continuous data ----- # ## Load phylogeny and tip data # Load eel phylogeny and tip data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") data("eel.data", package = "phytools") # Extract body size eel_data <- stats::setNames(eel.data$Max_TL_cm, rownames(eel.data)) # (May take several minutes to run) ## Map trait evolution on the phylogeny mapped_cont_traits <- prepare_trait_data( tip_data = eel_data, trait_data_type = "continuous", phylo = eel.tree, evolutionary_models = c("BM", "OU", "lambda", "kappa"), # Example of an additional argument ('control') that can be provided to geiger::fitContinuous() control = list(niter = 200), color_scale = c("darkgreen", "limegreen", "orange", "red"), plot_map = FALSE, return_best_model_fit = TRUE, return_model_selection_df = TRUE, verbose = TRUE) ## Explore output plot_contMap(mapped_cont_traits$contMap) # contMap with interpolated trait values mapped_cont_traits$model_selection_df # Summary of model selection # Parameter estimates and optimization summary of the best model # (Here, the best model is Pagel's lambda) mapped_cont_traits$best_model_fit$opt mapped_cont_traits$ace # Ancestral character estimates at internal nodes # ----- Example 2: Categorical data ----- # ## Load phylogeny and tip data # Load eel phylogeny and tip data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") data("eel.data", package = "phytools") # Transform feeding mode data into a 3-level factor eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[c(1, 5, 6, 7, 10, 11, 15, 16, 17, 24, 25, 28, 30, 51, 52, 53, 55, 58, 60)] <- "kiss" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) # Manually define a Q_matrix for rate classes of state transition to use in the 'matrix' model # Does not allow transitions from state 1 ("bite") to state 2 ("kiss") or state 3 ("suction") # Does not allow transitions from state 3 ("suction") to state 1 ("bite") # Set symmetrical rates between state 2 ("kiss") and state 3 ("suction") Q_matrix = rbind(c(NA, 0, 0), c(1, NA, 2), c(0, 2, NA)) # Set colors per states colors_per_states <- c("limegreen", "orange", "dodgerblue") names(colors_per_states) <- c("bite", "kiss", "suction") # (May take several minutes to run) ## Run evolutionary models eel_cat_3lvl_data <- prepare_trait_data(tip_data = eel_data, phylo = eel.tree, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = c("ER", "SYM", "ARD", "meristic", "matrix"), Q_matrix = Q_matrix, nb_simulations = 1000, plot_map = TRUE, plot_overlay = TRUE, return_best_model_fit = TRUE, return_model_selection_df = TRUE) # Load directly output data(eel_cat_3lvl_data, package = "deepSTRAPP") ## Explore output plot(eel_cat_3lvl_data$densityMaps[[1]]) # densityMap for state n°1 ("bite") eel_cat_3lvl_data$model_selection_df # Summary of model selection # Parameter estimates and optimization summary of the best model # (Here, the best model is ER) print(eel_cat_3lvl_data$best_model_fit)$ # Summary of the best evolutionary model eel_cat_3lvl_data$ace # Posterior probabilities of each state (= ACE) at internal nodes # ----- Example 3: Biogeographic data ----- # if (deepSTRAPP::is_dev_version()) { ## The R package 'BioGeoBEARS' is needed for this function to work with biogeographic data. # Please install it manually from: https://github.com/nmatzke/BioGeoBEARS. ## Load phylogeny and tip data # Load eel phylogeny and tip data from the R package phytools # Source: Collar et al., 2014; DOI: 10.1038/ncomms6505 data("eel.tree", package = "phytools") data("eel.data", package = "phytools") # Transform feeding mode data into biogeographic data with ranges A, B, and AB. eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[eel_data == "bite"] <- "A" eel_data[eel_data == "suction"] <- "B" eel_data[c(5, 6, 7, 15, 25, 32, 33, 34, 50, 52, 57, 58, 59)] <- "AB" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) colors_per_ranges <- c("dodgerblue3", "gold") names(colors_per_ranges) <- c("A", "B") # (May take several minutes to run) ## Run evolutionary models eel_biogeo_data <- prepare_trait_data( tip_data = eel_data, trait_data_type = "biogeographic", phylo = eel.tree, # Default = "DEC" for biogeographic evolutionary_models = c("BAYAREALIKE", "DIVALIKE", "DEC", "BAYAREALIKE+J", "DIVALIKE+J", "DEC+J"), BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "eel", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to display the two outputs # Reduce the number of Stochastic Mapping simulations to save time (Default = '1000') nb_simulations = 100, colors_per_levels = colors_per_ranges, return_simmaps = TRUE, return_best_model_fit = TRUE, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(eel_biogeo_data, package = "deepSTRAPP") ## Explore output str(eel_biogeo_data, 1) eel_biogeo_data$model_selection_df # Summary of model selection # Parameter estimates and optimization summary of the best model # (Here, the best model is DEC+J) eel_biogeo_data$best_model_fit$optim_result # Posterior probabilities of each state (= ACE) at internal nodes eel_biogeo_data$ace # Only with unique areas eel_biogeo_data$ace_all_ranges # Including multi-area ranges (Here, AB) ## Plot densityMaps # densityMap for range n°1 ("A") plot(eel_biogeo_data$densityMaps[[1]]) # densityMaps with all unique areas overlaid plot_densityMaps_overlay(eel_biogeo_data$densityMaps) # densityMaps with all ranges (including multi-area ranges) overlaid plot_densityMaps_overlay(eel_biogeo_data$densityMaps_all_ranges) }
Wrapper function to run deepSTRAPP workflow for a given point in the past (i.e. the focal_time).
It starts from traits mapped on a phylogeny (trait data) and BAMM output (diversification data)
and carries out the appropriate statistical method to test for a relationship between diversification rates and trait data.
Tests are based on block-permutations: rates data are randomized across tips following blocks
defined by the diversification regimes identified on each tip (typically from a BAMM).
Such tests are called STructured RAte Permutations on Phylogenies (STRAPP) as described in Rabosky, D. L., & Huang, H. (2016). A robust semi-parametric test for detecting trait-dependent diversification. Systematic biology, 65(2), 181-193. doi:10.1093/sysbio/syv066.
See the original BAMMtools::traitDependentBAMM() function used to
carry out STRAPP test on extant time-calibrated phylogenies.
Tests can be carried out on speciation, extinction and net diversification rates.
run_deepSTRAPP_for_focal_time( contMap = NULL, densityMaps = NULL, ace = NULL, tip_data = NULL, trait_data_type, BAMM_object, focal_time, keep_tip_labels = TRUE, rate_type = "net_diversification", seed = NULL, nb_permutations = NULL, replace_samples = FALSE, alpha = 0.05, two_tailed = TRUE, one_tailed_hypothesis = NULL, posthoc_pairwise_tests = FALSE, p.adjust_method = "none", return_perm_data = FALSE, nthreads = 1, print_hypothesis = TRUE, extract_diversification_data_melted_df = FALSE, return_updated_trait_data_with_Map = FALSE, return_updated_BAMM_object = FALSE, verbose = TRUE )run_deepSTRAPP_for_focal_time( contMap = NULL, densityMaps = NULL, ace = NULL, tip_data = NULL, trait_data_type, BAMM_object, focal_time, keep_tip_labels = TRUE, rate_type = "net_diversification", seed = NULL, nb_permutations = NULL, replace_samples = FALSE, alpha = 0.05, two_tailed = TRUE, one_tailed_hypothesis = NULL, posthoc_pairwise_tests = FALSE, p.adjust_method = "none", return_perm_data = FALSE, nthreads = 1, print_hypothesis = TRUE, extract_diversification_data_melted_df = FALSE, return_updated_trait_data_with_Map = FALSE, return_updated_BAMM_object = FALSE, verbose = TRUE )
contMap |
For continuous trait data. Object of class |
densityMaps |
For categorical trait or biogeographic data. List of objects of class |
ace |
(Optional) Ancestral Character Estimates (ACE) at the internal nodes.
Obtained with
|
tip_data |
(Optional) Named vector of tip values of the trait.
|
trait_data_type |
Character string. Specify the type of trait data. Must be one of "continuous", "categorical", "biogeographic". |
BAMM_object |
Object of class |
focal_time |
Numerical. The time, in terms of time distance from the present, at which data must be extracted and the phylogeny and mappings must be cut. It must be smaller than the root age of the phylogeny. |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip
must retained their initial |
rate_type |
A character string specifying the type of diversification rates to use. Must be one of 'speciation', 'extinction' or 'net_diversification' (default). |
seed |
Integer. Set the seed to ensure reproducibility. Default is |
nb_permutations |
Integer. To select the number of random permutations to perform during the tests. If NULL (default), all posterior samples will be used once. |
replace_samples |
Logical. To specify whether to allow 'replacement' (i.e., multiple use) of a posterior sample
when drawing samples used to carry out the STRAPP test. Default is |
alpha |
Numerical. Significance level to use to compute the |
two_tailed |
Logical. To define the type of tests. If
|
one_tailed_hypothesis |
A character string specifying the alternative hypothesis in the one-tailed test. For continuous data, it is either "negative" or "positive" correlation. For binary data, it lists the trait states with states ordered in increasing rates under the alternative hypothesis, separated by a greater-than such as c('A > B'). |
posthoc_pairwise_tests |
Logical. Only for multinominal data (with more than two states). If |
p.adjust_method |
A character string. Only for multinominal data (with more than two states). It specifies the type of correction to apply to the p-values
in the post hoc pairwise tests to account for multiple comparisons. See |
return_perm_data |
Logical. Whether to return the stats data computed from the posterior samples for observed and permuted data in the output.
This is needed to plot the histogram of the null distribution used to assess significance of the test with |
nthreads |
Integer. Number of threads to use for paralleled computing of the STRAPP tests across the permutations.
The R package |
print_hypothesis |
Logical. Whether to print information on what test is carried out, detailing the null and alternative hypotheses,
and what significant level is used to rejected or not the null hypothesis. Default is |
extract_diversification_data_melted_df |
Logical. Specify whether diversification data (regimes ID and tip rates) must be extracted from the |
return_updated_trait_data_with_Map |
Logical. Specify whether the |
return_updated_BAMM_object |
Logical. Specify whether the |
verbose |
Logical. Should progression be displayed? A message will be printed at each stepof the deepSTRAPP workflow,
and for every batch of 100 BAMM posterior samples whose rates are regimes are updated, and optionally extracted in a melted data.frame
(if |
The function encapsulates several functions carrying out each step of the deepSTRAPP workflow:
extract_most_likely_trait_values_for_focal_time() extracts the most likely trait values
found along branches at the focal_time.
Optionally, the function can update the mapped phylogeny (contMap/densityMaps) such as
branches overlapping the focal_time are shorten to the focal_time, and
the trait mapping for the cut off branches are removed
by updating the $tree$maps and $tree$mapped.edge elements.
update_rates_and_regimes_for_focal_time() updates the BAMM_object to obtain
the diversification rates/regimes found along branches the focal_time.
Optionally, the function can update the BAMM_object to display a mapped phylogeny
such as branches overlapping the focal_time are shorten to the focal_time
If requested (extract_diversification_data_melted_df = TRUE), extract_diversification_data_melted_df_for_focal_time()
will be used to extract regimes ID and tip rates from the updated_BAMM_object and provide a melted data.frame summarizing the diversification data
as found on the phylogeny for the focal_time.
compute_STRAPP_test_for_focal_time() carries out the appropriate statistical method to test for
a relationship between diversification rates and trait data for a given point in the past (i.e. the focal_time).
It can handle three types of statistical tests depending on the type of trait data provided:
Continuous trait data: Test for correlations with the Spearman's rank correlation test (See stats::cor.test).
Binary trait data (two states): Test for differences in rates between states with the Mann-Whitney-Wilcoxon rank-sum test (See stats::wilcox.test).
Multinominal trait data (More than two states): Test for differences in rates across all states with the Kruskal-Wallis H test (See stats::kruskal.test).
If posthoc_pairwise_tests = TRUE, Dunn's post hoc pairwise rank-sum tests between pairs of states will be carried out too (See dunn.test::dunn.test).
The function returns a list with at least two elements.
$STRAPP_results List with at least eight elements summarizing the results of the STRAPP tests.
See compute_STRAPP_test_for_focal_time() for a detailed description of the output.
$focal_time Integer. The time, in terms of time distance from the present, at which the data were extracted and the STRAPP test carried out.
Optional formatted output:
$diversification_data_df A data.frame with six columns summarizing the diversification data as found on the phylogeny for the focal_time.
See extract_diversification_data_melted_df_for_focal_time() for a detailed description of the output.
Optional data updated for the focal_time:
$updated_trait_data_with_Map A list with four elements that contains trait data found at the focal_time and an updated contMap or densityMaps
that can be used as input of plot_contMap() or plot_densityMaps_overlay() to display a phylogeny mapped with trait values/states/ranges with branches cut at the focal_time.
See extract_most_likely_trait_values_for_focal_time() for a detailed description of the output.
$updated_BAMM_object An updated BAMM_object of class "bammdata" that contains rates and regimes ID found at the focal_time.
Can be used as input of plot_BAMM_rates() to display a phylogeny mapped with diversification rates with branches cut at the focal_time.
See update_rates_and_regimes_for_focal_time() for a detailed description of the output.
Maël Doré
extract_most_likely_trait_values_for_focal_time() update_rates_and_regimes_for_focal_time()
extract_diversification_data_melted_df_for_focal_time() compute_STRAPP_test_for_focal_time()
For a guided tutorial on complete deepSTRAPP workflow, see the associated vignettes:
For continuous trait data: vignette("deepSTRAPP_continuous_data", package = "deepSTRAPP")
For categorical trait data: vignette("deepSTRAPP_categorical_3lvl_data", package = "deepSTRAPP")
For biogeographic range data: vignette("deepSTRAPP_biogeographic_data", package = "deepSTRAPP")
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # (May take several minutes to run) # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. deepSTRAPP_output <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(deepSTRAPP_output, max.level = 1) # Access deepSTRAPP results str(deepSTRAPP_output$STRAPP_results) # Access trait data head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data) # Access the diversification data in a melted data.frame head(deepSTRAPP_output$diversification_data_df) # Plot rates vs. trait values across branches plot_rates_vs_trait_data_for_focal_time(deepSTRAPP_output) # Plot updated contMap plot_contMap(deepSTRAPP_output$updated_trait_data_with_Map$contMap) ape::nodelabels(text = deepSTRAPP_output$updated_trait_data_with_Map$contMap$tree$initial_nodes_ID) # Plot diversification rates on updated phylogeny plot_BAMM_rates(deepSTRAPP_output$updated_BAMM_object, labels = TRUE) # Plot histogram of test stats plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output) # ----- Example 2: Categorical trait ----- # ## Load data # Load phylogeny data(Ponerinae_tree, package = "deepSTRAPP") # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", # Use default ARD model nb_simulations = 100, # Reduce number of simulations to save time seed = 1234, # Seet seed for reproducibility return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. deepSTRAPP_output <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_cat_data_old_calib$densityMaps, ace = Ponerinae_cat_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(deepSTRAPP_output, max.level = 1) # Access deepSTRAPP results str(deepSTRAPP_output$STRAPP_results, max.level = 2) # Result for overall Kruskal-Wallis test deepSTRAPP_output$STRAPP_results[1:3] # Results for posthoc pairwise Dunn's tests deepSTRAPP_output$STRAPP_results$posthoc_pairwise_tests$summary_df # Access trait data head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data) # Access the diversification data in a melted data.frame head(deepSTRAPP_output$diversification_data_df) # Plot rates vs. states across branches plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output, colors_per_levels = colors_per_states) # Plot updated densityMaps cut at focal time plot_densityMaps_overlay(deepSTRAPP_output$updated_trait_data_with_Map$densityMaps) # Plot diversification rates on updated phylogeny plot_BAMM_rates(BAMM_object = deepSTRAPP_output$updated_BAMM_object, legend = TRUE, labels = FALSE, colorbreaks = deepSTRAPP_output$updated_BAMM_object$initial_colorbreaks$net_diversification) # Plot histogram of Kruskal-Wallis overall test stats plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output) # Plot histograms of posthoc pairwise Dunn's test stats plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output, plot_posthoc_tests = TRUE) # ----- Example 3: Biogeographic ranges ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_ranges <- c("mediumpurple2", "peachpuff2") names(colors_per_ranges) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to display the two outputs nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_ranges, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Explore output str(Ponerinae_biogeo_data_old_calib, 1) ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. deepSTRAPP_output <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(deepSTRAPP_output, max.level = 1) # Access deepSTRAPP results str(deepSTRAPP_output$STRAPP_results, max.level = 2) # Result for Mann-Whitney-Wilcoxon test deepSTRAPP_output$STRAPP_results[1:3] # Access trait data head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data) # Access the diversification data in a melted data.frame head(deepSTRAPP_output$diversification_data_df) # Plot rates vs. ranges across branches plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output, colors_per_levels = colors_per_ranges) # Plot updated densityMaps cut at focal time plot_densityMaps_overlay(deepSTRAPP_output$updated_trait_data_with_Map$densityMaps) # Plot diversification rates on updated phylogeny plot_BAMM_rates(BAMM_object = deepSTRAPP_output$updated_BAMM_object, legend = TRUE, labels = FALSE, colorbreaks = deepSTRAPP_output$updated_BAMM_object$initial_colorbreaks$net_diversification) # Plot histogram of Mann-Whitney-Wilcoxon test plot_histogram_STRAPP_test_for_focal_time( STRAPP_results = deepSTRAPP_output$STRAPP_results) }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # (May take several minutes to run) # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set focal time to 10 Mya focal_time <- 10 ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. deepSTRAPP_output <- run_deepSTRAPP_for_focal_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(deepSTRAPP_output, max.level = 1) # Access deepSTRAPP results str(deepSTRAPP_output$STRAPP_results) # Access trait data head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data) # Access the diversification data in a melted data.frame head(deepSTRAPP_output$diversification_data_df) # Plot rates vs. trait values across branches plot_rates_vs_trait_data_for_focal_time(deepSTRAPP_output) # Plot updated contMap plot_contMap(deepSTRAPP_output$updated_trait_data_with_Map$contMap) ape::nodelabels(text = deepSTRAPP_output$updated_trait_data_with_Map$contMap$tree$initial_nodes_ID) # Plot diversification rates on updated phylogeny plot_BAMM_rates(deepSTRAPP_output$updated_BAMM_object, labels = TRUE) # Plot histogram of test stats plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output) # ----- Example 2: Categorical trait ----- # ## Load data # Load phylogeny data(Ponerinae_tree, package = "deepSTRAPP") # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an equal-rates (ER) Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", # Use default ARD model nb_simulations = 100, # Reduce number of simulations to save time seed = 1234, # Seet seed for reproducibility return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. deepSTRAPP_output <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_cat_data_old_calib$densityMaps, ace = Ponerinae_cat_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(deepSTRAPP_output, max.level = 1) # Access deepSTRAPP results str(deepSTRAPP_output$STRAPP_results, max.level = 2) # Result for overall Kruskal-Wallis test deepSTRAPP_output$STRAPP_results[1:3] # Results for posthoc pairwise Dunn's tests deepSTRAPP_output$STRAPP_results$posthoc_pairwise_tests$summary_df # Access trait data head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data) # Access the diversification data in a melted data.frame head(deepSTRAPP_output$diversification_data_df) # Plot rates vs. states across branches plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output, colors_per_levels = colors_per_states) # Plot updated densityMaps cut at focal time plot_densityMaps_overlay(deepSTRAPP_output$updated_trait_data_with_Map$densityMaps) # Plot diversification rates on updated phylogeny plot_BAMM_rates(BAMM_object = deepSTRAPP_output$updated_BAMM_object, legend = TRUE, labels = FALSE, colorbreaks = deepSTRAPP_output$updated_BAMM_object$initial_colorbreaks$net_diversification) # Plot histogram of Kruskal-Wallis overall test stats plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output) # Plot histograms of posthoc pairwise Dunn's test stats plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output, plot_posthoc_tests = TRUE) # ----- Example 3: Biogeographic ranges ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_ranges <- c("mediumpurple2", "peachpuff2") names(colors_per_ranges) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to display the two outputs nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_ranges, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Explore output str(Ponerinae_biogeo_data_old_calib, 1) ## Set focal time to 10 Mya focal_time <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for focal time = 10 Mya. deepSTRAPP_output <- run_deepSTRAPP_for_focal_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, focal_time = focal_time, rate_type = "net_diversification", return_perm_data = TRUE, extract_diversification_data_melted_df = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE) ## Explore output str(deepSTRAPP_output, max.level = 1) # Access deepSTRAPP results str(deepSTRAPP_output$STRAPP_results, max.level = 2) # Result for Mann-Whitney-Wilcoxon test deepSTRAPP_output$STRAPP_results[1:3] # Access trait data head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data) # Access the diversification data in a melted data.frame head(deepSTRAPP_output$diversification_data_df) # Plot rates vs. ranges across branches plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_output, colors_per_levels = colors_per_ranges) # Plot updated densityMaps cut at focal time plot_densityMaps_overlay(deepSTRAPP_output$updated_trait_data_with_Map$densityMaps) # Plot diversification rates on updated phylogeny plot_BAMM_rates(BAMM_object = deepSTRAPP_output$updated_BAMM_object, legend = TRUE, labels = FALSE, colorbreaks = deepSTRAPP_output$updated_BAMM_object$initial_colorbreaks$net_diversification) # Plot histogram of Mann-Whitney-Wilcoxon test plot_histogram_STRAPP_test_for_focal_time( STRAPP_results = deepSTRAPP_output$STRAPP_results) }
Wrapper function to run deepSTRAPP workflows over multiple time steps in the past.
It starts from traits mapped on a phylogeny (trait data) and BAMM output (diversification data)
and carries out the appropriate statistical method to test for a relationship between diversification rates and trait data.
The workflow is repeated over multiple points in time (i.e. the time_steps) and results are summarized in a data.frame.
The function can also provide summaries of trait values and diversification rates
extracted along branches over the different time_steps.
Statistical tests are based on block-permutations: rates data are randomized across tips following blocks defined by the diversification regimes identified on each tip (typically from a BAMM). Such tests are called STructured RAte Permutations on Phylogenies (STRAPP) as described in Rabosky, D. L., & Huang, H. (2016). A robust semi-parametric test for detecting trait-dependent diversification. Systematic biology, 65(2), 181-193. doi:10.1093/sysbio/syv066.
See the original BAMMtools::traitDependentBAMM() function used to
carry out STRAPP test on extant time-calibrated phylogenies.
Tests can be carried out on speciation, extinction and net diversification rates.
run_deepSTRAPP_over_time( contMap = NULL, densityMaps = NULL, ace = NULL, tip_data = NULL, trait_data_type, BAMM_object, time_steps = NULL, time_range = NULL, nb_time_steps = NULL, time_step_duration = NULL, keep_tip_labels = TRUE, rate_type = "net_diversification", seed = NULL, nb_permutations = NULL, replace_samples = FALSE, alpha = 0.05, two_tailed = TRUE, one_tailed_hypothesis = NULL, posthoc_pairwise_tests = FALSE, p.adjust_method = "none", return_perm_data = FALSE, nthreads = 1, print_hypothesis = TRUE, extract_trait_data_melted_df = FALSE, extract_diversification_data_melted_df = FALSE, return_STRAPP_results = FALSE, return_updated_trait_data_with_Map = FALSE, return_updated_BAMM_object = FALSE, verbose = TRUE, verbose_extended = FALSE )run_deepSTRAPP_over_time( contMap = NULL, densityMaps = NULL, ace = NULL, tip_data = NULL, trait_data_type, BAMM_object, time_steps = NULL, time_range = NULL, nb_time_steps = NULL, time_step_duration = NULL, keep_tip_labels = TRUE, rate_type = "net_diversification", seed = NULL, nb_permutations = NULL, replace_samples = FALSE, alpha = 0.05, two_tailed = TRUE, one_tailed_hypothesis = NULL, posthoc_pairwise_tests = FALSE, p.adjust_method = "none", return_perm_data = FALSE, nthreads = 1, print_hypothesis = TRUE, extract_trait_data_melted_df = FALSE, extract_diversification_data_melted_df = FALSE, return_STRAPP_results = FALSE, return_updated_trait_data_with_Map = FALSE, return_updated_BAMM_object = FALSE, verbose = TRUE, verbose_extended = FALSE )
contMap |
For continuous trait data. Object of class |
densityMaps |
For categorical trait or biogeographic data. List of objects of class |
ace |
(Optional) Ancestral Character Estimates (ACE) at the internal nodes.
Obtained with
|
tip_data |
(Optional) Named vector of tip values of the trait.
|
trait_data_type |
Character string. Specify the type of trait data. Must be one of "continuous", "categorical", "biogeographic". |
BAMM_object |
Object of class |
time_steps |
Numerical vector. Time steps at which the STRAPP tests should be carried out. If |
time_range |
Vector of two numerical values. Time boundaries within with the |
nb_time_steps, time_step_duration
|
Numerical. Number of time steps and duration of each time step used to generate |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip
must retained their initial |
rate_type |
A character string specifying the type of diversification rates to use. Must be one of 'speciation', 'extinction' or 'net_diversification' (default). |
seed |
Integer. Set the seed to ensure reproducibility. Default is |
nb_permutations |
Integer. To select the number of random permutations to perform during the tests. If NULL (default), all posterior samples will be used once. |
replace_samples |
Logical. To specify whether to allow 'replacement' (i.e., multiple use) of a posterior sample
when drawing samples used to carry out the STRAPP test. Default is |
alpha |
Numerical. Significance level to use to compute the |
two_tailed |
Logical. To define the type of tests. If
|
one_tailed_hypothesis |
A character string specifying the alternative hypothesis in the one-tailed test. For continuous data, it is either "negative" or "positive" correlation. For binary data, it lists the trait states with states ordered in increasing rates under the alternative hypothesis, separated by a greater-than such as c('A > B'). |
posthoc_pairwise_tests |
Logical. Only for multinominal data (with more than two states). If |
p.adjust_method |
A character string. Only for multinominal data (with more than two states). It specifies the type of correction to apply to the p-values
in the post hoc pairwise tests to account for multiple comparisons. See |
return_perm_data |
Logical. Whether to return the stats data computed from the posterior samples for observed and permuted data in the output.
This is needed to plot the histograms of the null distribution used to assess significance of the tests with |
nthreads |
Integer. Number of threads to use for paralleled computing of the STRAPP tests across the permutations.
The R package |
print_hypothesis |
Logical. Whether to print information on what test is carried out, detailing the null and alternative hypotheses,
and what significant level is used to rejected or not the null hypothesis. Default is |
extract_trait_data_melted_df |
Logical. Specify whether trait data must be extracted from the |
extract_diversification_data_melted_df |
Logical. Specify whether diversification data (regimes ID and tip rates) must be extracted from the |
return_STRAPP_results |
Logical. Specify whether the |
return_updated_trait_data_with_Map |
Logical. Specify whether the |
return_updated_BAMM_object |
Logical. Specify whether the |
verbose |
Logical. Should progression per |
verbose_extended |
Should progression per |
The function is a wrapper of run_deepSTRAPP_for_focal_time() that runs the
deepSTRAPP workflow over multiple time_steps.
The deepSTRAPP workflow is described step by step in the run_deepSTRAPP_for_focal_time() documentation.
Its main output is the $pvalues_summary_df: a data.frame providing test stat estimates and p-values obtained across all time_steps,
that can be passed down to plot_STRAPP_pvalues_over_time() to generate a plot showing the evolution of the test results across time.
If using multinominal data (with more than two states) and posthoc_pairwise_tests = TRUE, the output will also contain
a data.frame providing test stat estimates and p-values for post hoc pairwise tests in $pvalues_summary_df_for_posthoc_pairwise_tests.
The function offers options to generate summary data.frames of the data extracted across time_steps:
If extract_trait_data_melted_df = TRUE, a data.frame of trait values found along branches at each time step
is provided in $trait_data_df_over_time.
If extract_diversification_data_melted_df = TRUE, a data.frame of diversification data (regimes ID and tip rates)
found along branches at each time step is provided in $diversification_data_df_over_time.
Those data.frames can be passed down to plot_rates_through_time() to generate a plot showing
the evolution diversification rates across trait values over time.
The function also allows to keep records of the intermediate objects generated during the STRAPP workflow:
If return_STRAPP_results = TRUE, a list of STRAPP test outputs is provided in $STRAPP_results_over_time.
Combined with return_perm_data = TRUE, it allows to plot the histograms of the null distributions
used to assess significance of the tests with plot_histogram_STRAPP_test_for_focal_time().
(for a single focal_time) and plot_histograms_STRAPP_tests_over_time() (for multiple time_steps).
If return_updated_trait_data_with_Map = TRUE, a list of objects containing trait data and updated contMap or densityMaps
is provided in $updated_trait_data_with_Map_over_time. Updated contMap/densityMaps can be respectively plotted with plot_contMap()
or plot_densityMaps_overlay(), to display a phylogeny mapped with trait values with branches cut at each focal_time.
If return_updated_BAMM_object = TRUE, a list of updated BAMM_object of class "bammdata" that contains rates and regimes ID
found at each focal_time. Updated BAMM_object can be plotted with plot_BAMM_rates() to display
a phylogeny mapped with diversification rates with branches cut at each focal_time.
The function returns a list with at least five elements.
$pvalues_summary_df Data.frame with three columns providing test stat $estimate and $p_value obtained for each time step (i.e., $focal_time),
that can be passed down to plot_STRAPP_pvalues_over_time() to generate a plot showing the evolution of the test results across time.
$time_steps Numerical vector. Time steps at which the STRAPP tests were carried out in the same order as the objects returned in the output lists.
$trait_data_type Character string. Specify the type of trait data. Possible values are: "continuous", "categorical", "biogeographic".
$trait_data_type_for_stats Character string. The type of trait data used to select statistical method. One of 'continuous', 'binary', or 'multinominal'.
$rate_type Character string. The type of diversification rates used in the tests: 'speciation', 'extinction' or 'net_diversification'.
Optional summary df for multinominal data, if posthoc_pairwise_tests = TRUE:
$pvalues_summary_df_for_posthoc_pairwise_tests Data.frame with four or five columns providing test stat $estimate, $p_value, and $p_value_adjusted
(if p.adjust_method used is not "none") for each $pair of states involved in post hoc Dunn's tests obtained for each time step (i.e., $focal_time).
This data.frame can be passed down to plot_STRAPP_pvalues_over_time() to generate a plot showing the evolution of the post hoc test results across time.
Optional melted data.frames:
$trait_data_df_over_time Data.frame with three columns providing $trait_value associated with each $tip_ID found along each time step (i.e., $focal_time).
Set extract_trait_data_melted_df = TRUE to include it in the output.
$diversification_data_df_over_time Data.frame with six columns providing diversification regimes ($regime_ID) and $rates sorted by $rate_type along tips ($tip_ID)
found across all posterior samples ($BAMM_sample_ID) over each time step (i.e., $focal_time).
Set extract_diversification_data_melted_df = TRUE to include it in the output.
Those data.frames can be passed down to plot_rates_through_time() to generate a plot showing
the evolution diversification rates across trait values over time.
Optional objects generated for each time step (i.e., focal_time) and ordered as in $time_steps:
$STRAPP_results_over_time List of objects summarizing the results of the STRAPP tests
See compute_STRAPP_test_for_focal_time() for a detailed description of the elements in each object.
Set return_STRAPP_results = TRUE to include it in the output.
Combined with return_perm_data = TRUE, it allows to plot the histograms of the null distributions
used to assess significance of the tests with plot_histogram_STRAPP_test_for_focal_time().
(for a single focal_time) and plot_histograms_STRAPP_tests_over_time() (for multiple time_steps).
$updated_trait_data_with_Map_over_time List of objects containing trait data and updated contMap/densityMaps.
Updated contMap/densityMaps can be respectively plotted with plot_contMap() or plot_densityMaps_overlay(),
to display a phylogeny mapped with trait values with branches cut at each focal_time.
$updated_BAMM_objects_over_time List of objects containing rates and regimes ID mapped on phylogeny.
Updated BAMM_object can be plotted with plot_BAMM_rates() to display a phylogeny mapped with
diversification rates with branches cut at each focal_time.
Maël Doré
To run the deepSTRAPP workflow for a single focal_time: run_deepSTRAPP_for_focal_time()
extract_most_likely_trait_values_for_focal_time() update_rates_and_regimes_for_focal_time()
extract_diversification_data_melted_df_for_focal_time() compute_STRAPP_test_for_focal_time()
For a guided tutorial on complete deepSTRAPP workflow, see the associated vignettes:
For continuous trait data: vignette("deepSTRAPP_continuous_data", package = "deepSTRAPP")
For categorical trait data: vignette("deepSTRAPP_categorical_3lvl_data", package = "deepSTRAPP")
For biogeographic range data: vignette("deepSTRAPP_biogeographic_data", package = "deepSTRAPP")
if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_0_40, max.level = 1) # Display test summary # Can be passed down to [deepSTRAPP::plot_STRAPP_pvalues_over_time()] to generate a plot # showing the evolution of the test results across time. Ponerinae_deepSTRAPP_cont_old_calib_0_40$pvalues_summary_df # Access trait data in a melted data.frame head(Ponerinae_deepSTRAPP_cont_old_calib_0_40$trait_data_df_over_time) # Access the diversification data in a melted data.frame head(Ponerinae_deepSTRAPP_cont_old_calib_0_40$diversification_data_df_over_time) # Access deepSTRAPP results str(Ponerinae_deepSTRAPP_cont_old_calib_0_40$STRAPP_results, max.level = 2) ## Visualize updated phylogenies # (May take time to plot) # Plot updated contMap for time step n°2 deepSTRAPP_outputs <- Ponerinae_deepSTRAPP_cont_old_calib_0_40 contMap_step2 <- deepSTRAPP_outputs$updated_trait_data_with_Map_over_time[[2]] plot_contMap(contMap_step2$contMap, ftype = "off") # Plot diversification rates on updated phylogeny for time step n°2 BAMM_object_step2 <- deepSTRAPP_outputs$updated_BAMM_objects_over_time[[2]] plot_BAMM_rates(BAMM_object = BAMM_object_step2, legend = TRUE, labels = FALSE, colorbreaks = BAMM_object_step2$initial_colorbreaks$net_diversification) ## Visualize test results # (May take time to plot) # Plot p-values of Spearman tests across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, alpha = 0.10) # Plot evolution of mean rates through time plot_rates_through_time(deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40) # Plot rates vs. trait values across branches for time step = 10 My plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = 10) # Plot histogram of Spearman test stats for time step = 10 My plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = 10) # Plot histograms of Spearman test results (One plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, display_plots = TRUE) # ----- Example 2: Categorical trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", nb_simulations = 100, return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly trait data output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates across time-steps. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, rate_type = "net_diversification", seed = 1234, alpha = 0.10, # Select a generous level of significance for the sake of the example posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly deepSTRAPP output data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") deepSTRAPP_outputs <- Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(deepSTRAPP_outputs, max.level = 1) # Display test summaries # Can be passed down to [deepSTRAPP::plot_STRAPP_pvalues_over_time()] to generate a plot # showing the evolution of the test results across time. deepSTRAPP_outputs$pvalues_summary_df # Results for posthoc pairwise Dunn's tests over time-steps deepSTRAPP_outputs$pvalues_summary_df_for_posthoc_pairwise_tests # Access trait data in a melted data.frame head(deepSTRAPP_outputs$trait_data_df_over_time) # Access the diversification data in a melted data.frame head(deepSTRAPP_outputs$diversification_data_df_over_time) # Access details of deepSTRAPP results str(deepSTRAPP_outputs$STRAPP_results, max.level = 2) ## Visualize updated phylogenies # (May take time to plot) # Plot updated densityMaps for time step n°2 = 10My densityMaps_10My <- deepSTRAPP_outputs$updated_trait_data_with_Map_over_time[[2]] plot_densityMaps_overlay(densityMaps_10My$densityMaps) # Plot diversification rates on updated phylogeny for time step n°2 BAMM_object_10My <- deepSTRAPP_outputs$updated_BAMM_objects_over_time[[2]] plot_BAMM_rates(BAMM_object = BAMM_object_10My, legend = TRUE, labels = FALSE, colorbreaks = BAMM_object_10My$initial_colorbreaks$net_diversification) ## Visualize test results # (May take time to plot) # Plot p-values of overall Kruskal-Wallis test across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, alpha = 0.10) # Plot p-values of post hoc pairwise Dunn's tests between pairs of tests across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, alpha = 0.10, plot_posthoc_tests = TRUE) # Plot evolution of mean rates through time plot_rates_through_time(deepSTRAPP_outputs = deepSTRAPP_outputs, colors_per_levels = colors_per_states) # Plot rates vs. trait values across branches for time step n°2 = 10 My plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10, colors_per_levels = colors_per_states) # Plot histogram of overall Kruskal-Wallis test for time step n°2 = 10 My plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10) # Plot histograms of overall Kruskal-Wallis test results across all time-steps # (One plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, plot_posthoc_tests = FALSE) # Plot histograms of post hoc pairwise Dunn's test results across all time-steps # (One multifaceted plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, plot_posthoc_tests = TRUE) # ----- Example 3: Biogeographic ranges ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_ranges <- c("mediumpurple2", "peachpuff2") names(colors_per_ranges) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to use only unique areas "A" and "B" nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_ranges, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows from 0 to 40 Mya. time_range <- c(0, 40) time_step_duration <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for time-steps = 0 to 40 Mya. Ponerinae_deepSTRAPP_biogeo_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_ON_tip_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, time_range = time_range, time_step_duration = time_step_duration, seed = 1234, # Set seed for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example rate_type = "net_diversification", return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly output data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") deepSTRAPP_outputs <- Ponerinae_deepSTRAPP_biogeo_old_calib_0_40 ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(deepSTRAPP_outputs, max.level = 1) # Display test summaries # Can be passed down to [deepSTRAPP::plot_STRAPP_pvalues_over_time()] to generate a plot # showing the evolution of the test results across time. deepSTRAPP_outputs$pvalues_summary_df # Access bioregeographic range data in a melted data.frame head(deepSTRAPP_outputs$trait_data_df_over_time) # Access the diversification data in a melted data.frame head(deepSTRAPP_outputs$diversification_data_df_over_time) # Access details of deepSTRAPP results str(deepSTRAPP_outputs$STRAPP_results, max.level = 2) ## Visualize updated phylogenies # (May take time to plot) # Plot updated densityMaps for time step n°2 = 10My densityMaps_10My <- deepSTRAPP_outputs$updated_trait_data_with_Map_over_time[[2]] plot_densityMaps_overlay(densityMaps_10My$densityMaps) # Plot diversification rates on updated phylogeny for time step n°2 BAMM_object_10My <- deepSTRAPP_outputs$updated_BAMM_objects_over_time[[2]] plot_BAMM_rates(BAMM_object = BAMM_object_10My, legend = TRUE, labels = FALSE, colorbreaks = BAMM_object_10My$initial_colorbreaks$net_diversification) ## Visualize test results # (May take time to plot) # Plot p-values of Mann-Whitney-Wilcoxon tests across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, alpha = 0.05) # Plot evolution of mean rates through time plot_rates_through_time(deepSTRAPP_outputs = deepSTRAPP_outputs, colors_per_levels = colors_per_ranges) # Plot rates vs. trait values across branches for time step n°2 = 10 My plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10, colors_per_levels = colors_per_ranges) # Plot histogram of Mann-Whitney-Wilcoxon test stats for time step n°2 = 10My plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10) # Plot histograms of Mann-Whitney-Wilcoxon test stats for all time-steps (One plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, display_plots = TRUE, plot_posthoc_tests = FALSE) }if (deepSTRAPP::is_dev_version()) { # ----- Example 1: Continuous trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny with old calibration data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract continuous trait data as a named vector Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data, nm = Ponerinae_trait_tip_data$Taxa) # Select a color scheme from lowest to highest values color_scale = c("darkgreen", "limegreen", "orange", "red") # Get Ancestral Character Estimates based on a Brownian Motion model # To obtain values at internal nodes Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data) # (May take several minutes to run) # Run a Stochastic Mapping based on a Brownian Motion model # to interpolate values along branches and obtain a "contMap" object Ponerinae_contMap <- phytools::contMap(Ponerinae_tree, x = Ponerinae_cont_tip_data, res = 100, # Number of time steps plot = FALSE) # Plot contMap = stochastic mapping of continuous trait plot_contMap(contMap = Ponerinae_contMap, color_scale = color_scale) ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) ## Run deepSTRAPP on net diversification rates Ponerinae_deepSTRAPP_cont_old_calib_0_40 <- run_deepSTRAPP_over_time( contMap = Ponerinae_contMap, ace = Ponerinae_ACE, tip_data = Ponerinae_cont_tip_data, trait_data_type = "continuous", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly trait data output data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(Ponerinae_deepSTRAPP_cont_old_calib_0_40, max.level = 1) # Display test summary # Can be passed down to [deepSTRAPP::plot_STRAPP_pvalues_over_time()] to generate a plot # showing the evolution of the test results across time. Ponerinae_deepSTRAPP_cont_old_calib_0_40$pvalues_summary_df # Access trait data in a melted data.frame head(Ponerinae_deepSTRAPP_cont_old_calib_0_40$trait_data_df_over_time) # Access the diversification data in a melted data.frame head(Ponerinae_deepSTRAPP_cont_old_calib_0_40$diversification_data_df_over_time) # Access deepSTRAPP results str(Ponerinae_deepSTRAPP_cont_old_calib_0_40$STRAPP_results, max.level = 2) ## Visualize updated phylogenies # (May take time to plot) # Plot updated contMap for time step n°2 deepSTRAPP_outputs <- Ponerinae_deepSTRAPP_cont_old_calib_0_40 contMap_step2 <- deepSTRAPP_outputs$updated_trait_data_with_Map_over_time[[2]] plot_contMap(contMap_step2$contMap, ftype = "off") # Plot diversification rates on updated phylogeny for time step n°2 BAMM_object_step2 <- deepSTRAPP_outputs$updated_BAMM_objects_over_time[[2]] plot_BAMM_rates(BAMM_object = BAMM_object_step2, legend = TRUE, labels = FALSE, colorbreaks = BAMM_object_step2$initial_colorbreaks$net_diversification) ## Visualize test results # (May take time to plot) # Plot p-values of Spearman tests across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, alpha = 0.10) # Plot evolution of mean rates through time plot_rates_through_time(deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40) # Plot rates vs. trait values across branches for time step = 10 My plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = 10) # Plot histogram of Spearman test stats for time step = 10 My plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, focal_time = 10) # Plot histograms of Spearman test results (One plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40, display_plots = TRUE) # ----- Example 2: Categorical trait ----- # ## Load data # Load trait df data(Ponerinae_trait_tip_data, package = "deepSTRAPP") # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare trait data # Extract categorical data with 3-levels Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data, nm = Ponerinae_trait_tip_data$Taxa) table(Ponerinae_cat_3lvl_tip_data) # Select color scheme for states colors_per_states <- c("forestgreen", "sienna", "goldenrod") names(colors_per_states) <- c("arboreal", "subterranean", "terricolous") # (May take several minutes to run) ## Produce densityMaps using stochastic character mapping based on an ARD Mk model Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data( tip_data = Ponerinae_cat_3lvl_tip_data, phylo = Ponerinae_tree_old_calib, trait_data_type = "categorical", colors_per_levels = colors_per_states, evolutionary_models = "ARD", nb_simulations = 100, return_best_model_fit = TRUE, return_model_selection_df = TRUE, plot_map = FALSE) # Load directly trait data output data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows for 0 to 40 Mya. # nb_time_steps <- 5 time_step_duration <- 5 time_range <- c(0, 40) # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates across time-steps. Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps, ace = Ponerinae_cat_3lvl_data_old_calib$ace, tip_data = Ponerinae_cat_3lvl_tip_data, trait_data_type = "categorical", BAMM_object = Ponerinae_BAMM_object_old_calib, # nb_time_steps = nb_time_steps, time_range = time_range, time_step_duration = time_step_duration, rate_type = "net_diversification", seed = 1234, alpha = 0.10, # Select a generous level of significance for the sake of the example posthoc_pairwise_tests = TRUE, return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly deepSTRAPP output data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP") deepSTRAPP_outputs <- Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40 ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(deepSTRAPP_outputs, max.level = 1) # Display test summaries # Can be passed down to [deepSTRAPP::plot_STRAPP_pvalues_over_time()] to generate a plot # showing the evolution of the test results across time. deepSTRAPP_outputs$pvalues_summary_df # Results for posthoc pairwise Dunn's tests over time-steps deepSTRAPP_outputs$pvalues_summary_df_for_posthoc_pairwise_tests # Access trait data in a melted data.frame head(deepSTRAPP_outputs$trait_data_df_over_time) # Access the diversification data in a melted data.frame head(deepSTRAPP_outputs$diversification_data_df_over_time) # Access details of deepSTRAPP results str(deepSTRAPP_outputs$STRAPP_results, max.level = 2) ## Visualize updated phylogenies # (May take time to plot) # Plot updated densityMaps for time step n°2 = 10My densityMaps_10My <- deepSTRAPP_outputs$updated_trait_data_with_Map_over_time[[2]] plot_densityMaps_overlay(densityMaps_10My$densityMaps) # Plot diversification rates on updated phylogeny for time step n°2 BAMM_object_10My <- deepSTRAPP_outputs$updated_BAMM_objects_over_time[[2]] plot_BAMM_rates(BAMM_object = BAMM_object_10My, legend = TRUE, labels = FALSE, colorbreaks = BAMM_object_10My$initial_colorbreaks$net_diversification) ## Visualize test results # (May take time to plot) # Plot p-values of overall Kruskal-Wallis test across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, alpha = 0.10) # Plot p-values of post hoc pairwise Dunn's tests between pairs of tests across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, alpha = 0.10, plot_posthoc_tests = TRUE) # Plot evolution of mean rates through time plot_rates_through_time(deepSTRAPP_outputs = deepSTRAPP_outputs, colors_per_levels = colors_per_states) # Plot rates vs. trait values across branches for time step n°2 = 10 My plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10, colors_per_levels = colors_per_states) # Plot histogram of overall Kruskal-Wallis test for time step n°2 = 10 My plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10) # Plot histograms of overall Kruskal-Wallis test results across all time-steps # (One plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, plot_posthoc_tests = FALSE) # Plot histograms of post hoc pairwise Dunn's test results across all time-steps # (One multifaceted plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, plot_posthoc_tests = TRUE) # ----- Example 3: Biogeographic ranges ----- # ## Load data # Load phylogeny data(Ponerinae_tree_old_calib, package = "deepSTRAPP") # Load trait df data(Ponerinae_binary_range_table, package = "deepSTRAPP") # Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Prepare range data for Old World vs. New World # No overlap in ranges table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World) Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World, nm = Ponerinae_binary_range_table$Taxa) Ponerinae_NO_data <- as.character(Ponerinae_NO_data) Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa table(Ponerinae_NO_data) colors_per_ranges <- c("mediumpurple2", "peachpuff2") names(colors_per_ranges) <- c("N", "O") # (May take several minutes to run) ## Run evolutionary models Ponerinae_biogeo_data <- prepare_trait_data( tip_data = Ponerinae_NO_data, trait_data_type = "biogeographic", phylo = Ponerinae_tree_old_calib, evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/" keep_BioGeoBEARS_files = FALSE, prefix_for_files = "Ponerinae_old_calib", max_range_size = 2, split_multi_area_ranges = TRUE, # Set to TRUE to use only unique areas "A" and "B" nb_simulations = 100, # Reduce to save time (Default = '1000') colors_per_levels = colors_per_ranges, return_model_selection_df = TRUE, verbose = TRUE) # Load directly output data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP") ## Set for time steps of 5 My. Will generate deepSTRAPP workflows from 0 to 40 Mya. time_range <- c(0, 40) time_step_duration <- 10 # (May take several minutes to run) ## Run deepSTRAPP on net diversification rates for time-steps = 0 to 40 Mya. Ponerinae_deepSTRAPP_biogeo_old_calib_0_40 <- run_deepSTRAPP_over_time( densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps, ace = Ponerinae_biogeo_data_old_calib$ace, tip_data = Ponerinae_ON_tip_data, trait_data_type = "biogeographic", BAMM_object = Ponerinae_BAMM_object_old_calib, time_range = time_range, time_step_duration = time_step_duration, seed = 1234, # Set seed for reproducibility alpha = 0.10, # Select a generous level of significance for the sake of the example rate_type = "net_diversification", return_perm_data = TRUE, extract_trait_data_melted_df = TRUE, extract_diversification_data_melted_df = TRUE, return_STRAPP_results = TRUE, return_updated_trait_data_with_Map = TRUE, return_updated_BAMM_object = TRUE, verbose = TRUE, verbose_extended = TRUE) ## Load directly output data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP") deepSTRAPP_outputs <- Ponerinae_deepSTRAPP_biogeo_old_calib_0_40 ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Explore output str(deepSTRAPP_outputs, max.level = 1) # Display test summaries # Can be passed down to [deepSTRAPP::plot_STRAPP_pvalues_over_time()] to generate a plot # showing the evolution of the test results across time. deepSTRAPP_outputs$pvalues_summary_df # Access bioregeographic range data in a melted data.frame head(deepSTRAPP_outputs$trait_data_df_over_time) # Access the diversification data in a melted data.frame head(deepSTRAPP_outputs$diversification_data_df_over_time) # Access details of deepSTRAPP results str(deepSTRAPP_outputs$STRAPP_results, max.level = 2) ## Visualize updated phylogenies # (May take time to plot) # Plot updated densityMaps for time step n°2 = 10My densityMaps_10My <- deepSTRAPP_outputs$updated_trait_data_with_Map_over_time[[2]] plot_densityMaps_overlay(densityMaps_10My$densityMaps) # Plot diversification rates on updated phylogeny for time step n°2 BAMM_object_10My <- deepSTRAPP_outputs$updated_BAMM_objects_over_time[[2]] plot_BAMM_rates(BAMM_object = BAMM_object_10My, legend = TRUE, labels = FALSE, colorbreaks = BAMM_object_10My$initial_colorbreaks$net_diversification) ## Visualize test results # (May take time to plot) # Plot p-values of Mann-Whitney-Wilcoxon tests across all time-steps plot_STRAPP_pvalues_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, alpha = 0.05) # Plot evolution of mean rates through time plot_rates_through_time(deepSTRAPP_outputs = deepSTRAPP_outputs, colors_per_levels = colors_per_ranges) # Plot rates vs. trait values across branches for time step n°2 = 10 My plot_rates_vs_trait_data_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10, colors_per_levels = colors_per_ranges) # Plot histogram of Mann-Whitney-Wilcoxon test stats for time step n°2 = 10My plot_histogram_STRAPP_test_for_focal_time( deepSTRAPP_outputs = deepSTRAPP_outputs, focal_time = 10) # Plot histograms of Mann-Whitney-Wilcoxon test stats for all time-steps (One plot per time-step) plot_histograms_STRAPP_tests_over_time( deepSTRAPP_outputs = deepSTRAPP_outputs, display_plots = TRUE, plot_posthoc_tests = FALSE) }
Compare models fit with BioGeoBEARS::bear_optim_run() using AICc and Akaike's weights.
Generate a data.frame summarizing information. Identify the best model and extract its results.
select_best_model_from_BioGeoBEARS(list_model_fits)select_best_model_from_BioGeoBEARS(list_model_fits)
list_model_fits |
Named list with the results of a model fit with |
The function returns a list with three elements.
$model_comparison_df Data.frame summarizing information to compare model fits. It includes the model name ($model),
the log-likelihood ($logLik), the number of free-parameters ($k), the AIC ($AIC), the corrected AIC ($AICc),
the delta to the best/lowest AICc ($delta_AICc), the Akaike weights ($Akaike_weights), and their rank based on AICc ($rank).
$best_model_name Character string. Name of the best model.
$best_model_fit List containing the output of BioGeoBEARS::bear_optim_run() for the model with the best fit.
Maël Doré
BioGeoBEARS::bear_optim_run() BioGeoBEARS::get_LnL_from_BioGeoBEARS_results_object() BioGeoBEARS::AICstats_2models()
if (deepSTRAPP::is_dev_version()) { ## The R package 'BioGeoBEARS' is needed for this function to work with biogeographic data. # Please install it manually from: https://github.com/nmatzke/BioGeoBEARS. # Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Transform feeding mode data into biogeographic data with ranges A, B, and AB. eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[eel_data == "bite"] <- "A" eel_data[eel_data == "suction"] <- "B" eel_data[c(5, 6, 7, 15, 25, 32, 33, 34, 50, 52, 57, 58, 59)] <- "AB" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) colors_per_levels <- c("dodgerblue3", "gold") names(colors_per_levels) <- c("A", "B") # (May take several minutes to run) ## Prepare phylo # Set path to BioGeoBEARS directory BioGeoBEARS_directory_path = "./BioGeoBEARS_directory/" # Export phylo path_to_phylo <- BioGeoBEARS::np(paste0(BioGeoBEARS_directory_path, "phylo.tree")) write.tree(phy = eel.tree, file = path_to_phylo) ## Prepare range data tip_data <- eel_data # Convert tip_data to df ranges_df <- as.data.frame(tip_data) # Get list of all unique areas unique_areas_in_ranges_list <- strsplit(x = tip_data, split = "") unique_areas <- unique(unlist(unique_areas_in_ranges_list)) unique_areas <- unique_areas[order(unique_areas)] # Loop per unique areas for (i in seq_along(unique_areas)) { # i <- 1 # Extract unique area unique_area_i <- unique_areas[i] # Detect presence in ranges binary_match_i <- unlist(lapply(X = unique_areas_in_ranges_list, FUN = function (x) { unique_area_i %in% x } )) # Add to ranges_df ranges_df <- cbind(ranges_df, binary_match_i) } # Extract binary df of presence/absence binary_df <- ranges_df[, -1] # Convert character strings into numerical factors binary_df_num <- as.data.frame(apply(X = binary_df, MARGIN = 2, FUN = as.numeric)) row.names(binary_df_num) <- names(tip_data) names(binary_df_num) <- unique_areas # Produce tipranges object from numeric df Taxa_bioregions_tipranges_obj <- BioGeoBEARS::define_tipranges_object(tmpdf = binary_df_num) # Set path to tip ranges object path_to_tip_ranges <- BioGeoBEARS::np(paste0(BioGeoBEARS_directory_path,"tip_ranges.data")) # Export tip ranges in Lagrange/PHYLIP format BioGeoBEARS::save_tipranges_to_LagrangePHYLIP( tipranges_object = Taxa_bioregions_tipranges_obj, lgdata_fn = path_to_tip_ranges, areanames = colnames(Taxa_bioregions_tipranges_obj@df)) ## Prepare models # Prepare DEC model run DEC_run <- BioGeoBEARS::define_BioGeoBEARS_run( num_cores_to_use = 1, max_range_size = 2, # To set the maximum number of bioregion encompassed # by a lineage range at any time trfn = path_to_phylo, # To provide path to the input tree file geogfn = path_to_tip_ranges, # To provide path to the LagrangePHYLIP file with binary ranges return_condlikes_table = TRUE, # To ask to obtain all marginal likelihoods # computed by the model and used to display ancestral states print_optim = TRUE) # Check that starting parameter values are inside the min/max DEC_run <- BioGeoBEARS::fix_BioGeoBEARS_params_minmax(BioGeoBEARS_run_object = DEC_run) # Check validity of set-up before run BioGeoBEARS::check_BioGeoBEARS_run(DEC_run) # Use the DEC model as template for DEC+J model DEC_J_run <- DEC_run # Update status of jump speciation parameter to be estimated DEC_J_run$BioGeoBEARS_model_object@params_table["j","type"] <- "free" # Set initial value of J for optimization to an arbitrary low non-null value j_start <- 0.0001 DEC_J_run$BioGeoBEARS_model_object@params_table["j","init"] <- j_start DEC_J_run$BioGeoBEARS_model_object@params_table["j","est"] <- j_start # Check validity of set-up before run DEC_J_run <- BioGeoBEARS::fix_BioGeoBEARS_params_minmax(BioGeoBEARS_run_object = DEC_J_run) invisible(BioGeoBEARS::check_BioGeoBEARS_run(DEC_J_run)) ## Run models # Run DEC model DEC_fit <- BioGeoBEARS::bears_optim_run(DEC_run) # Set starting values for optimization of DEC+J model based on MLE in the DEC model d_start <- DEC_fit$outputs@params_table["d","est"] e_start <- DEC_fit$outputs@params_table["e","est"] DEC_J_run$BioGeoBEARS_model_object@params_table["d","init"] <- d_start DEC_J_run$BioGeoBEARS_model_object@params_table["d","est"] <- d_start DEC_J_run$BioGeoBEARS_model_object@params_table["e","init"] <- e_start DEC_J_run$BioGeoBEARS_model_object@params_table["e","est"] <- e_start # Run DEC+J model DEC_J_fit <- BioGeoBEARS::bears_optim_run(DEC_J_run) ## Store model outputs list_model_fits <- list(DEC = DEC_fit, DEC_J = DEC_J_fit) ## Compare models model_comparison_output <- select_best_model_from_BioGeoBEARS(list_model_fits = list_model_fits) # Explore output str(model_comparison_output, max.level = 1) # Print comparison print(model_comparison_output$models_comparison_df) # Print best model fit print(model_comparison_output$best_model_fit$outputs) ## Clean local files unlink(BioGeoBEARS_directory_path, recursive = TRUE, force = TRUE) }if (deepSTRAPP::is_dev_version()) { ## The R package 'BioGeoBEARS' is needed for this function to work with biogeographic data. # Please install it manually from: https://github.com/nmatzke/BioGeoBEARS. # Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Transform feeding mode data into biogeographic data with ranges A, B, and AB. eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) eel_data <- as.character(eel_data) eel_data[eel_data == "bite"] <- "A" eel_data[eel_data == "suction"] <- "B" eel_data[c(5, 6, 7, 15, 25, 32, 33, 34, 50, 52, 57, 58, 59)] <- "AB" eel_data <- stats::setNames(eel_data, rownames(eel.data)) table(eel_data) colors_per_levels <- c("dodgerblue3", "gold") names(colors_per_levels) <- c("A", "B") # (May take several minutes to run) ## Prepare phylo # Set path to BioGeoBEARS directory BioGeoBEARS_directory_path = "./BioGeoBEARS_directory/" # Export phylo path_to_phylo <- BioGeoBEARS::np(paste0(BioGeoBEARS_directory_path, "phylo.tree")) write.tree(phy = eel.tree, file = path_to_phylo) ## Prepare range data tip_data <- eel_data # Convert tip_data to df ranges_df <- as.data.frame(tip_data) # Get list of all unique areas unique_areas_in_ranges_list <- strsplit(x = tip_data, split = "") unique_areas <- unique(unlist(unique_areas_in_ranges_list)) unique_areas <- unique_areas[order(unique_areas)] # Loop per unique areas for (i in seq_along(unique_areas)) { # i <- 1 # Extract unique area unique_area_i <- unique_areas[i] # Detect presence in ranges binary_match_i <- unlist(lapply(X = unique_areas_in_ranges_list, FUN = function (x) { unique_area_i %in% x } )) # Add to ranges_df ranges_df <- cbind(ranges_df, binary_match_i) } # Extract binary df of presence/absence binary_df <- ranges_df[, -1] # Convert character strings into numerical factors binary_df_num <- as.data.frame(apply(X = binary_df, MARGIN = 2, FUN = as.numeric)) row.names(binary_df_num) <- names(tip_data) names(binary_df_num) <- unique_areas # Produce tipranges object from numeric df Taxa_bioregions_tipranges_obj <- BioGeoBEARS::define_tipranges_object(tmpdf = binary_df_num) # Set path to tip ranges object path_to_tip_ranges <- BioGeoBEARS::np(paste0(BioGeoBEARS_directory_path,"tip_ranges.data")) # Export tip ranges in Lagrange/PHYLIP format BioGeoBEARS::save_tipranges_to_LagrangePHYLIP( tipranges_object = Taxa_bioregions_tipranges_obj, lgdata_fn = path_to_tip_ranges, areanames = colnames(Taxa_bioregions_tipranges_obj@df)) ## Prepare models # Prepare DEC model run DEC_run <- BioGeoBEARS::define_BioGeoBEARS_run( num_cores_to_use = 1, max_range_size = 2, # To set the maximum number of bioregion encompassed # by a lineage range at any time trfn = path_to_phylo, # To provide path to the input tree file geogfn = path_to_tip_ranges, # To provide path to the LagrangePHYLIP file with binary ranges return_condlikes_table = TRUE, # To ask to obtain all marginal likelihoods # computed by the model and used to display ancestral states print_optim = TRUE) # Check that starting parameter values are inside the min/max DEC_run <- BioGeoBEARS::fix_BioGeoBEARS_params_minmax(BioGeoBEARS_run_object = DEC_run) # Check validity of set-up before run BioGeoBEARS::check_BioGeoBEARS_run(DEC_run) # Use the DEC model as template for DEC+J model DEC_J_run <- DEC_run # Update status of jump speciation parameter to be estimated DEC_J_run$BioGeoBEARS_model_object@params_table["j","type"] <- "free" # Set initial value of J for optimization to an arbitrary low non-null value j_start <- 0.0001 DEC_J_run$BioGeoBEARS_model_object@params_table["j","init"] <- j_start DEC_J_run$BioGeoBEARS_model_object@params_table["j","est"] <- j_start # Check validity of set-up before run DEC_J_run <- BioGeoBEARS::fix_BioGeoBEARS_params_minmax(BioGeoBEARS_run_object = DEC_J_run) invisible(BioGeoBEARS::check_BioGeoBEARS_run(DEC_J_run)) ## Run models # Run DEC model DEC_fit <- BioGeoBEARS::bears_optim_run(DEC_run) # Set starting values for optimization of DEC+J model based on MLE in the DEC model d_start <- DEC_fit$outputs@params_table["d","est"] e_start <- DEC_fit$outputs@params_table["e","est"] DEC_J_run$BioGeoBEARS_model_object@params_table["d","init"] <- d_start DEC_J_run$BioGeoBEARS_model_object@params_table["d","est"] <- d_start DEC_J_run$BioGeoBEARS_model_object@params_table["e","init"] <- e_start DEC_J_run$BioGeoBEARS_model_object@params_table["e","est"] <- e_start # Run DEC+J model DEC_J_fit <- BioGeoBEARS::bears_optim_run(DEC_J_run) ## Store model outputs list_model_fits <- list(DEC = DEC_fit, DEC_J = DEC_J_fit) ## Compare models model_comparison_output <- select_best_model_from_BioGeoBEARS(list_model_fits = list_model_fits) # Explore output str(model_comparison_output, max.level = 1) # Print comparison print(model_comparison_output$models_comparison_df) # Print best model fit print(model_comparison_output$best_model_fit$outputs) ## Clean local files unlink(BioGeoBEARS_directory_path, recursive = TRUE, force = TRUE) }
Compare evolutionary models fit with geiger::fitContinuous() or geiger::fitDiscrete()
using AICc and Akaike's weights. Generate a data.frame summarizing information.
Identify the best model and extract its results.
select_best_trait_model_from_geiger(list_model_fits)select_best_trait_model_from_geiger(list_model_fits)
list_model_fits |
Named list with the results of a model fit with |
The function returns a list with three elements.
$model_comparison_df Data.frame summarizing information to compare model fits. It includes the model name ($model),
the log-likelihood ($logLik), the number of free-parameters ($k), the AIC ($AICc), the corrected AIC ($AICc),
the delta to the best/lowest AICc ($delta_AICc), the Akaike weights ($Akaike_weights), and their rank based on AICc ($rank).
$best_model_name Character string. Name of the best model.
$best_model_fit List containing the output of geiger::fitContinuous() or geiger::fitDiscrete() for the model with the best fit.
Maël Doré
geiger::fitContinuous() geiger::fitDiscrete()
# ----- Example 1: Continuous data ----- # # Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Extract body size eel_data <- stats::setNames(eel.data$Max_TL_cm, rownames(eel.data)) # (May take several minutes to run) # Fit BM model BM_fit <- geiger::fitContinuous(phy = eel.tree, dat = eel_data, model = "BM") # Fit EB model EB_fit <- geiger::fitContinuous(phy = eel.tree, dat = eel_data, model = "EB") # Fit OU model OU_fit <- geiger::fitContinuous(phy = eel.tree, dat = eel_data, model = "OU") # Store models list_model_fits <- list(BM = BM_fit, EB = EB_fit, OU = OU_fit) # Compare models model_comparison_output <- select_best_trait_model_from_geiger(list_model_fits = list_model_fits) # Explore output str(model_comparison_output, max.level = 2) # Print comparison print(model_comparison_output$models_comparison_df) # Print best model fit print(model_comparison_output$best_model_fit) # ----- Example 2: Categorical data ----- # # Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Extract feeding mode data eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) table(eel_data) # (May take several minutes to run) # Fit ER model ER_fit <- geiger::fitDiscrete(phy = eel.tree, dat = eel_data, model = "ER") # Fit ARD model ARD_fit <- geiger::fitDiscrete(phy = eel.tree, dat = eel_data, model = "ARD") # Store models list_model_fits <- list(ER = ER_fit, ARD = ARD_fit) # Compare models model_comparison_output <- select_best_trait_model_from_geiger(list_model_fits = list_model_fits) # Explore output str(model_comparison_output, max.level = 2) # Print comparison print(model_comparison_output$models_comparison_df) # Print best model fit print(model_comparison_output$best_model_fit)# ----- Example 1: Continuous data ----- # # Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Extract body size eel_data <- stats::setNames(eel.data$Max_TL_cm, rownames(eel.data)) # (May take several minutes to run) # Fit BM model BM_fit <- geiger::fitContinuous(phy = eel.tree, dat = eel_data, model = "BM") # Fit EB model EB_fit <- geiger::fitContinuous(phy = eel.tree, dat = eel_data, model = "EB") # Fit OU model OU_fit <- geiger::fitContinuous(phy = eel.tree, dat = eel_data, model = "OU") # Store models list_model_fits <- list(BM = BM_fit, EB = EB_fit, OU = OU_fit) # Compare models model_comparison_output <- select_best_trait_model_from_geiger(list_model_fits = list_model_fits) # Explore output str(model_comparison_output, max.level = 2) # Print comparison print(model_comparison_output$models_comparison_df) # Print best model fit print(model_comparison_output$best_model_fit) # ----- Example 2: Categorical data ----- # # Load phylogeny and tip data library(phytools) data(eel.tree) data(eel.data) # Extract feeding mode data eel_data <- stats::setNames(eel.data$feed_mode, rownames(eel.data)) table(eel_data) # (May take several minutes to run) # Fit ER model ER_fit <- geiger::fitDiscrete(phy = eel.tree, dat = eel_data, model = "ER") # Fit ARD model ARD_fit <- geiger::fitDiscrete(phy = eel.tree, dat = eel_data, model = "ARD") # Store models list_model_fits <- list(ER = ER_fit, ARD = ARD_fit) # Compare models model_comparison_output <- select_best_trait_model_from_geiger(list_model_fits = list_model_fits) # Explore output str(model_comparison_output, max.level = 2) # Print comparison print(model_comparison_output$models_comparison_df) # Print best model fit print(model_comparison_output$best_model_fit)
Updates an object of class "bammdata" to obtain the diversification rates/regimes
found along branches at a specific time in the past (i.e. the focal_time).
Optionally, the function can update the object to display a mapped phylogeny such as
branches overlapping the focal_time are shorten to the focal_time.
update_rates_and_regimes_for_focal_time( BAMM_object, focal_time, update_rates = TRUE, update_regimes = TRUE, update_tree = FALSE, update_plot = FALSE, update_all_elements = FALSE, keep_tip_labels = TRUE, verbose = TRUE )update_rates_and_regimes_for_focal_time( BAMM_object, focal_time, update_rates = TRUE, update_regimes = TRUE, update_tree = FALSE, update_plot = FALSE, update_all_elements = FALSE, keep_tip_labels = TRUE, verbose = TRUE )
BAMM_object |
Object of class |
focal_time |
Numerical. The time, in terms of time distance from the present, at which the tree and rate mapping must be cut. It must be smaller than the root age of the phylogeny. |
update_rates |
Logical. Specify whether diversification rates stored in
|
update_regimes |
Logical. Specify whether diversification regimes stored in
|
update_tree |
Logical. Specify whether to update the phylogeny such as
all branches that are younger than the |
update_plot |
Logical. Specify whether to update the phylogeny AND the elements
used by |
update_all_elements |
Logical. Specify whether to update all the elements in the object, including
rates/regimes/phylogeny/elements for |
keep_tip_labels |
Logical. Should terminal branches with a single descendant tip
retain their initial |
verbose |
Logical. Should progression be displayed? A message will be printed for every batch of
100 BAMM posterior samples updated. Default is |
The object of class "bammdata" (BAMM_object) is cut-off at a specific time in the past
(i.e. the focal_time) and the current diversification rate values of the overlapping edges/branches are extracted.
—– Update diversification rate data —–
If update_rates = TRUE, diversification rates of branches overlapping with focal_time
will be updated. Each cut-off branches form a new tip present at focal_time with updated associated
diversification rate data. Fossils older than focal_time do not yield any data.
$tipLambda contains speciation rates.
$tipMu contains extinction rates.
If update_regimes = TRUE, diversification regimes of branches overlapping with focal_time
will be updated. Each cut-off branches form a new tip present at focal_time with updated associated
diversification regime ID found in $tipStates. Fossils older than focal_time do not yield any data.
—– Update the phylogeny —–
If update_tree = TRUE, elements defining the phylogeny, as in an "phylo" object
will be updated such as all branches that are younger than the focal_time are cut-off:
$edge defines the tree topology.
$Nnode defines the number of internal nodes.
$tip.label provides the labels of all tips, including fossils older than focal_time if present.
$edge.length provides length of all branches.
$node.label provides the labels of all internal nodes. (Optional)
—– Update the plot from plot_BAMM_rates() —–
If update_plot = TRUE, elements used to plot diversification rates on the phylogeny
using plot_BAMM_rates() will be updated such as all branches that are younger
than the focal_time are cut-off:
$begin provides absolute time since root of edge/branch start (rootward).
$end provides absolute time since root of edge/branch end (tipward).
$eventVectors provides regime membership per branches in each posterior sample configuration.
$eventBranchSegs provides regime membership per segments of branches in each posterior sample configuration.
$dtrates provides mean speciation and extinction rates along segments of branches, and resolution fraction (tau) describing
the fraction of each segment length compared to the full depth of the initial tree (i.e., the root_age).
The function returns a list as an updated BAMM_object of class "bammdata".
Phylogeny-related elements used to plot a phylogeny with ape::plot.phylo():
$edge Matrix of integers. Defines the tree topology by providing rootward and tipward node ID of each edge.
$Nnode Integer. Number of internal nodes.
$tip.label Vector of character strings. Labels of all tips, including fossils older than focal_time if present.
If keep_tip_labels = TRUE, cut-off branches with a single descendant tip retain their initial tip.label.
If keep_tip_labels = FALSE, all cut-off branches are labeled using their tipward node ID.
$edge.length Vector of numerical. Length of edges/branches.
$node.label Vector of character strings. Labels of all internal nodes. (Present only if present in the initial BAMM_object)
BAMM internal elements used for tree exploration:
$begin Vector of numerical. Absolute time since root of edge/branch start (rootward).
$end Vector of numerical. Absolute time since root of edge/branch end (tipward).
$downseq Vector of integers. Order of node visits when using a pre-order tree traversal.
$lastvisit ID of the last node visited when starting from the node in the corresponding position in $downseq.
BAMM elements summarizing diversification data:
$numberEvents Vector of integer. Number of events/macroevolutionary regimes (k+1) recorded in each posterior configuration. k = number of shifts.
$eventData List of data.frames. One per posterior sample. Records shift events and macroevolutionary regimes parameters. 1st line = Background root regime.
$eventVectors List of integer vectors. One per posterior sample. Record regime ID per branches.
$tipStates List of named integer vectors. One per posterior sample. Record regime ID per tips present at focal_time. Updated if update_regimes = TRUE.
$tipLambda List of named numerical vectors. One per posterior sample. Record speciation rates per tips present at focal_time. Updated if update_rates = TRUE.
$tipMu List of named numerical vectors. One per posterior sample. Record extinction rates per tips present at focal_time. Updated if update_rates = TRUE.
$eventBranchSegs List of matrix of numerical. One per posterior sample. Record regime ID per segments of branches.
$meanTipLambda Vector of named numerical. Mean tip speciation rates across all posterior configurations of tips present at focal_time (does not includes older fossils).
$meanTipMu Vector of named numerical. Mean tip extinction rates across all posterior configurations of tips present at focal_time (does not includes older fossils).
$type Character string. Set the type of data modeled with BAMM. Should be "diversification".
Additional elements providing key information for downstream analyses:
$expectedNumberOfShifts Integer. The expected number of regime shifts used to set the prior in BAMM.
$MSP_tree Object of class phylo. List of 4 elements duplicating information from the Phylogeny-related elements above,
except $MSP_tree$edge.length is recording the Marginal Shift Probability of each branch (i.e., the probability of a regime shift to occur along each branch)
whose origin is older that focal_time.
$MAP_indices Vector of integers. The indices of the Maximum A Posteriori probability (MAP) configurations among the posterior samples.
$MAP_BAMM_object. List of 18 elements of class "bammdata" recording the mean rates and regime shift locations found across the Maximum A Posteriori probability (MAP) configuration. All BAMM elements summarizing diversification data holds a single entry describing this the mean diversification history, updated for the focal_time'.
$MSC_indices Vector of integers. The indices of the Maximum Shift Credibility (MSC) configurations among the posterior samples.
$MSC_BAMM_object List of 18 elements of class "bammdata" recording the mean rates and regime shift locations found across the Maximum Shift Credibility (MSC) configurations. All BAMM elements summarizing diversification data holds a single entry describing this mean diversification history, updated for the focal_time'.
New elements added to provide update information:
$root_age Integer. Stores the age of the root of the tree.
$nodes_ID_df Data.frame with two columns. Provides the conversion from the new_node_ID to the initial_node_ID. Each row is a node.
$initial_nodes_ID Vector of character strings. Provides the initial ID of internal nodes. Used to plot internal node IDs as labels with ape::nodelabels().
$edges_ID_df Data.frame with two columns. Provides the conversion from the new_edge_ID to the initial_edge_ID. Each row is an edge/branch.
$initial_edges_ID Vector of character strings. Provides the initial ID of edges/branches. Used to plot edge/branch IDs as labels with ape::edgelabels().
$dtrates List of three elements.
1/ $dtrates$tau Numerical. Resolution factor describing the fraction of each segment length used in plot_BAMM_rates()
compared to the full depth of the initial tree (i.e., the root_age)
2/ $dtrates$rates List of two numerical vectors. Speciation and extinction rates along segments used by plot_BAMM_rates().
3/ $dtrates$tmat Matrix of numerical. Start and end times of segments in term of distance to the root.
$initial_colorbreaks List of three vectors of numerical. Rate values of the percentiles delimiting the bins for mapping rates to colors with BAMMtools::plot.bammdata().
Each element provides values for different type of rates ($speciation, $extinction, $net_diversification).
$focal_time Integer. The time, in terms of time distance from the present, at which the rates/regimes were extracted and the tree was eventually cut.
Maël Doré
cut_phylo_for_focal_time() plot_BAMM_rates()
# ----- Example 1: Extant whales (87 taxa) ----- # ## Load the BAMM_object summarizing 1000 posterior samples of BAMM data(whale_BAMM_object, package = "deepSTRAPP") ## Set focal-time to 5 My focal_time = 5 # (May take several minutes to run) ## Update the BAMM object whale_BAMM_object_5My <- update_rates_and_regimes_for_focal_time( BAMM_object = whale_BAMM_object, focal_time = 5, update_rates = TRUE, update_regimes = TRUE, update_tree = TRUE, update_plot = TRUE, update_all_elements = TRUE, keep_tip_labels = TRUE, verbose = TRUE) # Add "phylo" class to be compatible with phytools::nodeHeights() class(whale_BAMM_object) <- unique(c(class(whale_BAMM_object), "phylo")) root_age <- max(phytools::nodeHeights(whale_BAMM_object)[,2]) # Remove temporary "phylo" class class(whale_BAMM_object) <- setdiff(class(whale_BAMM_object), "phylo") ## Plot initial BAMM_object for t = 0 My plot_BAMM_rates(whale_BAMM_object, add_regime_shifts = TRUE, labels = TRUE, legend = TRUE, cex = 0.5, par.reset = FALSE) # Keep plotting parameters in memory to use abline(). abline(v = root_age - focal_time, col = "red", lty = 2, lwd = 2) ## Plot updated BAMM_object for t = 5 My plot_BAMM_rates(whale_BAMM_object_5My, add_regime_shifts = TRUE, labels = TRUE, legend = TRUE, cex = 0.8) # ----- Example 2: Extant Ponerinae (1,534 taxa) ----- # if (deepSTRAPP::is_dev_version()) { ## Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Set focal-time to 10 My focal_time = 10 # (May take several minutes to run) ## Update the BAMM object Ponerinae_BAMM_object_10My <- update_rates_and_regimes_for_focal_time( BAMM_object = Ponerinae_BAMM_object, focal_time = focal_time, update_rates = TRUE, update_regimes = TRUE, update_tree = TRUE, update_plot = TRUE, update_all_elements = TRUE, keep_tip_labels = TRUE, verbose = TRUE) ## Load results to save time data(Ponerinae_BAMM_object_10My, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Extract root age # Add temporarily the "phylo" class to be compatible with phytools::nodeHeights() class(Ponerinae_BAMM_object) <- unique(c(class(Ponerinae_BAMM_object), "phylo")) root_age <- max(phytools::nodeHeights(Ponerinae_BAMM_object)[,2]) # Remove temporary "phylo" class class(Ponerinae_BAMM_object) <- setdiff(class(Ponerinae_BAMM_object), "phylo") ## Plot diversification rates on the initial tree plot_BAMM_rates(Ponerinae_BAMM_object, legend = TRUE, labels = FALSE) abline(v = root_age - focal_time, col = "red", lty = 2, lwd = 2) ## Plot diversification rates and regime shifts on the updated tree (cut-off for 10 My) # Keep the initial color scheme plot_BAMM_rates(Ponerinae_BAMM_object_10My, legend = TRUE, labels = FALSE, colorbreaks = Ponerinae_BAMM_object_10My$initial_colorbreaks$net_diversification) # Use a new color scheme mapped on the new distribution of rates plot_BAMM_rates(Ponerinae_BAMM_object_10My, legend = TRUE, labels = FALSE) }# ----- Example 1: Extant whales (87 taxa) ----- # ## Load the BAMM_object summarizing 1000 posterior samples of BAMM data(whale_BAMM_object, package = "deepSTRAPP") ## Set focal-time to 5 My focal_time = 5 # (May take several minutes to run) ## Update the BAMM object whale_BAMM_object_5My <- update_rates_and_regimes_for_focal_time( BAMM_object = whale_BAMM_object, focal_time = 5, update_rates = TRUE, update_regimes = TRUE, update_tree = TRUE, update_plot = TRUE, update_all_elements = TRUE, keep_tip_labels = TRUE, verbose = TRUE) # Add "phylo" class to be compatible with phytools::nodeHeights() class(whale_BAMM_object) <- unique(c(class(whale_BAMM_object), "phylo")) root_age <- max(phytools::nodeHeights(whale_BAMM_object)[,2]) # Remove temporary "phylo" class class(whale_BAMM_object) <- setdiff(class(whale_BAMM_object), "phylo") ## Plot initial BAMM_object for t = 0 My plot_BAMM_rates(whale_BAMM_object, add_regime_shifts = TRUE, labels = TRUE, legend = TRUE, cex = 0.5, par.reset = FALSE) # Keep plotting parameters in memory to use abline(). abline(v = root_age - focal_time, col = "red", lty = 2, lwd = 2) ## Plot updated BAMM_object for t = 5 My plot_BAMM_rates(whale_BAMM_object_5My, add_regime_shifts = TRUE, labels = TRUE, legend = TRUE, cex = 0.8) # ----- Example 2: Extant Ponerinae (1,534 taxa) ----- # if (deepSTRAPP::is_dev_version()) { ## Load the BAMM_object summarizing 1000 posterior samples of BAMM data(Ponerinae_BAMM_object, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Set focal-time to 10 My focal_time = 10 # (May take several minutes to run) ## Update the BAMM object Ponerinae_BAMM_object_10My <- update_rates_and_regimes_for_focal_time( BAMM_object = Ponerinae_BAMM_object, focal_time = focal_time, update_rates = TRUE, update_regimes = TRUE, update_tree = TRUE, update_plot = TRUE, update_all_elements = TRUE, keep_tip_labels = TRUE, verbose = TRUE) ## Load results to save time data(Ponerinae_BAMM_object_10My, package = "deepSTRAPP") ## This dataset is only available in development versions installed from GitHub. # It is not available in CRAN versions. # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version. ## Extract root age # Add temporarily the "phylo" class to be compatible with phytools::nodeHeights() class(Ponerinae_BAMM_object) <- unique(c(class(Ponerinae_BAMM_object), "phylo")) root_age <- max(phytools::nodeHeights(Ponerinae_BAMM_object)[,2]) # Remove temporary "phylo" class class(Ponerinae_BAMM_object) <- setdiff(class(Ponerinae_BAMM_object), "phylo") ## Plot diversification rates on the initial tree plot_BAMM_rates(Ponerinae_BAMM_object, legend = TRUE, labels = FALSE) abline(v = root_age - focal_time, col = "red", lty = 2, lwd = 2) ## Plot diversification rates and regime shifts on the updated tree (cut-off for 10 My) # Keep the initial color scheme plot_BAMM_rates(Ponerinae_BAMM_object_10My, legend = TRUE, labels = FALSE, colorbreaks = Ponerinae_BAMM_object_10My$initial_colorbreaks$net_diversification) # Use a new color scheme mapped on the new distribution of rates plot_BAMM_rates(Ponerinae_BAMM_object_10My, legend = TRUE, labels = FALSE) }
An object of class "bammdata" containing information of diversification dynamics
of extant whales (Cetacea order) modeled with BAMM.
Source: Steeman, M. E., M. B. Hebsgaard, R. E. Fordyce, S. Y. W. Ho, D. L. Rabosky, R. Nielsen, C. Rahbek, H. Glenner, M. V. Sorensen, and E. Willerslev (2009) Radiation of extant cetaceans driven by restructuring of the oceans. Systematic Biology, 58, 573-585.
data(whale_BAMM_object)data(whale_BAMM_object)
A list with 24 elements.
An object of class "bammdata" containing information of diversification dynamics
of extant ponerine ants (Ponerinae subfamily) modeled with BAMM.
Phylogeny-related elements used to plot a phylogeny with ape::plot.phylo():
$edge Matrix of integers. Defines the tree topology by providing rootward and tipward node ID of each edge.
$Nnode Integer. Number of internal nodes.
$tip.label Vector of character strings. Labels of all tips.
$edge.length Vector of numerical. Length of edges/branches.
BAMM internal elements used for tree exploration:
$begin Vector of numerical. Absolute time since root of edge/branch start (rootward).
$end Vector of numerical. Absolute time since root of edge/branch end (tipward).
$downseq Vector of integers. Order of node visits when using a pre-order tree traversal.
$lastvisit ID of the last node visited when starting from the node in the corresponding position in $downseq.
BAMM elements summarizing diversification data:
$numberEvents Vector of integer. Number of events/macroevolutionary regimes (k+1) recorded in each posterior configuration. k = number of shifts.
$eventData List of data.frames. One per posterior sample. Records shift events and macroevolutionary regimes parameters. 1st line = Background root regime.
$eventVectors List of integer vectors. One per posterior sample. Record regime ID per branches.
$tipStates List of integer vectors. One per posterior sample. Record regime ID per tips.
$tipLambda List of numerical vectors. One per posterior sample. Record speciation rates per tips.
$tipMu List of numerical vectors. One per posterior sample. Record extinction rates per tips.
$eventBranchSegs List of matrix of numerical. One per posterior sample. Record regime ID per segments of branches.
$meanTipLambda Vector of numerical. Mean tip speciation rates across all posterior configurations of tips.
$meanTipMu Vector of numerical. Mean tip extinction rates across all posterior configurations of tips.
$type Character string. Set the type of data modeled with BAMM. Here, type = "diversification".
Additional elements providing key information for downstream analyses:
$expectedNumberOfShifts Integer. The expected number of regime shifts used to set the prior in BAMM.
$MSP_tree Object of class phylo. List of 4 elements duplicating information from the Phylogeny-related elements above,
except $MSP_tree$edge.length is recording the Marginal Shift Probability of each branch (i.e., the probability of a regime shift to occur along each branch)
$MAP_indices Vector of integers. The indices of the Maximum A Posteriori probability (MAP) configurations among the posterior samples.
$MAP_BAMM_object. List of 18 elements of class '"bammdata" recording the mean rates and regime shift locations found across
the Maximum A Posteriori probability (MAP) configurations. All BAMM elements summarizing diversification data holds a single entry describing
this mean diversification history.
$MSC_indices Vector of integers. The indices of the Maximum Shift Credibility (MSC) configurations among the posterior samples.
$MSC_BAMM_object List of 18 elements of class '"bammdata" recording the mean rates and regime shift locations found across
the Maximum Shift Credibility (MSC) configurations. All BAMM elements summarizing diversification data holds a single entry describing
this mean diversification history.
Steeman, M. E., M. B. Hebsgaard, R. E. Fordyce, S. Y. W. Ho, D. L. Rabosky, R. Nielsen, C. Rahbek, H. Glenner, M. V. Sorensen, and E. Willerslev (2009) Radiation of extant cetaceans driven by restructuring of the oceans. Systematic Biology, 58, 573-585.
BAMM software website: http://bamm-project.org/