Next-generation phylogenomics

Biology Direct

Table 1 Comparison of key features between phylogenomic approaches based on multiple sequence alignment and alignment-free approaches

Approach based on multiple sequence alignment	Approach based on alignment-free methods
Assumes contiguity (with gaps) of homologous regions	Does not assume contiguity of homologous regions
Based on all possible pairwise comparisons of whole sequences; computationally expensive	Based on occurrences of sub-sequences; computationally inexpensive, can be memory-intensive
Well-established and well-studied approach in phylogenomics	Application in phylogenomics limited; requires further testing for robustness and scalability
More dependent on substitution/evolutionary models	Less dependent on substitution/evolutionary models
More sensitive to stochastic sequence variation, recombination, lateral genetic transfer, rate heterogeneity and sequences of varied lengths, especially when similarity lies in the “twilight zone”	Less sensitive to stochastic sequence variation, recombination, lateral genetic transfer, rate heterogeneity and sequences of varied lengths
Best practice uses inference algorithms with complexity at least O(n²); less time-efficient	Inference algorithms typically O(n²) or less; more time-efficient
Heuristic solutions; statistical significance of how alignment scores relate to homology is difficult to assess	Exact solutions; statistical significance of the sequence distances (and degree of similarity) can be readily assessed
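
The alignment-free column's "occurrences of sub-sequences" can be illustrated with a minimal sketch. It assumes k-mer count profiles compared by cosine distance, which is only one of many alignment-free measures and is not necessarily the one the source has in mind. The function names, the default k = 3, and the toy sequences are hypothetical, chosen for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kmer_profile(seq, k=3):
    """Count every overlapping k-mer (sub-sequence of length k) in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(p, q):
    """Return 1 - cosine similarity between two k-mer count profiles."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = sqrt(sum(c * c for c in p.values()))
    norm_q = sqrt(sum(c * c for c in q.values()))
    if norm_p == 0 or norm_q == 0:
        # A sequence shorter than k has no k-mers; treat it as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_p * norm_q)


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict of name -> sequence (hypothetical helper)."""
    profiles = {name: kmer_profile(s, k) for name, s in seqs.items()}
    return {
        (a, b): cosine_distance(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }


# Toy input (assumed data): "b" is "a" with its two halves swapped,
# so the homologous blocks are no longer in the same order.
toy = {
    "a": "AAACCCGGGTTT",
    "b": "GGGTTTAAACCC",
    "c": "ACACACACACAC",
}
for pair, d in distance_matrix(toy).items():
    print(pair, round(d, 3))
```

Because the profiles record only which k-mers occur and how often, the rearranged sequence "b" stays close to "a" even though their shared blocks are not in the same order, which is the sense in which such methods do not assume contiguity of homologous regions. The distance between every pair of sequences is still computed, so the number of comparisons grows quadratically with the number of sequences.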

ISSN: 1745-6150