Fig. 3 | Biology Direct

Fig. 3

From: Identification of positive selection in genes is greatly improved by using experimentally informed site-specific models

The experimentally informed models (ExpCM) identify many sites of diversifying or differential selection that are missed by a standard dN/dS analysis (GY94). (a) The violin plots show the distribution of P-values that a site is under diversifying selection for (positive numbers) or against (negative numbers) amino-acid change (ω_r indicates both the ExpCM parameter in Eq. 5 and the GY94 dN/dS ratio). The portion of the distribution above or below the dotted blue lines contains all sites for which there is support for rejecting the null hypothesis ω_r = 1 at an FDR of 0.05. When no sites have support at this FDR, the dotted blue lines instead indicate the P-value that would be needed for a site to have ω_r > 1 or ω_r < 1 at a significance level of 0.05 using a Bonferroni correction. The dN/dS method identifies many sites of purifying selection, but fails to find any sites of selection for amino-acid change. The ExpCM already accounts for basic functional constraints and so does not identify any sites with ω_r < 1, but it does identify sites of diversifying selection in all genes except Gal4 (which is not thought to evolve under pressure for phenotypic change). (b) The violin plots show the distribution of differential selection at each site inferred with the ExpCM. Since Gal4 is not under selection for phenotypic change, I defined a heuristic threshold at twice the Gal4 maximum value of 0.27. At this threshold, sites of differential selection are identified for all three other genes. The legend labels all sites under diversifying or differential selection. This analysis was performed using phydms; Additional file 17 shows that similar results are obtained if the dN/dS analysis is instead performed using HyPhy [7].
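The three thresholds mentioned in the caption can be illustrated with a minimal Python sketch. It rests on several assumptions: the caption does not say which FDR procedure was used, so the Benjamini-Hochberg procedure is assumed here; the function names are hypothetical and are not part of phydms or HyPhy; and the P-values are invented purely for illustration. Only the Gal4 maximum of 0.27 and the factor of two come from the caption.

```python
def bonferroni_threshold(alpha, n_sites):
    """Per-site P-value cutoff for a family-wise significance level alpha."""
    return alpha / n_sites


def benjamini_hochberg(pvalues, fdr=0.05):
    """Return one boolean per site: True if the null is rejected at this FDR.

    Assumption: Benjamini-Hochberg is used; the caption only says "FDR of 0.05".
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    # Find the largest rank k whose sorted P-value is <= fdr * k / n.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / n:
            max_rank = rank
    # Reject the null for every site ranked at or below max_rank.
    rejected = [False] * n
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


def heuristic_threshold(reference_max, factor=2.0):
    """Differential-selection cutoff set at a multiple of a reference maximum."""
    return factor * reference_max


# Invented per-site P-values, for illustration only.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
significant = benjamini_hochberg(example_pvalues, fdr=0.05)
fallback_cutoff = bonferroni_threshold(0.05, len(example_pvalues))

# Gal4 maximum of 0.27 is taken from the caption.
diff_sel_cutoff = heuristic_threshold(0.27)
```

With these invented values, only the two smallest P-values pass the assumed Benjamini-Hochberg test. The Bonferroni cutoff plays the role of the fallback line drawn when no site passes the FDR test, and the differential-selection cutoff comes out at 0.54.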