Fig. 1 | Biology Direct

Fig. 1

From: Analysis of lineage-specific protein family variability in prokaryotes combined with evolutionary reconstructions

Pipeline for protein variability analysis. Homogeneity values are calculated for each position of multiple alignments of clade-specific COG (csCOG) sequences (top left). Homogeneity profiles along the sequences are smoothed and converted to distributions of homogeneity values (top middle). Distances between these distributions are used to embed csCOGs into a metric space (top right). Homogeneity values, scaled by the average homogeneity across the clade, are transformed into variabilities (bottom middle). csCOG-specific values form clade-level distributions (bottom left). Position-specific variability values allow alignment sites to be categorized as conserved, intermediate, or variable; the relative frequencies of these classes, plotted on a simplex diagram, identify csCOGs with unusual conservation patterns (bottom right).
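The caption names the pipeline steps but not their formulas, so the following Python sketch is only an illustration of the data flow for a single csCOG, not the authors' method. Everything specific in it is an assumption: the homogeneity measure (fraction of residues matching the most common residue in a column), the moving-average smoothing, the variability transform (clade mean divided by column homogeneity), and the thresholds 0.8 and 1.25 are all hypothetical stand-ins chosen for simplicity.

```python
from collections import Counter


def column_homogeneity(column):
    """Stand-in homogeneity (assumption): share of non-gap residues that
    match the most common residue in the alignment column."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return None  # all-gap column: no value defined
    return Counter(residues).most_common(1)[0][1] / len(residues)


def homogeneity_profile(alignment):
    """Per-position homogeneity for equal-length aligned sequences,
    skipping columns that contain only gaps."""
    values = (column_homogeneity(col) for col in zip(*alignment))
    return [v for v in values if v is not None]


def smooth(profile, window=3):
    """Centered moving average (assumed smoother); the window is
    truncated at both ends of the profile."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        chunk = profile[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def variabilities(profile, clade_mean):
    """Hypothetical transform: reciprocal of homogeneity scaled by the
    clade-wide average, so values above 1 are more variable than average."""
    return [clade_mean / h for h in profile]


def classify_sites(var_values, low=0.8, high=1.25):
    """Label each site using illustrative thresholds (assumptions)."""
    labels = []
    for v in var_values:
        if v <= low:
            labels.append("conserved")
        elif v >= high:
            labels.append("variable")
        else:
            labels.append("intermediate")
    return labels


def simplex_point(labels):
    """Relative frequencies of the three classes; they sum to 1 and so
    give one point on a simplex (ternary) diagram."""
    n = len(labels)
    return tuple(labels.count(k) / n
                 for k in ("conserved", "intermediate", "variable"))


# Toy alignment of four sequences (made-up data, not from the paper).
alignment = ["ACDE", "ACDF", "ACEG", "ATFH"]
profile = homogeneity_profile(alignment)
labels = classify_sites(variabilities(profile, clade_mean=0.5))
point = simplex_point(labels)
```

In this toy run the four columns fall into two conserved sites, one intermediate site, and one variable site, so the csCOG would be plotted at the simplex point (0.5, 0.25, 0.25). Applying `smooth` to the profile before the transform would mimic the smoothing step mentioned in the caption.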
