Length distribution for predicted complete active toxins in different bacterial clades. Complete active toxins, as opposed to cassettes, were identified based on characteristic marker domains for each of the distinct secretory systems associated with the toxin, either in the same polypeptide or in gene neighborhoods (Table 1). The topmost row shows the combined statistics for all active toxins, while the other panels present the breakdown of these distributions by bacterial clade. The toxin length distribution is represented as a beanplot (e.g. left panel in the first row) and as a raw histogram (top row, central panel), and clearly indicates the multimodal nature of toxin length. The barplot in the first row (rightmost panel) shows the frequencies of consecutive toxin and/or immunity gene pairs in these genomes. Only pairs of genes encoded on the same strand were considered. The labels indicate whether an immunity protein (I) or a toxin (T) is encoded upstream or downstream of its neighbor in putative operons, e.g. TI corresponds to a pair where an immunity gene is preceded by a toxin gene. Note that the TI (toxin -> immunity) architecture is the most frequent pair observed in all graphs except for Bacteroidetes/Chlorobi and Firmicutes, where the presence of polyimmunity loci inflates the II category. Dashed vertical lines correspond to the median protein length for the data in each panel, and the solid vertical lines over each beanplot correspond to the median length in that secretory system alone. The axes at the right of each panel show the number of active toxins per secretory system.
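
The pair labels used in the barplot (TT, TI, IT, II) can be illustrated with a minimal sketch. This is not the pipeline used to produce the figure; the tuple layout `(start, strand, kind)` and the function name `count_adjacent_pairs` are hypothetical, and "consecutive" is assumed here to mean adjacent in coordinate order on one replicon.

```python
from collections import Counter


def count_adjacent_pairs(genes):
    """Count TT/TI/IT/II pairs among consecutive genes on the same strand.

    genes: iterable of (start, strand, kind) tuples for a single replicon
    (hypothetical layout), where strand is '+' or '-' and kind is
    'T' (toxin), 'I' (immunity) or None (any other gene).
    """
    ordered = sorted(genes, key=lambda g: g[0])
    counts = Counter()
    for left, right in zip(ordered, ordered[1:]):
        # Only pairs encoded on the same strand are considered.
        if left[1] != right[1]:
            continue
        # Skip pairs in which either gene is neither a toxin nor an immunity gene.
        if left[2] is None or right[2] is None:
            continue
        # On the minus strand transcription runs toward lower coordinates,
        # so the right-hand gene is the upstream one.
        if left[1] == "+":
            upstream, downstream = left, right
        else:
            upstream, downstream = right, left
        # Label is upstream kind followed by downstream kind, e.g. "TI".
        counts[upstream[2] + downstream[2]] += 1
    return counts


example = [
    (100, "+", "T"), (900, "+", "I"),    # toxin followed by immunity gene
    (2000, "-", "I"), (2800, "-", "T"),  # same architecture on the minus strand
    (4000, "+", "I"), (4500, "+", "I"),  # two consecutive immunity genes
    (5000, "-", "T"), (6000, "+", None),
]
pair_counts = count_adjacent_pairs(example)
```

In this toy input the minus-strand pair is read from higher to lower coordinates, so it is counted under the same label as the plus-strand toxin-then-immunity pair rather than as its mirror image.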