Linear correlation between genome size and GC content. Bacterial genome size is represented by the number of genes. Bacteria with fewer than 2,500 genes were chosen for analysis in the dnaE1|polV (A) and dnaE3|polV (B) groups. There is a strong and significant correlation between genome size and GC content after eliminating outliers (red solid circles): the R value increases from 0.6179 to 0.7479 (p < 0.0001) in the dnaE1|polV group and from 0.5571 to 0.8172 (p < 0.0001) in the dnaE3|polV group. The linear model is Y = 0.0001128X + 0.2387 for the dnaE1|polV group and Y = 0.00006374X + 0.2464 for the dnaE3|polV group. The 90% upper and lower prediction limits are also shown. All numbered outliers were further analyzed to interpret the potential reasons underlying this correlation (Table 6).
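
To make the two fitted lines concrete, the following minimal sketch evaluates them at a given gene count. It assumes that X is the number of genes and that Y is GC content expressed as a fraction between 0 and 1; the caption does not state the units explicitly, so both are assumptions. The names MODELS and predict_gc are hypothetical and used only for illustration.

```python
# Slope and intercept as reported in the caption.
# Assumption: X = number of genes, Y = GC content as a fraction (0 to 1).
MODELS = {
    "dnaE1|polV": (0.0001128, 0.2387),
    "dnaE3|polV": (0.00006374, 0.2464),
}


def predict_gc(group, n_genes):
    """Return the GC content predicted by the reported linear model.

    The models were fitted on bacteria with fewer than 2,500 genes, so
    predictions for larger genomes would be extrapolation.
    """
    slope, intercept = MODELS[group]
    return slope * n_genes + intercept


if __name__ == "__main__":
    for group in MODELS:
        print(group, round(predict_gc(group, 1000), 4))
```

Under these assumptions, a genome of 1,000 genes would have a predicted GC fraction of about 0.35 in the dnaE1|polV group and about 0.31 in the dnaE3|polV group. The sketch gives point predictions only and does not reproduce the 90% prediction limits, which depend on the underlying data.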