Pan-genomic analysis to redefine species and subspecies based on quantum discontinuous variation: the Klebsiella paradigm
© Caputo et al. 2015
Received: 29 April 2015
Accepted: 22 September 2015
Published: 30 September 2015
Various methods are currently used to define species and are based on the phylogenetic marker 16S ribosomal RNA gene sequence, DNA-DNA hybridization and DNA GC content. However, these are restricted genetic tools and showed significant limitations.
In this work, we describe an alternative method to build taxonomy by analyzing the pan-genome composition of different species of the Klebsiella genus. Klebsiella species are Gram-negative bacilli belonging to the large Enterobacteriaceae family. Interestingly, when comparing the core/pan-genome ratio; we found a clear discontinuous variation that can define a new species.
Using this pan-genomic approach, we showed that Klebsiella pneumoniae subsp. ozaenae and Klebsiella pneumoniae subsp. rhinoscleromatis are species of the Klebsiella genus, rather than subspecies of Klebsiella pneumoniae. This pan-genomic analysis, helped to develop a new tool for defining species introducing a quantic perspective for taxonomy.
This article was reviewed by William Martin, Pierre Pontarotti and Pere Puigbo (nominated by Dr Yuri Wolf).
KeywordsPan-genome Klebsiella pneumoniae Taxonomy
Set of genes present in more than one strain but not in all strains studied
Genes present in all strains studied
Gene pool present in the genomes of a group of organisms
Homogeneous group of isolates characterized by many common features
Over time, the taxonomy of bacteria has been reorganized based on a combination of phenotypic and genotypic properties . The genotypic criteria by which bacterial species were first characterized included the genomic GC content composition. Later, DNA-DNA hybridization experiments were used for comparisons with the closest phylogenetic neighbors . In the 1990s, the sequencing of the 16S rRNA gene led to a revolution in the classification of bacterial species , enabling the re-classification of living organisms . Currently, a threshold identity of 98.7 % in the 16S rRNA sequence is used to define a new bacterial species [9–11]. Thus, the taxonomic study of the Klebsiella genus, based on 16S rDNA and DNA-DNA hybridization, reclassified K. ozaenae and K. rhinoscleromatis as subspecies of K. pneumonia (Fig. 1).
Recently, improvements in genome sequencing have facilitated the study of bacterial species, particularly by analyzing their taxonomy [2, 12]. Previous studies demonstrated the importance of genomics for bacterial taxonomy by assessing the presence of indels or single nucleotide polymorphisms (SNPs) in conserved genes , comparing orthologous genes  and studying metabolic pathways . In this work, we developed another method to build a taxonomy that takes advantage of genome analysis and pan-genome definition . Indeed, the comparison of the core/pan-genome ratios of the different Klebsiella species revealed that K. pneumoniae subsp. ozaenae and K. pneumoniae subsp. rhinoscleromatis exhibit many differences between themselves as well as with bona fide Klebsiella species. This finding supports the claim that K. pneumoniae subsp. ozaenae and K. pneumoniae subsp. rhinoscleromatis are distinct species of the Klebsiella genus. This work introduces pan-genome analysis as a novel tool to define species and represents a great leap forward in bacterial taxonomy.
Genome sequencing and annotation
Genomes from K. pneumoniae subsp. ozaenae and K. pneumoniae subsp. rhinoscleromatis were sequenced using shotgun sequencing method with IonTorrent_Lifetechnologies and the Roche_454 method. For IonTorrent sequencing, genomic DNA was mechanically fragmented in Covaris microTubes to generate a fragment size distribution from 180 to 220 bp and purified through Ampure Beads (Agencourt, Beckman). The fragmented library was constructed using adaptor ligation according to the manufacturer's instructions (Life Technologies). Template preparation, emulsion PCR and Ion Sphere Particle (ISP) enrichment was performed using the Ion One Touch kit (Life Technologies). The quality of the resulting ISPs was assessed using a Qubit 2.0 Fluorometer (Life Technologies), and samples were loaded twice and sequenced on a 316 chip (Life Technologies). Finally, 3,567,359 reads for K. pneumoniae subsp. ozaenae and 3,325,174 reads for K. pneumoniae subsp. rhinoscleromatis were generated. A 5 kb paired end library was constructed with 5 μg of DNA according to the 454_Titanium paired end protocol and to the manufacturer’s instructions. This was mechanically fragmented using the Covaris device (KBioScience-LGC Genomics, Queens Road, Teddington, Middlesex, TW11 0LY, UK) with miniTUBE-Red 5Kb. DNA fragmentation was viewed using the Agilent 2100 BioAnalyzer on a DNA labchip 7500 with an optimal size of 4.9 kb. Circularization and nebulization were performed on 100 ng of the sample. After PCR amplification through 17 cycles followed by a double size selection, the single-stranded paired end library was then loaded onto a DNA labchip RNA pico 6000 on the BioAnalyzer: the pattern showed an optimum at 573 bp and the concentration was determined at 529 pg/μL. The library concentration equivalence was calculated as 1.69e10 molecules/μL and clonally amplified with 0.13, 0.25, 0.5 and 1 copies per bead (cpb) in 2 emPCR reactions per condition using the GS Titanium SV emPCR Kit (Lib-L) v2. The yields of the emPCR were respectively of 12.43, 15.48, 11.46 and 12.23 %, according to the expected quality of 5–20 % from the Roche procedure. The enriched clonal amplifications were loaded with 790,000 beads on the GS Titanium PicoTiterPlates PTP Kit 70x75 sequenced with the GS Titanium Sequencing Kit XLR70. The runs were performed overnight and were then analyzed on the cluster through the gsRunBrowser and gsAssembler_Roche. We obtained 349,885 total reads for K. pneumoniae subsp. ozaenae and 499,562 reads for K. pneumoniae subsp. rhinoscleromatis. The set of reads obtained from the two different sequencing methods were assembled with the Mira assembler v3.2. . The resulting contigs were combined using Opera software v1.2  in tandem with GapFiller V1.10  to reduce the dataset. Finally, manual refinements were made using CLC Genomics software (CLC bio, Aarhus, Denmark) and homemade tools. These two newly-sequenced genomes were deposited at EMBL-EBI under accession number CDJH00000000 for K. pneumoniae subsp. ozaenae and CDOT00000000 for K. pneumoniae subsp. rhinoscleromatis. For the annotation process, assembled DNA sequences of the new draft genomes were run through various annotation applications including RNAmmer , Prodigal , ARAGORN , Rfam , Pfam , and Infernal .
Genome sequence comparison and pan-genome analysis
General genome features
Species and Subspecies
Genome size (Mb)
GC content (%)
Genome accession no.
Klebsiella pneumoniae pneumoniae
Liu et al. (2012) 
Klebsiella pneumoniae pneumoniae
McClelland et al. (2006) 
Shin et al. (2012) 
Klebsiella pneumoniae pneumoniae
Lin et al. (2012) 
Klebsiella pneumoniae pneumoniae
Wu et al. (2009) 
Klebsiella pneumoniae pneumoniae
Complete (with gaps)
Fookes et al. (2013) 
Klebsiella pneumoniae pneumoniae ozaenae
Drancourt et al. (2001) 
Klebsiella pneumoniae pneumoniae rhinoscleromatis
Pinto-Tomas et al. (2009) 
Liao et al. (2012) 
Shin et al. (2012) 
Diene et al. (2013) 
To functionally annotate protein sequences, we used the WebMGA function prediction workflow  and the NCBI COG database for prokaryotic proteins . All hits below the default RPSBLAST e-value of 1e-03 were reported . We performed a Principal Component Analysis (PCA) for all K. pneumoniae strains of the COG content using the R package (http://CRAN.R-project.org). We assigned KEGG orthology (KO) to the studied protein sequences using the KEGG automatic-annotation server (KAAS)  and mapped the KO-assigned genes to the Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules .
We determined the pan-genome composition of the six K. pneumoniae strains with and without including one of the other studied genomes K. pneumoniae subsp. ozaenae or K. pneumoniae subsp. rhinoscleromatis or K. variicola or K. oxytoca. Therefore, TBLASTN was performed to search the translated nucleotide database constituted of the different studied genomes using the proteomes as queries . For each query, the query bit score was divided by the maximum bit score for all genomes in order to calculate the Blast Score Ratio (BSR) [40–43] allowing the conservation of peptides in each genome to be defined. Genes with a value of BSR ≥ 0.4 (equivalent to a ≥ 40 % protein identity over 100 % of the protein length) were considered to belong to core. This algorithm allows comparative analysis of multiple proteomes and nucleotide sequence to be performed simultaneously.
Single Nucleotide Polymorphism (SNP) analysis
We identified SNPs among the core genomic regions using the Panseq package [1, 44, 45]. Multiple sequence alignments were built using MEGA 6.06 software  and phylogenies were reconstructed using the maximum likelihood method (PhyML) with 100 bootstrap iterations .
Comparative genomic analysis of Klebsiella pneumoniae genomes
The final draft genome of K. pneumoniae subsp. ozaenae strain ATCC 11296 consists of 23 scaffolds [EMBL: LN681173-LN681195] and 128 contigs, containing 4,955,887 bp and a GC content of 57.5 %. For K. pneumoniae subsp. rhinoscleromatis strain Urmite, the draft genome consisted of 26 scaffolds [EMBL: LN776221-LN776246] and 135 contigs, containing 5,342,094 bp and with a GC content of 57.3 %. The major features of the Klebsiella pneumoniae sequenced genomes are summarized in Table 1.
All the studied K. pneumoniae genomes had an average length of 5.44 Mb. The K. pneumoniae subsp. ozaenae genome was the smallest with only 4.95 Mb and K. pneumoniae subsp. pneumoniae MGH 78578 was the largest genome with 5.69 Mb. The GC content varied from 57.1 % for K. pneumoniae subsp. pneumoniae HS11286 to 57.5 % for K. pneumoniae subsp. ozaenae with an average of 57.3 %. The number of predicted proteins in Klebsiella pneumoniae ranged from 4,818 for K. pneumoniae subsp. ozaenae to 5,779 for K. pneumoniae subsp. pneumoniae MGH 78578. A single ribosomal RNA operon (16S-23S-5S) was predicted for K. pneumoniae subsp. ozaenae and for the other strains, ranging from 8 to 9 operons. The number of tRNAs also differed depending on the species, ranging from 62 tRNA in K. pneumoniae subsp. ozaenae to 87 in K. pneumoniae KCTC 2242. The hierarchical clustering of the strains based on the number of tRNAs showed that K. pneumoniae subsp. ozaenae did not cluster with any other strains (Additional file 1). Altogether, K. pneumoniae subsp. ozaenae had the smallest genome size, number of genes, number of rRNAs and tRNAs among the K. pneumoniae strains. The reduced genome content suggests that K. pneumoniae subsp. ozaenae is more specialized than the other strains [48, 49]. Indeed, the evolution of specialized bacteria consists principally of gene loss , as investigated in particular for Rickettsiales [50, 51].
Pan-genome and taxonomy
The specific genomic features of K. pneumoniae subsp. ozaenae
When compared to the other strains of K. pneumonia, K. pneumonia subsp. ozaenae had fewer annotated proteins in all COG categories (4,572 proteins vs. 5,006 proteins on average) (Additional file 2). K. pneumoniae subsp. ozaenae lacked 202 genes (Additional file 3) that were present in all other Klebsiella strains and possessed 62 genes (Additional file 4) that were absent from all other strains. The missing genes from K. pneumoniae subsp. ozaenae encode for proteins involved in metabolism (13 %), information storage and processing (13 %) and cellular processes (8 %).
Likewise, the KO-annotation using KEGG server showed that K. pneumoniae subsp. ozaenae had fewer proteins (1,454) involved in metabolic pathways than the other K. pneumoniae strains (an average of 1,605 proteins), especially in amino acid metabolism, carbohydrate metabolim and xenobiotics biodegradation and metabolism. The analysis of the KEGG pathways for these genomes showed significant differences between K. pneumoniae subsp. ozaenae and the other K. pneumoniae strains in terms of their carbohydrate metabolism. The starch and sucrose metabolic pathways of K. pneumoniae subsp. ozaenae were deficient in the beta-xylosidase enzyme (EC:220.127.116.11) compared to the other K. pneumoniae strains.
Bacterial taxonomy remains a complex and challenging field . Initially, taxonomy was based on phenotypic criteria  related to a specific biological or medical interest. However, taxonomy has experienced a recent upheaval following the introduction of new genetic techniques. After the advent of DNA-DNA hybridization in 1979 [6, 53] many bacterial species were reclassified or removed from the taxonomic classification. More recently, the 16S rRNA gene has been used for the classification and nomenclature of bacterial species. This method often fails to reflect real distinctions between species . The use of one single universal 16S rRNA gene can hardly be a realistic Tree of Life . Furthermore, the accepted threshold of 1.3 % between two 16S rRNA sequences  required to differentiate between two different bacterial species seems to include almost 50 million years of the molecular clock [55, 56]. If we consider this threshold as the true species definition criterion, no bacterial lineages could have specialized in mammals [1, 57], which is an unacceptable conclusion. Because of the use of these criteria for the definition of bacterial species and the use of restrictive tools, the description of bacteria is very shallow and limited . Bacteria with sympatric lifestyles, a high level of horizontal gene transfer [53, 59], large genomes, a significant number of ribosomal operons  and large pan-genomes [61, 62] compose bacterial species complexes. Only the isolation of a bacterium in a new niche or a significant population reduction will allow the appearance of a ‘specialist’, a bona fide species which will then present an allopatric lifestyle, a smaller genome, a reduced number of ribosomal operons and a smaller pan-genome .
We based our work on the hypothesis that the difference between two species exists as an irreconcilable difference. These species, thus, correspond to two distinct biological entities that could not be confused and could not transform into one another. A new nomenclature therefore needed to be introduced and pan-genomic studies are likely to be the most suitable method for exploring species under this system [44, 63]. Pan-genome study can identify different situations where speciation has occurred. First of all, an extremely broad continuum is defined as an infinite pan-genome, with a low core/pan-genome ratio. This indicates a lack of specialization in a bacterial group and the presence of a species complex or mixture that allows for the genesis of a species rather than a real species. In this context, Shigella can certainly be placed among Escherichia coli species . Nevertheless, Shigella species are irreversibly different from E. coli species in terms of their metabolic, pathophysiological and genetic properties. Shigella spp. are human pathogens, E. coli complex clones, while E. coli strains are mostly commensals of the human intestine presenting a much larger genome repertoire .
In the context of Klebsiella, we began to define species using the pan-genome. The quantum discontinuous variation existing between the Klebsiella pneumoniae pan-genome and the other Klebsiella species shows that a discontinuous variation > = 10 % of the core/pan-genome ratio is observed by adding a single bacterial isolate. This major difference between genomes leads to a break in the ratio. This discontinuous variation corresponds to the start of a new mathematical function as previously described . In a recent study, the best R2 (coefficient of determination) was determined in order to find the most accurate regression type. It has been shown that the addition of 9 Shigella strains to the 42 E. coli strains created a break in the core/pan-genome ratio and showed variation in their trend curve . In quantum physics, such an abrupt change is similar to that of a discontinuous variation. Electrons revolve within discrete orbits. There is no gradual transition from one orbit to another; there are instead quantum discontinuous variations. This quantum phenomenon allows us to distinguish which transitions are progressive and which are quantic. The latter transition type results in the redefinition of species. The pan-genome study and calculation of the core/pan-genome ratio on the genomes of species that are theoretically the same should result in a linear graph. In practice however, we noticed a break event that prompted us to question the definition of a species. Differences between two species would necessarily be a striking phenomenon (ratio differences > 10 %) without a transition zone (Fig. 3) with irreconcilable differences. These physical phenomena fit well the definition of the species. This is not a shift that reflects the natural variability of species, but is instead a distinct biological phenomenon. According to this perspective, the criteria definition based on the species differentiation of Klebsiella pneumoniae enables us to show that Klebsiella ozaenae and Klebsiella rhinoscleromatis, which were initially believed to be individual species [4, 66] and were later considered to be sub-species of Klebsiella pneumoniae , are actually distinct biological entities that should indeed be considered as species. We believe that the emergence of a pan-genome will allow for the development of a more rational approach to species definition, in which species are defined as circumscribed and distinct biological entities with large differences that prevent them from transforming into closely-related species. We acknowledge the fact that pan-genome-based species classification may evolve with the discovery of new isolates. The definition of bacterial speciation, however, should reflect the restricted capacity of the species to obtain new characteristics and to adapt to any ecological changes.
We have proposed a new tool for defining bacterial species using pan-genome analysis. This new method was applied to different species of the Klebsiella genus. We compared the core/pan-genome ratio of different species, which allowed us to take a great discontinuous variation forward in bacterial taxonomy. We found that K. pneumoniae subsp. ozaenae and K. pneumoniae subsp. rhinoscleromatis exhibit as many differences between them as those of Klebsiella genus, and demonstrated that these are distinct species of Klebsiella genus.
We thank the reviewers for their valuable comments and helpful suggestions. We would like to respond and revise our manuscript in light of the reviews.
Reviewer's report 1
Prof. William Martin, Institut of Botanic III, Heinrich-Heine University, Düsseldorf, Germany
This is a very well written and interesting paper. I like it a lot. Few papers deal with species concepts among bacteria in such a relaxed and readable manner. Clearly, for clinical reasons we have to have species so that doctors can tell us what infection we have and how to treat it. Pragmatic approaches to the problem are useful, and this paper makes progress in that direction.line 83. "clear leap". In the vernacular of traditional systematics, this leap is called "discontinuous variation", so the principle has precedent. One might have a read of some classical systematic s papers for other kinds of organisms, following the keyword lead "discontinuous variation" in the lierature, and maybe rethink the title accordingly. Basically this paper suggests using a very traditional criterion with very modern data (pangenomes).
We thank Prof. Martin for his comments on our manuscript. We are pleased that you have enjoyed it. We replaced in this paper the word “leap” by “ discontinuous variation” acording to your advice.
l. 111, define cpb
Cpb means copies per bead, we corrected this on line 108.l. 167, Standard MCL clustering techniques could also be used here instead of blast score ratios.
The Blast Score Ratio is an algorithm that provides information concerning conserved genes between genomes (orthologs), it also shows their level of conservation (lines 146). The threshold used gives us an estimate of genetic variability. This is why we chose to use the BSR instead of standard MCL clustering.
l. 261, worse than the clock issue is that rDNA does not clearly predict what the rest of the genome harbours, as pangenomes and this paper show.
Thank you for your comments.
l. 271 "could not transform into one another" is not a very useful criterion because it makes untestable assumptions about what might happen in the future …
We mean that genomic content reflects the ecosystem. If the bacterium were to change its ecosystem and become specialized, no return would then be possible because no exchange is possible (lines 251, 292).
l. 279, is "irreversibly" the right word here?
Yes, the word is “irreversibly”.
l. 283, here we are getting to the main course of the apper. Maybe explain in more detail what Fig. 7 shows and perhaps find a mathematical decription for the dip ("discontinuous variation") in the c/p ratio that is independent of the value "10 %", which some might think is the sugestion for a pangenome defined species boundary, more studies on other species would be needed to get a better feel.
To clarify, we have reviewed many parts of this paper and discussed more about a mathematical description with an other example of a pan-genomic study, lines 273 to 277.
l. 286, break — > discontinuity
Yes, “break” means “discontinuity”
l. 287, nut orbitals are different, because sampling of further atoms will not uncover transitional orbitals, but sampleing of other srains will uncover transitional genomes, probably. But one gets the idea.
We have added another example of pan-genomic study performed in another study line 279.
l. 296, which species definition? its a vast literature.
We gave a prokaryotic species definition on page 12. For more precision, we have added some references (43, 44, 49) on lines 236, 239, 243, 251.
l. 302 f, what we see here is not a clear recommendation of the type that Stackebrandt would issue, but a pleas for the use of pangenomic data for the species question, which is unquestionalby reasonable and likely a fruitful avenue of pursuit.
Thank you for this comment.
l. 314 … demonstrated that these are distinct species of Klebsiella genus at a level of
pangenomic discontinuity that would go undetected in a system rDN-based species definitions. Microbial systematics has always adapted to new technologies as it regards species boundaries, perhaps the next generation of adaptation is upon us now with the availability and utility of pangenomes, at least in the clinical context.
Thank you for this comment.
In summary, this is a very fine paper, I enjoyed it a lot.Quality of written English: Acceptable
Reviewer's report 2
Dr. Pierre Pontarotti, Evolution Biologique et Modélisation, Aix-Marseille University, Marseille, France
The idea proposed in this article i.e. use of the complete genome comparison methodology to define biological species , is really interesting. Therefore, the concept deserves to be published.
However, in the present form the article is really difficult to understand. I recommend that it should be rewritten especially abstract, material and method , result and legend section to make them more precise and understandable.
We thank Dr. Pontarotti for his comments. We have rewritten parts of our manuscript as recommended.
Concerning the discussion about the quantum leap, the author should discuss the possibility of intermediate species, that are not yet described. In other words, the quantum leap could be due to missing data.
The authors proposal remind me of punctual equilibrium theory from Elderdge and Gould which is based in part on fossil records. One of the argument against their theory was the possibility of missing fossils.
We acknowledge the fact that all Klebsiella species might not be yet known and therefore the discovery of future isolates may modify a little the proposed classification. We added this comment on the discussion (line 300).
Quality of written English: Needs some language corrections before being published.
Reviewer's report 3
Dr. Pere Puigbo (nominated by an Editorial Board member, Dr Yuri Wolf), NCBI, NIH, Bethesda, USA
This article presents an interesting framework to handle the problem of defining (quantitatively) prokaryotic species. The authors use the simple, yet apparently efficient, core/pan-genome (C/P) ratio to define species of the genus Klebsiella. Overall, this ratio has the potential to be a useful tool to classify prokaryotic species. However, I think the article opens several technical and conceptual questions that may be addressed here.
- The authors tested the C/P ratio on Klebsiella species, but how it will perform in other species is still uncertain (e.g., intracellular parasites). Moreover, it would be very useful to see an example without a predefined group of closely related species to evaluate the real potential of this ratio in prokaryotic classification
We thank Dr. Puigbo for his comments concerning our manuscript. An identical study has been already performed on other species in our lab (ref. 34). I added these results to the discussion section line 277.
- I think the “quantum leap” and the threshold identified in Klebsiella (>10 %) needs some randomization test and further exploration in other species. This introduces questions on how to use the C/P ratio: 1) is there any “golden threshold” that can be used across different taxonomical groups? 2) How is this threshold affected by genome reduction and horizontal gene transfer?
In a previous study, Rouli et al. (ref. 34), using a similar approach, observed that the C/P ratio varied from genus to genus and increased when genome size decreased. The influence of horizontal gene transfer on the C/P ratio, however, remains to be determined.- The definition of species in page 3 is quite vague. It is improved on page 13, when the authors define the working hypothesis. However, I feel the article is missing a longer discussion about the meaning of ‘prokaryotic species’. It might be useful to expand this section and include additional references (e.g., PMID19411599, PMID21943000, PMID21714936)
We have clarified the definition on page 3, on discussion page 12 and we have added the 3 references mentioned.
Quality of written English:
Single nucleotide polymorphisms
Ion sphere particle
Personal genome machine
Copies per bead
Coding DNA sequences
Clusters of orthologs groups
Principal component analysis
Kyoto encyclopedia of genes and genomes
KEGG automatic annotation server
Blast score ratio
This work was funded by IHU Méditerranée Infection.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Georgiades K, Raoult D. Defining pathogenic bacterial species in the genomic era. Front Microbiol. 2010;1:151.PubMed CentralPubMedGoogle Scholar
- Diene SM, Merhej V, Henry M, Filali AE, Roux V, Robert C, et al. The Rhizome of the Multidrug-Resistant Enterobacter aerogenes Genome Reveals How New “Killer Bugs” Are Created because of a Sympatric Lifestyle. Mol Biol Evol. 2013;30(2):369–83.View ArticlePubMedGoogle Scholar
- Podschun R, Ullmann U. Klebsiella spp. as nosocomial pathogens: epidemiology, taxonomy, typing methods, and pathogenicity factors. Clin Microbiol Rev. 1998;11:589–603.
- Bascomb S, Lapage SP, Willcox WR, Curtis MA. Numerical classification of the tribe Klebsielleae. J Gen Microbiol. 1971;66:279–95.
- Staley JT. The bacterial species dilemma and the genomic-phylogenetic species concept. Philos Trans R Soc B Biol Sci. 2006;361:1899–909.
- Wayne LG, Brenner DJ, Colwell RR, Grimont PD, Kandler O, Krichevsky MI, et al. Report of the Ad Hoc Committee on Reconciliation of Approaches to Bacterial Systematics. Int J Syst Bacteriol. 1987;37:463–4.View ArticleGoogle Scholar
- Coenye T, Vandamme P. Extracting phylogenetic information from whole-genome sequencing projects: the lactic acid bacteria as a test case. Microbiol Read Engl. 2003;149(Pt 12):3507–17.View ArticleGoogle Scholar
- Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A. 1990;87:4576–9.PubMed CentralView ArticlePubMedGoogle Scholar
- Stackebrandt E, Ebers J. Taxonomic parameters revisited: tarnished gold standards. Microbiology today. 2006;33(4):152–5.Google Scholar
- Lagier JC, Hugon P, Khelaifia S, Fournier PE, La Scola B, Raoult D. The Rebirth of Culture in Microbiology through the Example of Culturomics To Study Human Gut Microbiota. Clin Microbiol Rev. 2015;28.
- Lagier JC, Edouard S, Pagnier I, Mediannikov O, Drancourt M, Raoult D. Current and Past Strategies for Bacterial Culture in Clinical Microbiology. Clin Microbiol Rev. 2015;28:208–36.View ArticlePubMedGoogle Scholar
- Fitz-Gibbon ST, House CH. Whole genome-based phylogenetic analysis of free-living microorganisms. Nucleic Acids Res. 1999;27:4218–22.PubMed CentralView ArticlePubMedGoogle Scholar
- Gupta RS. The branching order and phylogenetic placement of species from completed bacterial genomes, based on conserved indels found in various proteins. Int Microbiol Off J Span Soc Microbiol. 2001;4:187–202.Google Scholar
- Huson DH, Steel M. Phylogenetic trees based on gene content. Bioinforma Oxf Engl. 2004;20:2044–9.View ArticleGoogle Scholar
- Rouli L, Mbengue M, Robert C, Ndiaye M, La Scola B, Raoult D. Genomic analysis of three African strains of Bacillus anthracis demonstrates that they are part of the clonal expansion of an exclusively pathogenic bacterium. New Microbes New Infect. 2014;2:161–9.PubMed CentralView ArticlePubMedGoogle Scholar
- Chevreux B, Pfisterer T, Drescher B, Driesel AJ, Müller WEG, Wetter T, et al. Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Res. 2004;14:1147–59.PubMed CentralView ArticlePubMedGoogle Scholar
- Gao S, Sung WK, Nagarajan N. Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences. J Comput Biol J Comput Mol Cell Biol. 2011;18:1681–91.View ArticleGoogle Scholar
- Boetzer M, Pirovano W. Toward almost closed genomes with GapFiller. Genome Biol. 2012;13:R56.PubMed CentralView ArticlePubMedGoogle Scholar
- Lagesen K, Hallin P, Rødland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35:3100–8.PubMed CentralView ArticlePubMedGoogle Scholar
- Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119.PubMed CentralView ArticlePubMedGoogle Scholar
- Laslett D, Canback B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 2004;32:11–6.PubMed CentralView ArticlePubMedGoogle Scholar
- Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res. 2003;31:439–41.PubMed CentralView ArticlePubMedGoogle Scholar
- Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, et al. The Pfam protein families database. Nucleic Acids Res. 2012;40(Database issue):D290–301.PubMed CentralView ArticlePubMedGoogle Scholar
- Nawrocki EP, Kolbe DL, Eddy SR. Infernal 1.0: inference of RNA alignments. Bioinforma Oxf Engl. 2009;25:1335–7.View ArticleGoogle Scholar
- Liu P, Li P, Jiang X, Bi D, Xie Y, Tai C, Deng Z, Rajakumar K, Ou HY. Complete genome sequence of Klebsiella pneumoniae subsp. pneumoniae HS11286, a multidrug-resistant strain isolated from human sputum. J Bacteriol. 2012;194:1841–1842.PubMed CentralView ArticlePubMedGoogle Scholar
- McClelland M, Sanderson EK, Spieth J, Clifton WS, Latreille P, et al. The Klebsiella pneumonia Genome Sequencing. 2006.
- Lin AC, Liao TL, Lin YC, Lai YC, Lu MC, Chen YT. Complete genome sequence of Klebsiella pneumoniae 1084, a hypermucoviscosity-negative K1 clinical strain. J Bacteriol. 2012;194:6316.PubMed CentralView ArticlePubMedGoogle Scholar
- Wu KM, Li LH, Yan JJ, Tsao N, Liao TL, Tsai HC, et al. Genome sequencing and comparative analysis of Klebsiella pneumoniae NTUH-K2044, a strain causing liver abscess and meningitis. J Bacteriol. 2009;191:4492–4501.PubMed CentralView ArticlePubMedGoogle Scholar
- Fookes M, Yu J, De Majumdar S, Thomson N, Schneiders T. Genome sequence of Klebsiella pneumoniae Ecl8, a reference strain for targeted genetic manipulation. Genome Announc 2013;1. doi:10.1128/genomeA.00027-12.
- Shin SH, Kim S, Kim JY, Lee S, Um Y, Oh MK, et al. Complete genome sequence of the 2,3-butanediol-producing Klebsiella pneumoniae strain KCTC 2242. J Bacteriol. 2012;194:2736–2737.PubMed CentralView ArticlePubMedGoogle Scholar
- Liao TL, Lin AC, Chen E, Huang TW, Liu YM, Chang YH, et al. Complete genome sequence of Klebsiella oxytoca E718, a New Delhi metallo-ß-lactamase-1-producing nosocomial strain. J Bacteriol. 2012;194:5454.PubMed CentralView ArticlePubMedGoogle Scholar
- Shin SH, Kim S, Kim JY, Lee S, Um Y, Oh MK, et al. Complete genome sequence of Klebsiella oxytoca KCTC 1686, used in production of 2,3-butanediol. J Bacteriol. 2012;194:2371–2372.PubMed CentralView ArticlePubMedGoogle Scholar
- Pinto-Tomás AA, Anderson MA, Suen G, Stevenson DM, Chu FST, Cleland WW, et al. Symbiotic nitrogen fixation in the fungus gardens of leaf-cutter ants. Science. 2009;326:1120–1123.View ArticlePubMedGoogle Scholar
- Drancourt M, Bollet C, Carta A, Rousselier P. Phylogenetic analyses of Klebsiella species delineate Klebsiella and Raoultella gen. nov., with description of Raoultella ornithinolytica comb. nov., Raoultella terrigena comb. nov. and Raoultella planticola comb. nov. Int J Syst Evol Microbiol. 2001; 51:925–932.View ArticlePubMedGoogle Scholar
- Wu S, Zhu Z, Fu L, Niu B, Li W. WebMGA: a customizable web server for fast metagenomic sequence analysis. BMC Genomics. 2011;12:444.PubMed CentralView ArticlePubMedGoogle Scholar
- Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, et al. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001;29:22–8.PubMed CentralView ArticlePubMedGoogle Scholar
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.PubMed CentralView ArticlePubMedGoogle Scholar
- Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35 suppl 2:W182–5.PubMed CentralView ArticlePubMedGoogle Scholar
- Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30.PubMed CentralView ArticlePubMedGoogle Scholar
- Rasko DA, Myers GSA, Ravel J. Visualization of comparative genomic analyses by BLAST score ratio. BMC Bioinformatics. 2005;6:2.PubMed CentralView ArticlePubMedGoogle Scholar
- Pearson T, Hornstra HM, Sahl JW, Schaack S, Schupp JM, Beckstrom-Sternberg SM, et al. When outgroups fail; phylogenomics of rooting the emerging pathogen, Coxiella burnetii. Syst Biol. 2013;62:752–62.PubMed CentralView ArticlePubMedGoogle Scholar
- Sahl JW, Gillece JD, Schupp JM, Waddell VG, Driebe EM, Engelthaler DM, et al. Evolution of a pathogen: a comparative genomics analysis identifies a genetic pathway to pathogenesis in Acinetobacter. PloS One. 2013;8:e54287.PubMed CentralView ArticlePubMedGoogle Scholar
- D’Amato F, Eldin C, Georgiades K, Edouard S, Delerce J, Labas N, et al. Loss of TSS1 in hypervirulent Coxiella burnetii 175, the causative agent of Q fever in French Guiana. Comp Immunol Microbiol Infect Dis. 2015;41:35–41.View ArticlePubMedGoogle Scholar
- Rouli L, Merhej V, Fournier PE, Raoult D: The bacterial pangenome as a new tool for analyzing pathogenic bacteria. New Microbes New Infect 2015;7:72–85.
- Laing C, Buchanan C, Taboada EN, Zhang Y, Kropinski A, Villegas A, et al. Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions. BMC Bioinformatics. 2010;11:461.PubMed CentralView ArticlePubMedGoogle Scholar
- Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.PubMed CentralView ArticlePubMedGoogle Scholar
- Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. Syst Biol. 2010;59:307–21.View ArticlePubMedGoogle Scholar
- Merhej V, Royer-Carenzi M, Pontarotti P, Raoult D. Massive comparative genomic analysis reveals convergent evolution of specialized bacteria. Biol Direct. 2009;4:13.PubMed CentralView ArticlePubMedGoogle Scholar
- Rolain JM, Vayssier-Taussat M, Saisongkorh W, Merhej V, Gimenez G, Robert C, et al. Partial Disruption of Translational and Posttranslational Machinery Reshapes Growth Rates of Bartonella birtlesii. mBio. 2013;4:e00115–13.PubMed CentralView ArticlePubMedGoogle Scholar
- Darby AC, Cho NH, Fuxelius HH, Westberg J, Andersson SGE. Intracellular pathogens go extreme: genome evolution in the Rickettsiales. Trends Genet. 2007;23:511–20.View ArticlePubMedGoogle Scholar
- Merhej V, Georgiades K, Raoult D. Postgenomic analysis of bacterial pathogens repertoire reveals genome reduction rather than virulence factors. Brief Funct Genomics. 2013;12:291–304.View ArticlePubMedGoogle Scholar
- Darling AE, Mau B, Perna NT. progressiveMauve: Multiple Genome Alignment with Gene Gain, Loss and Rearrangement. PLoS ONE. 2010;5:e11147.PubMed CentralView ArticlePubMedGoogle Scholar
- Doolittle WF, Zhaxybayeva O. On the origin of prokaryotic species. Genome Res. 2009;19:744–56.View ArticlePubMedGoogle Scholar
- O’Malley MA, Koonin EV. How stands the Tree of Life a century and a half after The Origin? Biol Direct. 2011;6:32.PubMed CentralView ArticlePubMedGoogle Scholar
- Ochman H, Elwyn S, Moran NA. Calibrating bacterial evolution. Proc Natl Acad Sci U S A. 1999;96:12638–43.PubMed CentralView ArticlePubMedGoogle Scholar
- Ogata H, Audic S, Renesto-Audiffren P, Fournier PE, Barbe V, Samson D, et al. Mechanisms of evolution in Rickettsia conorii and R. prowazekii. Science. 2001;293:2093–8.View ArticlePubMedGoogle Scholar
- Georgiades K, Merhej V, Raoult D. The influence of rickettsiologists on post-modern microbiology. Front Cell Infect Microbiol. 2011;1:8.PubMed CentralPubMedGoogle Scholar
- Rosselló-Mora R. DNA-DNA Reassociation Methods Applied to Microbial Taxonomy and Their Critical Evaluation. In: Stackebrandt PDE, editor. Molecular Identification, Systematics, and Population Structure of Prokaryotes. Berlin Heidelberg: Springer; 2006. p. 23–50.View ArticleGoogle Scholar
- Andam CP, Gogarten JP. Biased gene transfer and its implications for the concept of lineage. Biol Direct. 2011;6:47.PubMed CentralView ArticlePubMedGoogle Scholar
- Audic S, Robert C, Campagna B, Parinello H, Claverie J-M, Raoult D, et al. Genome analysis of Minibacterium massiliensis highlights the convergent evolution of water-living bacteria. PLoS Genet. 2007;3, e138.PubMed CentralView ArticlePubMedGoogle Scholar
- Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome.”. Proc Natl Acad Sci U S A. 2005;102:13950–5.PubMed CentralView ArticlePubMedGoogle Scholar
- Via S. Natural selection in action during speciation. Proc Natl Acad Sci U S A. 2009;106 Suppl 1:9939–46.PubMed CentralView ArticlePubMedGoogle Scholar
- Tettelin H, Riley D, Cattuto C, Medini D. Comparative genomics: the bacterial pan-genome. Curr Opin Microbiol. 2008;11:472–7 [Antimicrobials/Genomics].View ArticlePubMedGoogle Scholar
- Pupo GM, Lan R, Reeves PR. Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc Natl Acad Sci U S A. 2000;97:10567–72.PubMed CentralView ArticlePubMedGoogle Scholar
- Maurelli AT, Routh PR, Dillman RC, Ficken MD, Weinstock DM, Almond GW, et al. Shigella infection as observed in the experimentally inoculated domestic pig, Sus scrofa domestica. Microb Pathog. 1998;25:189–96.View ArticlePubMedGoogle Scholar
- Cowan ST, Steel M, Shaw C, Duguid JP. A classification of the Klebsiella group. J Gen Microbiol. 1960;23:601–12.View ArticlePubMedGoogle Scholar
- Ørskov, I. Genus v. Klebsiella. In: N. R. Krieg and J. G. Holt, editors. Bergey's manual of systematic bacteriology, vol. 1. Baltimore, Md: Williams & Wilkins; 1984. p. 461–465.