Genome-wide analysis of transposable elements and tandem repeats in the compact placozoan genome
© Wang et al; licensee BioMed Central Ltd. 2010
Received: 8 April 2010
Accepted: 15 April 2010
Published: 15 April 2010
The placozoan Trichoplax adhaerens has a compact genome with many primitive eumetazoan characteristics. To gain a better understanding of its genome architecture, we conducted a detailed analysis of repeat content in this genome. The transposable element (TE) content is lower than that of other metazoans, and the few TEs present in the genome appear to be inactive. A new phylogenetic clade of the gypsy-like LTR retrotransposons was identified, which includes the majority of gypsy-like elements in Trichoplax. A particular microsatellite motif (ACAGT) is unexpectedly abundant and is strongly associated with nearby genes.
This article was reviewed by Dr. Jerzy Jurka and Dr. I. King Jordan.
Placozoans are arguably the simplest free-living multicellular animals, and may represent an extant example of the ancestral metazoan body plan. A recent comprehensive phylogenetic study suggests that placozoans are basal relatives to all other non-Bilaterian animals [, but see ]. It has been suggested that the placozoan Trichoplax adhaerens is an excellent model for the study of early evolution of metazoans [4, 5]. The recent analysis of the Trichoplax genome has revealed a lack of the frequent intron loss and genomic rearrangement that characterize other small metazoan genomes (e.g. flies and worms), and many structural aspects (e.g. introns, local gene order and larger-scale linkages) are thought to represent ancestral eumetazoan characteristics. To gain a better understanding of the evolution of the Trichoplax genome architecture, it is worth investigating the abundance and types of repetitive sequences in the Trichoplax genome, because repetitive sequences, especially transposable elements (TEs), are major contributors to genome evolution through their effect on genome plasticity [6–8].
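Table: Summary of repeat content in the placozoan and other metazoan genomes. (Table body not recovered; surviving column headings: Genome size (Mb), Transposable element (TE), Major SSR motif.)
Table: Classification of transposable elements (TEs) in the Trichoplax genome. (Table body not recovered; surviving column heading: Matching Length (bp).)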
Minisatellites (repeat units usually within 7–2000 bp) were detected using the program Tandem Repeats Finder 4.03. In total, 9208 minisatellites with repeat units ranging from 7 bp to 1204 bp were identified in the Trichoplax genome, accounting for 2.4% of the genome size. In general, the smaller the repeat unit, the higher the repeat abundance (additional file 6a). Minisatellites with repeat units ranging from 7 bp to 25 bp were highly abundant (usually >100 repeats). Similar distribution patterns have previously been observed in a scallop genome. The average copy number of these repeats was generally low in the Trichoplax genome (< 4 copies for 94% of minisatellite repeat types) (additional file 6b), which may also account for the lack of frequent genomic rearrangements in the Trichoplax genome.
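The summary statistics quoted above (number of loci, share of the genome covered, and proportion of low-copy repeats) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes whitespace-separated data rows whose first four fields are start, end, repeat unit length and copy number, which is an assumption about the tandem repeat output layout; the function names `parse_trf_lines` and `summarize` are hypothetical, and overlapping loci are not merged.

```python
from collections import Counter


def parse_trf_lines(lines):
    """Yield (start, end, unit_len, copies) from tandem-repeat data rows.

    Assumed layout (not taken from the article): whitespace-separated rows
    whose first four fields are start, end, repeat unit length and copy
    number. Header rows and blank lines are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        start, end, unit_len = int(fields[0]), int(fields[1]), int(fields[2])
        copies = float(fields[3])
        yield start, end, unit_len, copies


def summarize(records, genome_size, min_unit=7, low_copy=4.0):
    """Summarize minisatellite loci with repeat units of at least min_unit bp."""
    records = [r for r in records if r[2] >= min_unit]
    # Overlapping loci are counted twice here; a real analysis would merge them.
    covered = sum(end - start + 1 for start, end, _, _ in records)
    by_unit = Counter(unit_len for _, _, unit_len, _ in records)
    low = sum(1 for r in records if r[3] < low_copy)
    return {
        "loci": len(records),
        "genome_fraction": covered / genome_size,
        "loci_per_unit_length": dict(by_unit),
        "low_copy_fraction": low / len(records) if records else 0.0,
    }
```

Calling `summarize(parse_trf_lines(open_file), genome_size)` on a real output file would give the locus count, the covered fraction of the genome, and the share of loci below the copy-number cutoff; the thresholds are parameters so that other definitions of a minisatellite can be tried.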
In summary, we conducted a detailed analysis of repeat content in the Trichoplax genome. The TEs in the Trichoplax genome are scarce and apparently lack functional activity. A new phylogenetic clade (Tag) of the gypsy-like LTR retrotransposons was identified. The unexpectedly high abundance of the ACAGT motif in the Trichoplax genome represents an intriguing topic for future investigations of its potential roles in animal development and genome evolution.
Reviewer 1 (Dr. Jerzy Jurka, Genetic Information Research Institute, USA)
This paper analyzes repetitive DNA in an interesting metazoan. Of particular interest is the predominance of the (ACAGT)n microsatellite. However, it is difficult to evaluate the analysis of TEs. The authors should include all the identified TE sequences in a supplemental file.
Authors' response: Done. All identified TE sequences and annotations are included in additional file 2.
Furthermore, they should include analysis of non-autonomous elements if they are present. The paper needs a second review.
Authors' response: We have used the MUST program to search for miniature inverted-repeat transposable elements (MITEs), which are nonautonomous DNA transposons and are usually present in eukaryotic genomes in very high copy numbers. We accepted a potential MITE family only if it contained at least 3 elements with the same perfect TIRs and TSDs. MITEs turned out to be quite rare in the genome, with only a single family represented by ~20 copies altogether. This is expected, since no functional autonomous TEs, upon which MITEs rely for their transposition and persistence, were identified in the Trichoplax genome.
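The family criterion described in this response (at least 3 elements sharing perfect TIRs and TSDs) can be sketched in Python as follows. This is an illustration only and does not reproduce the MUST program. The input tuple layout, the fixed TIR length, and the function names are hypothetical, and "the same TSDs" is read here as perfect direct repeats of the same length, which is an assumption.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_perfect_tir(element, tir_len):
    """True if the first tir_len bases mirror the last tir_len bases exactly."""
    if len(element) < 2 * tir_len:
        return False
    return element[:tir_len] == reverse_complement(element[-tir_len:])


def mite_families(candidates, tir_len=10, min_members=3):
    """Group candidates by (TIR sequence, TSD length); keep large enough groups.

    candidates: iterable of (name, element_seq, left_flank, right_flank),
    a hypothetical layout in which the flanks are the putative TSD copies
    immediately adjacent to the element.
    """
    families = defaultdict(list)
    for name, element, left, right in candidates:
        element, left, right = element.upper(), left.upper(), right.upper()
        # A TSD must be a non-empty perfect direct repeat on both sides.
        if not left or left != right:
            continue
        if not has_perfect_tir(element, tir_len):
            continue
        families[(element[:tir_len], len(left))].append(name)
    return {key: names for key, names in families.items() if len(names) >= min_members}
```

Grouping on the TIR sequence keeps unrelated inverted repeats apart, while the `min_members` cutoff discards singletons that are more likely to be chance palindromic structures than transposed copies.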
Reviewer 1's second review:
Here are putative non-autonomous DNA transposons published in the October issue of Repbase Reports.
Authors' response: We appreciate the reviewer kindly providing this information. It is now mentioned and cited in the revised MS.
Reviewer 2 (Dr. I. King Jordan, Georgia Institute of Technology, USA)
This manuscript describes an analysis of the short tandem repeat and long interspersed repeat, i.e. transposable element (TE), content of the Trichoplax adhaerens genome. An analysis of the repeat content of this genome is potentially interesting because the organism has a small compact genome and it occupies a basal position in the eukaryotic phylogenetic tree. Indeed, it has been claimed previously that Trichoplax likely resembles an ancestral eukaryotic genome. As such, genomic studies of this organism may reveal insight into the origin and evolution of eukaryotic genomes. The paper reports on a straightforward analysis, and it is well written and easy to follow. However, it is not clear what truly new or relevant insight into eukaryotic genome evolution is provided by these data. In addition, the methods used to search for repeats are not sufficiently rigorous to justify the conclusions that are made regarding the repeat content of the genome. I elaborate on these points and provide more specific comments below.
The most pressing point here is related to the authors' contention that Trichoplax represents an ancestral eukaryotic genome and therefore its repeat content can be understood to resemble that of the earliest eukaryotes. The problem is that repeats are notoriously dynamic genomic elements. Tandem repeats are highly unstable, and TEs are typically the most lineage-specific sequences in eukaryotic genomes. In fact, TEs are known to quickly evolve beyond the ability to be recognized with homology-based methods. Thus, the interspersed repeats that exist in the genome today, in particular those that can be found by the methods used here, were certainly not around at the origin of the eukaryotes. Indeed, many of the TEs identified here seem to have been recently acquired and are rapidly decaying. It does not seem possible, based on the analysis of a single genome as reported here, to determine whether the low repeat content of Trichoplax is due to low repeats in the eukaryotic ancestor or secondary loss of repeats and genome streamlining over time.
Authors' response: We agree with this comment. The corresponding discussion has been removed from the MS.
Given the small size of the Trichoplax genome, along with its basal phylogenetic position in the eukaryotic tree, it is perhaps unsurprising that the authors turn up so few TEs. However, the overall lack of TEs reported here places a burden of proof on the authors that has not been met. It is up to them to demonstrate that they have exhaustively searched the genome sequence for TEs using a wide variety of available methods. The report indicates that BLAST was used to search for encoded protein sequences and a single ab initio method was used to search for MITEs. There are of course numerous tools available to search for TEs and repeats in genome sequences (e.g. see Bergman and Quesneville 2007 Brief Bioinform 8: 382). In fact, the most rigorous efforts at genome annotation now involve pipelines that combine the use of many tools - including both homology based detection methods based on comparisons between genome sequences and TE consensus sequences and ab initio methods that rely on specific structural features of the elements (e.g. see Estill and Bennetzen 2009 Plant Methods 5: 8; Quesneville et al. 2005 PLoS Comput Biol. 1: 166). A deeper analysis of the repeat content of this genome would require such a combined approach.
Authors' response: The low TE content of the Trichoplax genome was first reported in a previous study, where 665 putative TEs were identified using the RepeatMasker program, although no data curation or TE characterization was performed. In a search for nonautonomous TEs in the Trichoplax genome, only five putative nonautonomous DNA transposons with low copy numbers have been identified so far. Besides our tblastx comparison and MITE analysis, we also performed an additional analysis to search for novel LTR retrotransposons based on their structural features. Ten putative LTR retrotransposons were identified using the LTR_FINDER program, and none of them showed protein homology to known LTR retrotransposons or other TEs. Moreover, none of them was present in the genome in more than two copies, so it seems unlikely that these elements are true LTR retrotransposons. Overall, we conclude that the TE content of the Trichoplax genome is indeed very low, and that this observation is robust across a variety of methods.
Only a cursory description of the methods used to search for repeats is provided in the body of the manuscript. An additional description of the methods was provided by the authors upon request. These methodological details need to be included with the submission (perhaps as a supplement?) so that interested readers can more carefully evaluate the research design and the results.
Authors' response: We have now provided a complete description of methods in additional file 1.
There are several statements regarding the relevance and the impact of the findings that are never substantiated or followed up on. For instance, in the abstract the authors state that "the unexpected abundance of [the ACAGT] motif makes this an attractive target for future studies into animal development and genome evolution." And in the body of the manuscript they claim that "Identification of the new phylogenetic clade, Tag may provide new insights into the origin and diversification of the gypsy-like LTR retrotransposons in metazoan genomes." It is not clear what either of these strictly descriptive findings regarding Trichoplax genome repeats reveals about the organism's evolution or development. How does the abundance of a short tandem repeat relate to the development of this organism? What does the discovery of a new gypsy clade, nested squarely within the diversity of existing gypsy-like sequences, tell us about the origin and diversification of the group?
Authors' response: 1) We added more analyses to explore the potential functions of the ACAGT motif. We have rewritten the discussion in the light of the new results. See the following:
Many studies have shown that microsatellites can serve as transcription factor binding sites (TFBSs) to regulate gene expression [for a review, see ]. To evaluate whether the ACAGT motif could be a potential TFBS, we investigated the association pattern of the ACAGT motif and its downstream nearby genes (Fig. 2, more info in additional file 5). Strikingly, 54% and 85% of ACAGT motifs were located within 1 kb and 5 kb upstream of nearby genes, respectively, suggesting a potential role for the ACAGT motif in regulating the expression of nearby genes. Further gene ontology (GO) enrichment analysis of ACAGT-associated genes (GO term level = 6 and distance threshold = 5 kb) revealed that 58 genes were significantly enriched in the GO term translation (adjusted p < 0.021), and 155 in the protein modification process (adjusted p < 0.0064). Most of the translation genes encode ribosomal proteins, and most of the protein modification genes encode kinases. Kinases are known to regulate the majority of cellular pathways, especially those involved in signal transduction. A previous study has shown that the Trichoplax genome encodes a rich array of transcription factors and signaling pathways that are typically associated with eumetazoan developmental patterning and cell-type specification. It would therefore be interesting to explore the potential roles of the ACAGT motif in these biological processes in the future.
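The motif-to-gene association reported above rests on measuring how far each motif lies upstream of a nearby gene. A minimal Python sketch of that kind of calculation follows. It is not the authors' procedure: the coordinate tuples, the single-scaffold simplification, the choice of all motifs as the denominator, and the function names are all assumptions made for illustration.

```python
def upstream_distance(motif, genes):
    """Smallest distance from a motif to a gene it lies upstream of, or None.

    motif: (start, end); genes: iterable of (start, end, strand) on the same
    scaffold, with start < end. "Upstream" is taken relative to gene strand.
    """
    m_start, m_end = motif
    best = None
    for g_start, g_end, strand in genes:
        if strand == "+" and m_end < g_start:
            distance = g_start - m_end
        elif strand == "-" and m_start > g_end:
            distance = m_start - g_end
        else:
            continue
        if best is None or distance < best:
            best = distance
    return best


def fraction_within(motifs, genes, threshold):
    """Fraction of all motifs lying at most threshold bp upstream of a gene."""
    if not motifs:
        return 0.0
    hits = 0
    for motif in motifs:
        distance = upstream_distance(motif, genes)
        if distance is not None and distance <= threshold:
            hits += 1
    return hits / len(motifs)
```

Evaluating `fraction_within` at 1000 and 5000 bp gives the two proportions of interest; for a whole genome the gene list would be indexed per scaffold and sorted, since the linear scan used here is only suitable for small examples.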
2) The phylogenetic tree of the gypsy group presented in this study is an unrooted tree, and thus no ancestry information can be inferred from it. No outgroup was included in the phylogenetic analysis because we could not obtain a well-defined phylogeny based on the limited RT protein sequences when outgroup sequences were added. However, since most placozoan gypsy-like elements belong to the Tag clade, and this new clade has not been identified before in other metazoan genomes, this may suggest that much of the diversity among gypsy-like clades emerged after the divergence of Trichoplax from other metazoan lineages. It is also possible that the Tag clade represents an ancestral gypsy clade that is still preserved in the Trichoplax genome. Further identification and characterization of full-length elements belonging to the Tag clade from other eukaryotic organisms would help clarify this situation, and may also provide new insights into the origin and diversification of the gypsy-like LTR retrotransposons.
In summary, given the abundance of new genomes that are constantly being sequenced, one has to wonder about the need to publish a description of the repeat content, or other specific aspects of genome architecture, in each case. It would seem that to justify such an exercise, the work must provide some fundamental new insight or at the very least clearly address a specific hypothesis. This report does not meet those standards, and so I am left to wonder as to the potential impact and overall relevance of the work.
Authors' response: In the revised MS, we present new analyses and discussion to expand on the previous statements. We feel that the unusual features of repeat content in Trichoplax are noteworthy, and that our analysis provides a useful review of those features and calls attention to a particular sequence motif that appears to be significantly associated with translation and signaling genes. We hope the reviewer will find that our MS has been improved enough to justify its publication as a Discovery Note.
Reviewer 2's second review:
I am satisfied with the authors' responses to my comments and support publication of the revised manuscript as a Discovery Note in Biology Direct.
LTR: long terminal repeat
ORF: open reading frame
TIR: terminal inverted repeat
EST: expressed sequence tag
MITE: miniature inverted-repeat transposable element
TSD: target site duplication
TFBS: transcription factor binding site
We thank Dr. Fengfeng Zhou (University of Georgia) for the help in the MITE analysis. We also thank Dr. Mikhail V. Matz (University of Texas at Austin) for useful comments on the early draft of this manuscript.
- Srivastava M, Begovic E, Chapman J, Putnam NH, Hellsten U, Kawashima T, Kuo A, Mitros T, Salamov A, Carpenter ML, Signorovitch AY, Moreno MA, Kamm K, Grimwood J, Schmutz J, Shapiro H, Grigoriev IV, Buss LW, Schierwater B, Dellaporta SL, Rokhsar DS: The Trichoplax genome and the nature of placozoans. Nature. 2008, 454: 955-960. 10.1038/nature07191.
- Schierwater B, Eitel M, Jakob W, Osigus H-J, Hadrys H, Dellaporta SL, Kolokotronis S-O, DeSalle R: Concatenated analysis sheds light on early metazoan evolution and fuels a modern "Urmetazoon" hypothesis. PLoS Biol. 2009, 7: e1000020. 10.1371/journal.pbio.1000020.
- Dunn CW, Hejnol A, Matus DQ, Pang K, Browne WE, Smith SA, Seaver E, Rouse GW, Obst M, Edgecombe GD, Sorensen MV, Haddock SHD, Schmidt-Rhaesa A, Okusu A, Kristensen RM, Wheeler WC, Martindale MQ, Giribet G: Broad phylogenomic sampling improves resolution of the animal tree of life. Nature. 2008, 452: 745-749. 10.1038/nature06614.
- Schierwater B: My favorite animal, Trichoplax adhaerens. BioEssays. 2005, 27: 1294-1302. 10.1002/bies.20320.
- Signorovitch AY, Dellaporta SL, Buss LW: Molecular signatures for sex in the Placozoa. Proc Natl Acad Sci USA. 2005, 102: 15518-15522. 10.1073/pnas.0504031102.
- Kazazian HH: Mobile elements: drivers of genome evolution. Science. 2004, 303: 1626-1632. 10.1126/science.1089670.
- Wessler SR: Eukaryotic transposable elements: teaching old genomes new tricks. The Implicit Genome. Edited by: Caporale L. 2006, New York: Oxford University Press, 138-165.
- Jurka J, Kapitonov VV, Kohany O, Jurka MV: Repetitive sequences in complex genomes: structure and evolution. Annu Rev Genomics Hum Genet. 2007, 8: 241-259. 10.1146/annurev.genom.8.080706.092416.
- Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J: Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005, 110: 462-467. 10.1159/000084979.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
- Xu Z, Wang H: LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007, 35: W265-W268. 10.1093/nar/gkm286.
- Feschotte C, Zhang X, Wessler SR: Miniature inverted-repeat transposable elements and their relationship with established DNA transposons. Mobile DNA II. Edited by: Craig N, Craigie R, Gellert M, Lambowitz A. 2002, WA: American Society of Microbiology Press, 1147-1158.
- Chen Y, Zhou F, Li G, Xu Y: MUST: A system for identification of miniature inverted-repeat transposable elements and applications to Anabaena variabilis and Haloquadratum walsbyi. Gene. 2009, 436: 1-7. 10.1016/j.gene.2009.01.019.
- Jurka J: DNA transposons from Trichoplax adhaerens. Repbase Rep. 2009, 9: 2144-2148.
- Malik H, Eickbush TH: Modular evolution of the integrase domain in the Ty3/Gypsy class of LTR retrotransposons. J Virol. 1999, 73: 5186-5190.
- Bae Y-A, Moon S-Y, Kong Y, Cho S-Y, Rhyu M-G: CsRn1, a novel active retrotransposon in a parasitic trematode, Clonorchis sinensis, discloses a new phylogenetic clade of Ty3/gypsy-like LTR retrotransposons. Mol Biol Evol. 2001, 18: 1474-1483.
- Kofler R, Schlotterer C, Lelley T: SciRoKo: a new tool for whole genome microsatellite search and investigation. Bioinformatics. 2007, 23: 1683-1685. 10.1093/bioinformatics/btm157.
- Toth G, Gaspari Z, Jurka J: Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 2000, 10: 967-981. 10.1101/gr.10.7.967.
- Li Y-C, Korol AB, Fahima T, Beiles A, Nevo E: Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Mol Ecol. 2002, 11: 2453-2465. 10.1046/j.1365-294X.2002.01643.x.
- Ellegren H: Microsatellites: simple sequences with complex evolution. Nat Rev Genet. 2004, 5: 435-445. 10.1038/nrg1348.
- Sharma PC, Grover A, Kahl G: Mining microsatellites in eukaryotic genomes. Trends Biotechnol. 2007, 25: 490-498. 10.1016/j.tibtech.2007.07.013.
- Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27: 573-580. 10.1093/nar/27.2.573.
- Zhang L, Chen C, Cheng J, Wang S, Hu X, Hu J, Bao Z: Initial analysis of tandemly repetitive sequences in the genome of Zhikong scallop (Chlamys farreri Jones et Preston). DNA Seq. 2008, 19: 195-205. 10.1080/10425170701462316.
- The C. elegans Sequencing Consortium: Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998, 282: 2012-2018. 10.1126/science.282.5396.2012.
- International Human Genome Sequencing Consortium: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
- Adams MD, et al: The genome sequence of Drosophila melanogaster. Science. 2000, 287: 2185-2195. 10.1126/science.287.5461.2185.
- Katti MV, Ranjekar PK, Gupta V: Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol Biol Evol. 2001, 18: 1161-1167.
- Smith CD, Shu S, Mungall CJ, Karpen GH: The release 5.1 annotation of Drosophila melanogaster heterochromatin. Science. 2007, 316: 1586-1591. 10.1126/science.1139815.
- Crollius HR, Jaillon O, Dasilva C, Ozouf-Costaz C, Fizames C, Fischer C, Bouneau L, Billault A, Quetier F, Saurin W, Bernot A, Weissenbach J: Characterization and repeat analysis of the compact genome of the freshwater pufferfish Tetraodon nigroviridis. Genome Res. 2000, 10: 939-949. 10.1101/gr.10.7.939.
- Bouneau L, Fischer C, Ozouf-Costaz C, Froschauer A, Jaillon O, Coutanceau J-P, Korting C, Weissenbach J, Bernot A, Volff J-N: An active non-LTR retrotransposon with tandem structure in the compact genome of the pufferfish Tetraodon nigroviridis. Genome Res. 2003, 13: 1686-1695. 10.1101/gr.726003.
- Jaillon O, et al: Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature. 2004, 431: 946-957. 10.1038/nature03025.
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673.
- Huelsenbeck JP, Ronquist F: Bayesian inference of phylogenetic trees. Bioinformatics. 2001, 17: 754-755. 10.1093/bioinformatics/17.8.754.
- Whelan S, Goldman N: A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol. 2001, 18: 691-699.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.