Elucidating aquatic virioplankton diversity and dynamics using high-throughput DNA sequencing

Date
2015
Publisher
University of Delaware
Abstract
Viral infection and lysis are important processes contributing to the diversification and evolution of microbial communities. In marine ecosystems, which cover 70% of the Earth’s surface, there are ~10 million viruses per milliliter of seawater, pointing to an enormous diversity of viruses yet to be explored. An important breakthrough in viral ecology was the application of high-throughput DNA sequencing to entire viral communities. Known as shotgun viral metagenomics, this approach allows access to the majority of viruses, which cannot be maintained in culture. Viral metagenomics has revealed surprising insights into ancient associations between viruses and hosts. However, making quantitative inferences from next-generation sequencing data requires careful evaluation of viral isolation and DNA library preparation techniques. Many library preparation strategies employ some form of amplification to obtain sufficient DNA for sequencing. Biases resulting from these techniques may alter interpretations of the occurrence and abundance of viral populations in metagenome data. In turn, these biases may lead to misinterpretations of viral population dynamics. Recognizing the importance of library construction for metagenomic studies, the biases of two commonly used technologies for preparing viral DNA for sequencing were evaluated. The first technology evaluated, Nextera™, uses a transposase to fragment genomic DNA, followed by limited-cycle PCR to amplify the fragmented DNA while simultaneously adding appropriate barcodes and sequencing primers. A library of a mock community composed of genomic DNA from nine different viruses was prepared using the Nextera technology. Subsequent DNA sequencing revealed coverage-based biases over regions of low or high genomic GC content, likely due to the limited-cycle PCR step.
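As an illustration of how a GC-dependent coverage bias of this kind might be checked, the following minimal Python sketch bins a genome into fixed windows and compares mean read depth across GC classes. It is not the method used in the dissertation: the window size, the GC thresholds (0.35 and 0.65), the function names, and the toy genome and depth values are all assumptions made for the example.

```python
from statistics import mean


def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc_and_depth(genome: str, depth: list, window: int = 4) -> list:
    """Return (gc_fraction, mean_depth) for each non-overlapping window.

    `depth` holds one read-depth value per genome position. A trailing
    partial window is ignored.
    """
    if len(genome) != len(depth):
        raise ValueError("genome and depth must have the same length")
    pairs = []
    for start in range(0, len(genome) - window + 1, window):
        end = start + window
        pairs.append((gc_fraction(genome[start:end]), mean(depth[start:end])))
    return pairs


def depth_by_gc_class(pairs: list, low: float = 0.35, high: float = 0.65) -> dict:
    """Mean depth of windows grouped into low-, mid-, and high-GC classes.

    The thresholds are illustrative assumptions, not values from the study.
    """
    classes = {"low": [], "mid": [], "high": []}
    for gc, d in pairs:
        key = "low" if gc < low else "high" if gc > high else "mid"
        classes[key].append(d)
    return {name: mean(vals) for name, vals in classes.items() if vals}


# Toy data (hypothetical): an AT-rich, a balanced, and a GC-rich segment,
# with depth deliberately lower over the two GC extremes.
genome = "ATATATAT" + "ACGTACGT" + "GCGCGCGC"
depth = [5] * 8 + [20] * 8 + [6] * 8

pairs = windowed_gc_and_depth(genome, depth, window=4)
summary = depth_by_gc_class(pairs)
print(summary)
```

With real data, the per-position depth would come from a read alignment against each reference genome, and markedly lower mean depth in the low- and high-GC classes than in the mid class would be consistent with the bias described above.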
While the Nextera protocol may have skewed the distribution of low- and high-GC phages in mixed sample communities, the technology was sensitive enough to detect rare members of the mock viral community, and in some cases, complete genomes were successfully reconstructed. Obtaining sufficient amounts of DNA for sequencing is a common challenge in viral metagenomics. Many published techniques have used multiple displacement amplification (MDA), employing the phi29 polymerase, to amplify genomic DNA to microgram quantities before proceeding with library preparation and sequencing. Despite documented biases of this technique, MDA has been commonly used with the assumption that pooling replicate MDA reactions of a single sample alleviates amplification bias. To test this assumption, viral metagenome libraries were constructed from a single mock viral community. The control (unamplified) library was compared to libraries prepared from single and pooled MDA treatments. Sequence coverage of viral genomes was highly uneven in the MDA treatments compared to the unamplified control. Strikingly, coverage patterns for the single and pooled MDA samples were nearly identical, suggesting that amplification biases are reproducible and likely sequence-dependent. Therefore, MDA should be avoided in any study that aims to make quantitative inferences from mixed population samples. Shotgun viral metagenomic sequence libraries (viromes) have revealed a surprising diversity of novel and ancient genes within viral communities. Unlike cellular life, there is no single gene that is universally carried within all viral genomes. However, there are many genes that occur among a wide cross-section of viruses, including genes involved in nucleotide and protein metabolism. Chaperonins, a conserved protein-folding system found in all cellular life, were explored in viral metagenomic data.
Contrary to the low frequency of chaperonin-encoding viruses in sequence databases, a surprising diversity and abundance of viral chaperonins were discovered within viromes representing a range of marine ecosystems. Viral chaperonins were shown to be evolutionarily ancient and are likely indispensable for successful infection. Carrying the small cochaperonin gene, GroES, appeared to be the most common strategy for aquatic viruses. However, populations of large-genome viruses carried complete GroES and GroEL operons, with the phylogeny of large-subunit GroEL genes matching the genomic context of the GroEL gene. Archaeal versions of chaperonins (i.e., thermosomes) were also discovered in viral metagenomes, revealing the presence of tailed archaeal viruses in surface seawaters. The phylogenetic resolution and conservation of GroEL and thermosome genes make them excellent targets for studying the dynamics of large-genome viruses in aquatic environments. Virioplankton communities turn over rapidly, in less than a day in many cases. Thus, it is surprising that so few studies have examined the variability of viral and bacterial community diversity over diel cycles. To address this shortcoming, variations in bacterial and viral populations in replicate mesocosms were examined over a 24-hour period using marker gene sequencing. Distinct changes in relative population abundance, indicative of Kill-the-Winner, predator-prey-type oscillations, were observed for bacterial and viral populations. Viral OTU distributions matched predictions based on the Bank model, with a few abundant viral populations and a large fraction of rare seed-bank populations. The rank abundance of some virioplankton populations changed over the course of a few hours, likely representing r-selected, fast-growing viruses. In contrast, rare populations emerging from the bank were not observed.
This short-term variability in the abundance of viral and host populations is not seen over longer temporal scales such as weeks and months, explaining the common observation of stable viral and microbial communities within marine ecosystems. The results of this study encourage future investigations that incorporate more sampling time points over multiple diel cycles to identify repeating and time-lagged associations between viral and bacterial populations.
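The rank-abundance comparison between sampling time points mentioned in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the analysis performed in the study: the OTU names, the read counts, and the helper functions are hypothetical, and ranks are computed within each sample independently.

```python
def relative_abundance(counts: dict) -> dict:
    """Convert raw OTU read counts to fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {otu: n / total for otu, n in counts.items()}


def ranks(counts: dict) -> dict:
    """Rank OTUs by descending count (1 = most abundant).

    Ties are broken alphabetically by OTU name so the result is deterministic.
    """
    ordered = sorted(counts, key=lambda otu: (-counts[otu], otu))
    return {otu: i + 1 for i, otu in enumerate(ordered)}


def rank_shifts(t0: dict, t1: dict) -> dict:
    """Rank change per OTU found at both time points.

    Positive values mean the OTU moved up the rank-abundance curve
    between t0 and t1; negative values mean it moved down.
    """
    r0, r1 = ranks(t0), ranks(t1)
    return {otu: r0[otu] - r1[otu] for otu in r0 if otu in r1}


# Hypothetical read counts for four viral OTUs at two time points
# a few hours apart.
t0 = {"otu_a": 500, "otu_b": 300, "otu_c": 20, "otu_d": 5}
t1 = {"otu_a": 250, "otu_b": 600, "otu_c": 18, "otu_d": 6}

shifts = rank_shifts(t0, t1)
fractions_t0 = relative_abundance(t0)
print(shifts)
```

In this toy example the two abundant OTUs swap ranks while the rare OTUs keep theirs, which mirrors the pattern described above: changes among a few abundant populations and no emergence of rare populations from the bank. A real analysis would also need to account for sequencing depth and replicate variability before interpreting rank changes.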