TY - JOUR
T1 - Contaminating viral sequences in high-throughput sequencing viromics
T2 - a linkage study of 700 sequencing libraries
AU - Asplund, Maria
AU - Kjartansdóttir, Kristín Rós
AU - Mollerup, Sarah
AU - Vinner, Lasse
AU - Fridholm, Helena
AU - Herrera, José A R
AU - Friis-Nielsen, Jens
AU - Hansen, Thomas Arn
AU - Jensen, Randi Holm
AU - Nielsen, Ida Broman
AU - Richter, Stine Raith
AU - Rey-Iglesia, Alba
AU - Matey-Hernandez, Maria Luisa
AU - Alquezar-Planas, David E
AU - Olsen, Pernille V S
AU - Sicheritz-Pontén, Thomas
AU - Willerslev, Eske
AU - Lund, Ole
AU - Brunak, Søren
AU - Mourier, Tobias
AU - Nielsen, Lars Peter
AU - Izarzugaza, Jose M G
AU - Hansen, Anders Johannes
PY - 2019/10
Y1 - 2019/10
AB - Objectives: Sample preparation for high-throughput sequencing (HTS) includes treatment with various laboratory components, potentially carrying viral nucleic acids, the extent of which has not been thoroughly investigated. Our aim was to systematically examine a diverse repertoire of laboratory components used to prepare samples for HTS in order to identify contaminating viral sequences. Methods: A total of 322 samples of mainly human origin were analysed using eight protocols, applying a wide variety of laboratory components. Several samples (60% of human specimens) were processed using different protocols. In total, 712 sequencing libraries were investigated for viral sequence contamination. Results: Among sequences showing similarity to viruses, 493 were significantly associated with the use of laboratory components. Each of these viral sequences had sporadic appearance, only being identified in a subset of the samples treated with the linked laboratory component, and some were not identified in the non-template control samples. Remarkably, more than 65% of all viral sequences identified were within viral clusters linked to the use of laboratory components. Conclusions: We show that high prevalence of contaminating viral sequences can be expected in HTS-based virome data and provide an extensive list of novel contaminating viral sequences that can be used for evaluation of viral findings in future virome and metagenome studies. Moreover, we show that detection can be problematic due to stochastic appearance and limited non-template controls. Although the exact origin of these viral sequences requires further research, our results support laboratory-component-linked viral sequence contamination of both biological and synthetic origin.
U2 - 10.1016/j.cmi.2019.04.028
DO - 10.1016/j.cmi.2019.04.028
M3 - Journal article
C2 - 31059795
SN - 1198-743X
VL - 25
SP - 1277
EP - 1285
JO - Clinical Microbiology and Infection
JF - Clinical Microbiology and Infection
IS - 10
ER -