Abstract
High-throughput sequencing has made it relatively simple to sequence the genomes and transcriptomes of individuals from many species. Analyzing the resulting data requires high-quality reference genome assemblies. Producing such assemblies remains a major challenge, however, and many domesticated animal genomes still need to be sequenced more deeply to yield them. Meanwhile, ironically, the volume of RNAseq and other next-generation sequencing data produced frequently far exceeds that of the genomic sequence. Furthermore, even basic comparative analysis is often hampered by the lack of genomic sequence. Herein, we quantify the quality of the genome assemblies of 20 domesticated animals and related species by assessing a range of measurable parameters, and we show that the fraction of mappable reads from RNAseq data correlates positively with genome assembly quality. We rank the genomes by their assembly quality and discuss the implications for genotype analyses.
Original language | English |
---|---|
Journal | Bioinformatics and Biology Insights |
Volume | 9 |
Issue number | Suppl 4 |
Pages (from-to) | 49-58 |
Number of pages | 10 |
ISSN | 1177-9322 |
DOI | |
Status | Published - 1 Jan 2015 |