Abstract
Summary: To enable mass spectrometry (MS)-based proteomic studies with poorly characterized organisms, we developed a computational workflow for the homology-driven assembly of a non-redundant reference sequence dataset. In the automated pipeline, translated DNA sequences (e.g. ESTs, RNA deep-sequencing data) are aligned to those of a closely related and fully sequenced organism. Representative sequences are derived from each cluster and joined, resulting in a non-redundant reference set representing the maximal available amino acid sequence information for each protein. Here, we applied this workflow, NOmESS, to assemble a reference database for the widely used model organism Xenopus laevis and demonstrate its use in proteomic applications.
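
The cluster-and-join idea described in the abstract can be illustrated with a toy sketch. This is not the NOmESS implementation: it assumes the alignments have already been computed and are ungapped, and the function name `assemble_reference`, the `(ref_id, ref_start, fragment)` hit format, and the `X` placeholder for uncovered positions are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def assemble_reference(hits):
    """Join translated fragments into one sequence per reference protein.

    hits: iterable of (ref_id, ref_start, fragment), where ref_start is the
    0-based position on the reference protein at which the fragment aligns.
    Assumption: alignments are ungapped, so residue i of a fragment maps to
    reference position ref_start + i.
    """
    # Cluster fragments by the reference protein they align to.
    clusters = defaultdict(list)
    for ref_id, ref_start, fragment in hits:
        clusters[ref_id].append((ref_start, fragment))

    assembled = {}
    for ref_id, placed in clusters.items():
        columns = {}
        # Place the longest fragments first so they take precedence
        # wherever overlapping fragments disagree (an arbitrary rule).
        for start, fragment in sorted(placed, key=lambda p: -len(p[1])):
            for offset, residue in enumerate(fragment):
                columns.setdefault(start + offset, residue)
        # Join covered positions; mark uncovered stretches with 'X'.
        first, last = min(columns), max(columns)
        assembled[ref_id] = "".join(
            columns.get(i, "X") for i in range(first, last + 1)
        )
    return assembled


if __name__ == "__main__":
    example_hits = [
        ("P1", 0, "MKT"),
        ("P1", 2, "TAYI"),
        ("P1", 8, "GG"),
        ("P2", 5, "LLV"),
    ]
    for ref_id, seq in assemble_reference(example_hits).items():
        print(ref_id, seq)
```

Each reference protein yields exactly one joined entry, which is what makes the resulting set non-redundant. A real pipeline would also need to handle gapped alignments, frame selection during translation, and conflicting residues more carefully than this sketch does.
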
Original language | English |
---|---|
Journal | Bioinformatics (Online) |
Volume | 32 |
Issue number | 9 |
Pages (from-to) | 1417-9 |
Number of pages | 3 |
ISSN | 1367-4811 |
Publication status | Published - 1 May 2016 |
Externally published | Yes |
Keywords
- Amino Acid Sequence
- Animals
- Base Sequence
- High-Throughput Nucleotide Sequencing
- Humans
- Mass Spectrometry
- Proteomics
- Journal Article