TY - JOUR
T1 - eggNOG 5.0
T2 - a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses
AU - Huerta-Cepas, Jaime
AU - Szklarczyk, Damian
AU - Heller, Davide
AU - Hernández-Plaza, Ana
AU - Forslund, Sofia K
AU - Cook, Helen
AU - Mende, Daniel R
AU - Letunic, Ivica
AU - Rattei, Thomas
AU - Jensen, Lars J
AU - von Mering, Christian
AU - Bork, Peer
PY - 2019/01/08/
Y1 - 2019/01/08/
AB - eggNOG is a public database of orthology relationships, gene evolutionary histories and functional annotations. Here, we present version 5.0, featuring a major update of the underlying genome sets, which have been expanded to 4445 representative bacteria and 168 archaea derived from 25 038 genomes, as well as 477 eukaryotic organisms and 2502 viral proteomes that were selected for diversity and filtered by genome quality. In total, 4.4M orthologous groups (OGs) distributed across 379 taxonomic levels were computed together with their associated sequence alignments, phylogenies, HMM models and functional descriptors. Precomputed evolutionary analysis provides fine-grained resolution of duplication/speciation events within each OG. Our benchmarks show that, despite doubling the amount of genomes, the quality of orthology assignments and functional annotations (80% coverage) has persisted without significant changes across this update. Finally, we improved eggNOG online services for fast functional annotation and orthology prediction of custom genomics or metagenomics datasets. All precomputed data are publicly available for downloading or via API queries at http://eggnog.embl.de.
U2 - 10.1093/nar/gky1085
DO - 10.1093/nar/gky1085
M3 - Journal article
C2 - 30418610
SN - 0305-1048
VL - 47
SP - D309-D314
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - D1
ER -