TY - JOUR
T1 - Species-level para- and polyphyly in DNA barcode gene trees
T2 - strong operational bias in European Lepidoptera
AU - Mutanen, Marko
AU - Kivelä, Sami M.
AU - Vos, Rutger A.
AU - Doorenweerd, Camiel
AU - Ratnasingham, Sujeevan
AU - Hausmann, Axel
AU - Huemer, Peter
AU - Dincă, Vlad
AU - van Nieukerken, Erik J.
AU - Lopez-Vaamonde, Carlos
AU - Vila, Roger
AU - Aarvik, Leif
AU - Decaëns, Thibaud
AU - Efetov, Konstantin A.
AU - Hebert, Paul D. N.
AU - Johnsen, Arild
AU - Karsholt, Ole
AU - Pentinsaari, Mikko
AU - Rougerie, Rodolphe
AU - Segerer, Andreas
AU - Tarmann, Gerhard
AU - Zahiri, Reza
AU - Godfray, H. Charles J.
N1 - © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
PY - 2016/11
Y1 - 2016/11
N2 - The proliferation of DNA data is revolutionizing all fields of systematic research. DNA barcode sequences, now available for millions of specimens and several hundred thousand species, are increasingly used in algorithmic species delimitations. This is complicated by occasional incongruences between species and gene genealogies, as indicated by situations where conspecific individuals do not form a monophyletic cluster in a gene tree. In two previous reviews, nonmonophyly has been reported as being common in mitochondrial DNA gene trees. We developed a novel web service "Monophylizer" to detect non-monophyly in phylogenetic trees and used it to ascertain the incidence of species nonmonophyly in COI (a.k.a. cox1) barcode sequence data from 4977 species and 41,583 specimens of European Lepidoptera, the largest data set of DNA barcodes analyzed from this regard. Particular attention was paid to accurate species identification to ensure data integrity. We investigated the effects of tree-building method, sampling effort, and other methodological issues, all of which can influence estimates of non-monophyly. We found a 12% incidence of non-monophyly, a value significantly lower than that observed in previous studies. Neighbor joining (NJ) and maximum likelihood (ML) methods yielded almost equal numbers of non-monophyletic species, but 24.1% of these cases of non-monophyly were only found by one of these methods. Non-monophyletic species tend to show either low genetic distances to their nearest neighbors or exceptionally high levels of intraspecific variability. Cases of polyphyly in COI trees arising as a result of deep intraspecific divergence are negligible, as the detected cases reflected misidentifications or methodological errors. Taking into consideration variation in sampling effort, we estimate that the true incidence of non-monophyly is ~23%, but with operational factors still being included. Within the operational factors, we separately assessed the frequency of taxonomic limitations (presence of overlooked cryptic and oversplit species) and identification uncertainties. We observed that operational factors are potentially present in more than half (58.6%) of the detected cases of non-monophyly. Furthermore, we observed that in about 20% of non-monophyletic species and entangled species, the lineages involved are either allopatric or parapatric - conditions where species delimitation is inherently subjective and particularly dependent on the species concept that has been adopted. These observations suggest that species-level non-monophyly in COI gene trees is less common than previously supposed, with many cases reflecting misidentifications, the subjectivity of species delimitation or other operational factors. [DNA barcoding; gene tree; Lepidoptera; mitochondrial COI; mitochondrial cox1; paraphyly; polyphyly; species delimitation; species monophyly.]
AB - The proliferation of DNA data is revolutionizing all fields of systematic research. DNA barcode sequences, now available for millions of specimens and several hundred thousand species, are increasingly used in algorithmic species delimitations. This is complicated by occasional incongruences between species and gene genealogies, as indicated by situations where conspecific individuals do not form a monophyletic cluster in a gene tree. In two previous reviews, nonmonophyly has been reported as being common in mitochondrial DNA gene trees. We developed a novel web service "Monophylizer" to detect non-monophyly in phylogenetic trees and used it to ascertain the incidence of species nonmonophyly in COI (a.k.a. cox1) barcode sequence data from 4977 species and 41,583 specimens of European Lepidoptera, the largest data set of DNA barcodes analyzed from this regard. Particular attention was paid to accurate species identification to ensure data integrity. We investigated the effects of tree-building method, sampling effort, and other methodological issues, all of which can influence estimates of non-monophyly. We found a 12% incidence of non-monophyly, a value significantly lower than that observed in previous studies. Neighbor joining (NJ) and maximum likelihood (ML) methods yielded almost equal numbers of non-monophyletic species, but 24.1% of these cases of non-monophyly were only found by one of these methods. Non-monophyletic species tend to show either low genetic distances to their nearest neighbors or exceptionally high levels of intraspecific variability. Cases of polyphyly in COI trees arising as a result of deep intraspecific divergence are negligible, as the detected cases reflected misidentifications or methodological errors. Taking into consideration variation in sampling effort, we estimate that the true incidence of non-monophyly is ~23%, but with operational factors still being included. Within the operational factors, we separately assessed the frequency of taxonomic limitations (presence of overlooked cryptic and oversplit species) and identification uncertainties. We observed that operational factors are potentially present in more than half (58.6%) of the detected cases of non-monophyly. Furthermore, we observed that in about 20% of non-monophyletic species and entangled species, the lineages involved are either allopatric or parapatric - conditions where species delimitation is inherently subjective and particularly dependent on the species concept that has been adopted. These observations suggest that species-level non-monophyly in COI gene trees is less common than previously supposed, with many cases reflecting misidentifications, the subjectivity of species delimitation or other operational factors. [DNA barcoding; gene tree; Lepidoptera; mitochondrial COI; mitochondrial cox1; paraphyly; polyphyly; species delimitation; species monophyly.]
U2 - 10.1093/sysbio/syw044
DO - 10.1093/sysbio/syw044
M3 - Journal article
C2 - 27288478
SN - 1063-5157
VL - 65
SP - 1024
EP - 1040
JO - Systematic Biology
JF - Systematic Biology
IS - 6
ER -