Data from: Enamel proteins reveal biological sex and genetic variability within southern African Paranthropus

Dataset

Description

This dataset contains the Paranthropus robustus sequences first described in 'Enamel proteins reveal biological sex and genetic variability within southern African Paranthropus', together with the reference data and all results from the analysis of those sequences.

Folders and Sub-Folders:

- Paranthropus_Raw_AA_Sequences_Unaligned: Contains 2 fasta files.
  - Paranthropus_Unaligned.fasta contains all the Paranthropus robustus sequences that were used for all of the analyses.
  - Paranthropus_Unaligned_UNFILTERED.fasta contains all the Paranthropus robustus sequences before filtering for SAP quality/confidence. The unfiltered sequences were not used in any of the analyses, but are provided here for openness.
- Reference_Datasets: Contains 3 fasta files. Each fasta file is a reference dataset used in at least one analysis. The identity and origin of each sample are described in the supplementary document of the publication.
- Phylogenetic_Analysis_Datasets_and_Trees: Contains the following 4 folders:
  - Paranthropus_Alignments_All_Datasets: Contains 3 folders. Each folder contains the aligned and I/L corrected MSAs (Multiple Sequence Alignments) of Paranthropus robustus and a reference dataset.
  - Paranthropus_Diversity_Dataset_Trees_Results: Contains all analyses done using the 'diversity' reference dataset. Contains one folder for each protein, which includes the protein alignment and the phylogenetic tree of that protein. Additionally, a folder named 'CONCATENATED' contains the concatenated alignments and trees. The BEAST2-STARBEAST3 folder contains the StarBeast3 analysis, including the xml, output log file, output trees and the input taxon set file.
  - Paranthropus_Representative_Dataset_Trees_Results: Contains all analyses done using the 'representative' reference dataset. Contains one folder for each protein, which includes the protein alignment and the phylogenetic tree of that protein. Additionally, a folder named 'CONCATENATED' contains the concatenated alignments and trees. The BEAST2 folder contains the time-calibrated BEAST2 analysis, including the xml, output log file and output trees. The folder Distance_Matrix contains the generated distance matrix and the Rscript used to generate the heatmap from it.
  - Paranthropus_Independent_Dataset_Trees_Results: Contains all nexus files and tree figures used in the analysis of the 'independent' reference dataset.
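
The fasta files listed above can be inspected without any specialised software. The following is a minimal sketch, not part of the dataset itself, that reads a fasta file into (header, sequence) pairs using only the Python standard library. The function names parse_fasta and read_fasta are hypothetical, and the sketch assumes the files follow the usual fasta layout of a '>' header line followed by one or more sequence lines.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence joined into one string)


def parse_fasta(lines: Iterable[str]) -> List[Record]:
    """Parse fasta-formatted lines into a list of (header, sequence) pairs.

    Assumes the conventional layout: a header line starting with '>'
    followed by one or more sequence lines. Blank lines are ignored.
    A list is returned (not a dict) so duplicate headers are preserved.
    """
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Flush the final record.
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def read_fasta(path: str) -> List[Record]:
    """Read a fasta file from disk; 'path' is supplied by the caller."""
    with open(path, encoding="utf-8") as handle:
        return parse_fasta(handle)
```

For example, assuming the archive has been unpacked into the current directory, read_fasta("Paranthropus_Raw_AA_Sequences_Unaligned/Paranthropus_Unaligned.fasta") would return one pair per sequence, and len(sequence) then gives the number of residues in each.
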
Date made available: 2024
Publisher: Zenodo
