Predicting and interpreting large scale mutagenesis data using analyses of protein stability and conservation

  • Magnus H. Høie (Creator)
  • Matteo Cagiada (Creator)
  • Anders Haagen Beck Frederiksen (Creator)
  • Amelie Stein (Creator)
  • Kresten Lindorff-Larsen (Creator)

Data set

Description

Python code, Jupyter notebooks, and data for reproducing the results of the scientific paper "Predicting and interpreting large scale mutagenesis data using analyses of protein stability and conservation" by Magnus H. Høie, M. Cagiada, A. H. B. Frederiksen, A. Stein, and K. Lindorff-Larsen.

Layout:

- figures.ipynb: Jupyter notebook that generates the paper figures
- supp.ipynb: Jupyter notebook that generates the supplemental figures
- scripts.py: Python code used for generating figures
- figures/: Folder containing the saved figures as PDF files
- data.zip: Processed and raw MAVE variant data used for training the machine-learning models and for analysis
- train.sh: Bash script for training and predicting with leave-one-protein-out random forest models
- src/: Python code used for pre-processing the datasets and training the machine-learning models
- environment.yml: Conda environment listing the library dependencies required to run the code

Please refer to the README file for further details. The Python code and Jupyter notebooks are also available on GitHub at: https://github.com/KULL-Centre/papers/tree/main/2021/ML-variants-Hoie-et-al
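The leave-one-protein-out scheme mentioned for train.sh can be illustrated with a minimal sketch. This is not the authors' code: the function name `leave_one_protein_out` and the toy protein labels are hypothetical, and the actual scripts in src/ may organise the splits differently. The idea is that every variant of one protein is held out for testing while a model is trained on the variants of all other proteins, so predictions are never evaluated on a protein seen during training.

```python
from collections import defaultdict


def leave_one_protein_out(protein_ids):
    """Yield (held_out_protein, train_indices, test_indices), one split per protein.

    protein_ids holds one protein label per variant row (hypothetical layout).
    """
    rows_by_protein = defaultdict(list)
    for row, protein in enumerate(protein_ids):
        rows_by_protein[protein].append(row)

    for held_out in sorted(rows_by_protein):
        test_idx = rows_by_protein[held_out]
        # Train on every row that belongs to any other protein.
        train_idx = sorted(
            row
            for protein, rows in rows_by_protein.items()
            if protein != held_out
            for row in rows
        )
        yield held_out, train_idx, test_idx


# Toy example (assumed data): five variants from three proteins.
protein_ids = ["protA", "protA", "protB", "protC", "protB"]
splits = list(leave_one_protein_out(protein_ids))

for held_out, train_idx, test_idx in splits:
    print(f"held out {held_out}: train rows {train_idx}, test rows {test_idx}")
```

In each split, a random forest (or any other model) would be fitted on the training rows and used to predict the held-out protein's variants. scikit-learn's `LeaveOneGroupOut` produces the same kind of grouped split.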
Date made available: 2021
Publisher: Zenodo
