How to run AlphaFold and RoseTTAFold on CCI
Get an account under the PMAR project and set up your login: https://github.rpi.edu/RPIBioinformatics/PMAR/blob/main/CCIForms.md
-
ssh PMARxxxx@blp02.ccni.rpi.edu, authenticating with Google Authenticator
-
ssh nplfen01 to access the NPL GPUs (check https://docs.cci.rpi.edu/status/ for the utilization of all clusters)
-
Create or upload your FASTA file on nplfen01 (e.g. T1088.fasta)
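A FASTA file is a header line starting with ">" followed by the sequence on one or more lines. As a minimal sketch, the file can be created directly on nplfen01 with a heredoc. The sequence below is a made-up placeholder, not the real T1088 sequence, and the command overwrites any existing T1088.fasta in the current directory.

```shell
# Write a single-sequence FASTA file: one header line, then the residues.
# NOTE: the sequence is a hypothetical placeholder; paste your own target here.
cat > T1088.fasta <<'EOF'
>T1088
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV
EOF

# Sanity check: count the header lines (one per sequence in the file).
grep -c '^>' T1088.fasta
```

A count above one means the file holds several sequences, which run_alphafold.sh treats as a multimer according to its usage message.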
-
If you modified .bashrc for our first version of the AlphaFold setup, you will need to remove that setup from .bashrc (example included):
* Edit .bashrc and delete everything that was added for AlphaFold below "# User specific aliases and functions". This assumes everything was added below that comment, since it is the default last line of the .bashrc file.
* Log out and back in; you will now have a clean environment.
example: https://github.rpi.edu/RPIBioinformatics/PMAR/blob/main/.bashrc
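The cleanup can also be scripted. The function below is a hypothetical helper, not part of the PMAR setup: it keeps a backup copy and then trims the file back to the marker comment. It relies on the same assumption as the manual steps, namely that everything below the marker belongs to the old AlphaFold setup, so anything else you added there is removed as well.

```shell
# Hypothetical helper: trim a .bashrc back to the default marker comment,
# keeping a backup of the original next to it.
clean_bashrc() {
    rc="${1:-$HOME/.bashrc}"
    marker='# User specific aliases and functions'

    # Leave the file alone if no line starts with the marker.
    if ! grep -q "^$marker" "$rc"; then
        echo "marker not found in $rc; leaving it unchanged" >&2
        return 1
    fi

    cp -p "$rc" "$rc.alphafold.bak"
    # Print lines up to and including the first marker line, then stop.
    sed "/^$marker/q" "$rc.alphafold.bak" > "$rc"
}
```

Calling clean_bashrc with no argument acts on ~/.bashrc; the original is kept as ~/.bashrc.alphafold.bak. Log out and back in afterwards.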
-
Create two files: the FASTA file (e.g. T1088.fasta) and the sbatch script (e.g. run_T1088_af2.sh for AlphaFold, run_T1088_rf.sh for RoseTTAFold)
- To run AlphaFold: at the beginning of your script (e.g. run_T1088_af2.sh), add the line below:
source /gpfs/u/barn/PMAR/shared/etc/212_alphaFOLD
- To run RoseTTAFold: at the beginning of your script (e.g. run_T1088_rf.sh), add the line below:
source /gpfs/u/barn/PMAR/shared/etc/rosettaFOLD
-
Submit the job:
sbatch run_T1088_af2.sh or sbatch run_T1088_rf.sh
example:
https://github.rpi.edu/RPIBioinformatics/PMAR/blob/main/run_T1088_af2.sh
https://github.rpi.edu/RPIBioinformatics/PMAR/blob/main/run_T1088_rf.sh
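To submit and then check on the job in one step, a small wrapper around the standard Slurm commands can help. The function name submit_job is hypothetical; sbatch --parsable and squeue -j are regular Slurm options.

```shell
# Hypothetical helper: submit a batch script and show its queue entry.
submit_job() {
    script="$1"
    # --parsable makes sbatch print "jobid" or "jobid;cluster" and nothing else.
    jobid=$(sbatch --parsable "$script") || return 1
    jobid=${jobid%%;*}   # drop the ";cluster" suffix if present
    echo "submitted $script as job $jobid"
    squeue -j "$jobid"
}
```

For example, submit_job run_T1088_af2.sh. Unless the script sets its own --output, Slurm writes the job's log to slurm-<jobid>.out in the directory you submitted from.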
For the AlphaFold example, the output is located in ~/scratch-shared/alphafoldout/PMARhnyn/T1088. I created the PMARhnyn folder, and all my AlphaFold output goes under ~/scratch-shared/alphafoldout/PMARhnyn. You will need to create your own output folder under ~/scratch-shared/alphafoldout, or point -o somewhere else.
For the RoseTTAFold example, the output is located in a folder called T1088rf_out in the home directory. You may also put the results under ~/scratch-shared.
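Creating your own AlphaFold output folder can be sketched as follows. The function name make_af_outdir is hypothetical, and naming the folder after your username simply mirrors the PMARhnyn example above; any folder name works.

```shell
# Hypothetical helper: create (if needed) and print a per-user output folder.
# The default root is the shared location named above; pass another root to override.
make_af_outdir() {
    root="${1:-$HOME/scratch-shared/alphafoldout}"
    user="${USER:-$(id -un)}"
    mkdir -p "$root/$user" && echo "$root/$user"
}
```

For example, OUT_DIR=$(make_af_outdir), then pass -o "$OUT_DIR" to run_alphafold.sh. Judging from the example output path, the results then land in a subfolder named after the FASTA file (T1088).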
A non-Docker version of AlphaFold is installed on CCI.
GitHub site for the Docker version: https://github.com/deepmind/alphafold
Running run_alphafold.sh with no arguments prints:
Please make sure all required parameters are given
Usage: /gpfs/u/home/PMAR/PMARrbtj/barn-shared/alphafold/run_alphafold.sh <OPTIONS>
Required Parameters:
-d <data_dir> Path to directory of supporting data
-o <output_dir> Path to a directory that will store the results.
-f <fasta_path> Path to a FASTA file containing sequence. If a FASTA file contains multiple sequences, then it will be folded as a multimer
-t <max_template_date> Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets
Optional Parameters:
-g <use_gpu> Enable NVIDIA runtime to run with GPUs (default: true)
-n <openmm_threads> OpenMM threads (default: all available cores)
-a <gpu_devices> Comma separated list of devices to pass to 'CUDA_VISIBLE_DEVICES' (default: 0)
-m <model_preset> Choose preset model configuration - the monomer model, the monomer model with extra ensembling, monomer model with pTM head, or multimer model (default: 'monomer')
-c <db_preset> Choose preset MSA database configuration - smaller genetic database config (reduced_dbs) or full genetic database config (full_dbs) (default: 'full_dbs')
-p <use_precomputed_msas> Whether to read MSAs that have been written to disk. WARNING: This will not check if the sequence, database or configuration have changed (default: 'false')
-l <is_prokaryote> Optional for multimer system, not used by the single chain system. A boolean specifying true where the target complex is from a prokaryote, and false where it is not, or where the origin is unknown. This value determine the pairing method for the MSA (default: 'None')
-b <benchmark> Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time required for inferencing many proteins (default: 'false')
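Putting the source line and the four required flags together, an AlphaFold batch script might look like the sketch below. It is written as a heredoc so it can be pasted into a terminal, and it uses a separate file name so it does not overwrite your own script. Several values are assumptions rather than CCI settings: the #SBATCH resource requests, the DATA_DIR placeholder, and the template date. The script path is copied from the usage message and may differ for your account; see the linked run_T1088_af2.sh for the version actually used.

```shell
# Write a sketch of an AlphaFold batch script. Values marked ASSUMPTION are placeholders.
cat > run_T1088_af2_sketch.sh <<'EOF'
#!/bin/bash
# ASSUMPTION: job name, GPU count and time limit are illustrative; adjust to your queue.
#SBATCH --job-name=T1088_af2
#SBATCH --gres=gpu:1
#SBATCH --time=06:00:00

# Load the AlphaFold environment provided for the PMAR project.
source /gpfs/u/barn/PMAR/shared/etc/212_alphaFOLD

# ASSUMPTION: placeholder path; point this at the supporting-data directory on CCI.
DATA_DIR=/path/to/alphafold/data
OUT_DIR="$HOME/scratch-shared/alphafoldout/$USER"
mkdir -p "$OUT_DIR"

# Script path copied from the usage message; it may differ for your account.
AF=/gpfs/u/home/PMAR/PMARrbtj/barn-shared/alphafold/run_alphafold.sh

# Required flags only; the -t value is an arbitrary example date (YYYY-MM-DD).
"$AF" -d "$DATA_DIR" -o "$OUT_DIR" -f T1088.fasta -t 2022-01-01
EOF
```

Optional flags such as -m multimer or -c reduced_dbs can be appended to the last line as described in the usage message.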
RoseTTAFold
https://github.com/RosettaCommons/RoseTTAFold
AlphaFold Colab (free service from Google)
A slightly simplified version of AlphaFold v2.1.0 that uses no templates and a selected portion of the BFD database
https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb