CRISPR-Cas-Docker

Help for using CRISPR-Cas-Docker (CCD)

The CCD server docks a CRISPR RNA (crRNA) onto its Cas protein in silico and generates the top 10 docking models using HDOCK. This bioinformatic tool is useful when a prokaryotic genome has multiple CRISPR arrays and Cas systems, so that the optimal crRNA-Cas protein pairs are unclear. As a preliminary study, CCD can predict the RNA-protein interaction of a given crRNA-Cas pair in silico before you commit to time-consuming and expensive in vitro and in vivo experiments. You can either provide experimental 3D structures of your crRNA and Cas protein directly or let CCD use a predicted 3D structure of the crRNA and an AlphaFold-predicted structure of the Cas protein.

When you have no experimental structures of your Cas protein and crRNA:

1. Have your Cas protein sequence ready

You may have a Cas gene cluster from your sequencing experiment or from a public database like CRISPRCasdb (https://crisprcas.i2bc.paris-saclay.fr/MainDb/StrainList). For now, CCD can only predict the 3D structure of Class 2 effector proteins such as Cas9, Cas12, and Cas13, which are more widely used as genomic tools. There are plans to upgrade CCD to handle the multimeric Cas proteins of Class 1.

As an example, you may want to predict the RNA-protein docking for a Cas gene cluster (CAS-TypeII-C) and its neighboring CRISPR array (CP065677_3). This genome is from the bacterium Clostridium perfringens and is listed in CRISPRCasdb with the following information:

Fig.1 - Image capture from the CRISPRCasdb on 2022.12.11

After extracting the DNA sequence of your Cas gene, translate it into a protein sequence with a bioinformatic translation tool such as https://www.bioinformatics.org/sms2/translate.html. Pay attention to the reading frame and the strand direction. Also, remove the * (the stop codon symbol) at the end of the protein sequence, if present.
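The translation step can also be sketched in a few lines of Python. This is a minimal illustration, not part of CCD: the function names and the frame and reverse_strand parameters are hypothetical, and the codon table is the standard genetic code, which is assumed to be appropriate for your organism.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (assumption: the standard code applies to your organism).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str, frame: int = 0, reverse_strand: bool = False) -> str:
    """Translate DNA in the given frame (0, 1, or 2) and strip trailing '*'."""
    seq = "".join(dna.split()).upper().replace("U", "T")
    if reverse_strand:
        seq = reverse_complement(seq)
    seq = seq[frame:]
    # Unknown codons (for example those containing N) are reported as X.
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    protein = "".join(CODON_TABLE.get(codon, "X") for codon in codons)
    return protein.rstrip("*")
```

A * that remains inside the returned sequence usually points to a wrong reading frame or strand, so it is worth checking the output before submitting it.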

For the Cas9 protein from Clostridium perfringens, the protein sequence below is the input used to generate its 3D structure with AlphaFold.

Fig.2 - Image capture from the sms2 bioinformatic tool on 2022.12.11

2. Have your crRNA sequence ready

Next, extract your crRNA sequence (as either RNA or DNA) from your sequencing experiment or from a public database like CRISPRCasdb (https://crisprcas.i2bc.paris-saclay.fr/MainDb/StrainList).

For the example from Clostridium perfringens above, you can extract the consensus direct repeat (DR) sequence from the CRISPR array (CP065677_3) that neighbors your Cas gene cluster.

Fig.3 - Image capture from the CRISPRCasdb on 2022.12.11
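Databases usually list the repeat as DNA. CCD accepts either RNA or DNA, so conversion is optional, but a small check like the one below can catch stray characters before you submit. This is only a sketch: the function name to_rna is hypothetical and the example sequence in the comment is a made-up placeholder, not the repeat from CP065677_3.

```python
def to_rna(sequence: str) -> str:
    """Normalize a DNA or RNA string to upper-case RNA and validate it."""
    cleaned = "".join(sequence.split()).upper()  # drop spaces and line breaks
    rna = cleaned.replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return rna


# Placeholder input for illustration only:
# to_rna("gttt taga") returns "GUUUUAGA"
```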

3. Provide Cas protein sequence as an input to CCD

The next step requires you to provide the amino acid sequence of your Cas protein so that CCD can predict its 3D structure using AlphaFold. You can either directly input the amino acid sequence or upload a FASTA file containing a single amino acid sequence in the standard format.
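Before uploading, you may want to confirm that your FASTA file really contains a single record. The sketch below shows one way to do this. It is an illustration, not CCD's own validation: the function name is hypothetical, and restricting the sequence to the 20 standard amino acid letters is an assumption about what the server accepts.

```python
# Assumption: only the 20 standard amino acid letters are accepted.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_single_protein_fasta(text: str):
    """Return (header, sequence) from FASTA text holding exactly one record."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with a '>' header line")
    if sum(line.startswith(">") for line in lines) != 1:
        raise ValueError("expected exactly one FASTA record")
    # Join wrapped sequence lines and drop a trailing stop symbol, if any.
    sequence = "".join(lines[1:]).upper().rstrip("*")
    if not sequence:
        raise ValueError("the FASTA record has no sequence")
    unexpected = set(sequence) - STANDARD_AMINO_ACIDS
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return lines[0][1:].strip(), sequence
```

To check a file on disk, read it with open(path).read() and pass the text to the function.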

4. Provide crRNA sequence as an input to CCD

The next step requires you to provide the RNA or DNA sequence of your crRNA so that CCD can predict its 3D structure using RNAFold and RoseTTAFold. You can either directly input the RNA or DNA sequence or upload a FASTA file containing a single RNA or DNA sequence in the standard format.

5. Fill in your query information and press [send]

For your Cas protein, running AlphaFold to predict its 3D structure may take about 30 minutes. For your crRNA, running RNAFold and RoseTTAFold to predict its 3D structure may take about 15 minutes. For your crRNA and Cas protein pair, generating the top 10 docking models from HDOCK docking experiments may take about 15 minutes.

When you have experimental structures of your Cas protein and crRNA:

1. Provide Cas protein structure as an input to CCD

The first step requires you to provide the 3D structure of your Cas protein. You can either directly input a PDB ID of your Cas protein from the RCSB Protein Data Bank (https://www.rcsb.org/) or upload a PDB file containing a single protein structure in the standard format.
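If you want to inspect the structure locally before submitting its PDB ID, the sketch below checks the ID format and builds the RCSB file download address. The function names are hypothetical, the pattern covers only classic 4-character PDB IDs, and the ID in the usage comment is a placeholder chosen to show the format. Note that some large entries are distributed only in mmCIF format, in which case a .pdb download will not be available.

```python
import re
import urllib.request

# Classic PDB IDs: one digit (1-9) followed by three letters or digits.
PDB_ID_PATTERN = re.compile(r"[1-9][A-Za-z0-9]{3}")


def rcsb_pdb_url(pdb_id: str) -> str:
    """Return the RCSB download URL for a classic 4-character PDB ID."""
    pdb_id = pdb_id.strip()
    if not PDB_ID_PATTERN.fullmatch(pdb_id):
        raise ValueError(f"not a classic 4-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def download_pdb(pdb_id: str, destination: str) -> str:
    """Download the PDB file to the given path (requires network access)."""
    urllib.request.urlretrieve(rcsb_pdb_url(pdb_id), destination)
    return destination


# Usage with a placeholder ID:
# download_pdb("1abc", "cas_protein.pdb")
```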

2. Provide crRNA structure as an input to CCD

The next step requires you to provide the 3D structure of your crRNA in the standard PDB format.
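A quick way to confirm that a PDB file contains RNA rather than protein or DNA is to look at the residue names in its ATOM records. The sketch below relies on the fixed-column PDB layout (residue name in columns 18-20, chain ID in column 22) and on the standard RNA residue names A, C, G, and U. The function names are hypothetical, and modified nucleotides stored as HETATM records are ignored, which is a simplification.

```python
RNA_RESIDUES = {"A", "C", "G", "U"}


def summarize_pdb(text: str):
    """Return (chain IDs, residue names) found in the ATOM records."""
    chains, residue_names = set(), set()
    for line in text.splitlines():
        if line.startswith("ATOM") and len(line) >= 22:
            residue_names.add(line[17:20].strip())  # columns 18-20
            chains.add(line[21])                    # column 22
    return chains, residue_names


def looks_like_rna(text: str) -> bool:
    """True if the file has ATOM records and all residues are A, C, G, or U."""
    _, residue_names = summarize_pdb(text)
    return bool(residue_names) and residue_names <= RNA_RESIDUES
```

The same summary can be used on the Cas protein file, for example to see how many chains it contains before you upload it.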

3. Fill in your query information and press [send]

For your crRNA and Cas protein pair, generating the top 10 docking models from HDOCK docking experiments may take about 15 minutes.