Using the Molecular Dynamics (MD) simulation on the Cascade-crRNA-DNA complex and the Cas12a-crRNA-DNA complex, to compare the RMSF and the number of hydrogen bonds formed between protein and crRNA of different combinations. From the comparison, it was predicted that the stability at the equilibrium is at the highest when the DNA recognized by the Cas12a is the ds-template3. We assumed that the more stable the complex, the higher the CRISPR-Cas's ability to detect, and thus concluded that 3step-SDA is the most desirable amplification method. This result aligns with the conclusion regarding the preferable amplification method as discussed in Dry Lab_Model_Amplification_comparison.
We plan to use either CRISPR-Cas3 or CRISPR-Cas12a in the final stage of detecting the target miRNA. Additionally, there are several options for the crRNA, and there is flexibility in the number of multistep-SDA stages that can be inserted before connecting to CRISPR-Cas. Here, we will consider the most desirable options by varying the crRNA and the complementary DNA to the crRNA for both CRISPR-Cas3 and CRISPR-Cas12a.
The stability of the CRISPR-Cas3 complex will be evaluated using Molecular Dynamics (MD) simulation for the complex of crRNA, DNA, and Cascade, while the stability of the CRISPR-Cas12a complex will be evaluated for the crRNA-DNA-Cas12a complex. Table 1. shows the combinations of Protein, crRNA, and DNA used in the simulations. After connecting n step SDA and dsDNA elongation, we call the dsDNA recognized by CRISPR-Cas as ds-template.
\[ \begin{array}{|c|c|c|} \hline \textbf{Protein} & \textbf{crRNA} & \textbf{DNA} \\ \hline \text{Cascade} & \text{SARS-CoV-2 N2} & \text{Positive Control (PC)} \\ \hline \text{Cascade} & \text{SARS-CoV-2 N2} & \text{ds-template2} \\ \hline \text{Cas12a} & \text{EMX1 crRNA} & \text{ds-template0} \\ \hline \text{Cas12a} & \text{EMX1 crRNA} & \text{ds-template1} \\ \hline \text{Cas12a} & \text{EMX1 crRNA} & \text{ds-template2} \\ \hline \text{Cas12a} & \text{EMX1 crRNA} & \text{ds-template3} \\ \hline \text{Cas12a} & \text{SARS-CoV-2 N2 crRNA} & \text{Positive Control (PC)} \\ \hline \text{Cas12a} & \text{SARS-CoV-2 N2 crRNA} & \text{ds-template0} \\ \hline \text{Cas12a} & \text{SARS-CoV-2 N2 crRNA} & \text{ds-template1} \\ \hline \text{Cas12a} & \text{SARS-CoV-2 N2 crRNA} & \text{ds-template2} \\ \hline \text{Cas12a} & \text{SARS-CoV-2 N2 crRNA} & \text{ds-template3} \\ \hline \end{array} \]
Table 1. crRNA sequences for different proteins and species
The initial structure of MD simulation uses structure prediction from AlphaFold31. The amino acid sequences of Cascade and Cas12a are the same as the protein of PDB ID:5H9F and PDB ID:5ID6, respectively. The base sequences of crRNA and DNA are shown in Table 2. and 3., respectively.
In the MD simulation of Cascade and Cas12a, GROMACS and GENESIS were used, respectively 2, 3, 4, 5, 6, 7, 8, 9. The initial structure does not contain water. After the solvation, ionization, energy minimization, and equilibration with constant pressure and temperature of the protein, a full 2 ns molecular dynamics simulation was conducted.
Protein | Species | Sequence (5' to 3') |
---|---|---|
Cascade | SARS-CoV-2 N2 crRNA | GUGUUCCCCGCGCCAGCGGGGAUAAACCGUGCAAUUUGCG GCCAAUGUUUGUAAUCAGUUCGUGUUCCCCGCGCCAGCGGGGAUAAACCG |
Cas12a | EMX1 | AAUUUCUACUAAGUGUAGAUAAUGACUAGGGUGGGCAACCA |
Cas12a | SARS-CoV-2 N2 | AAUUUCUACUAAGUGUAGAUGAACGCUGAAGCGCUGGGGG |
Table 2. crRNA sequences for different proteins and species
Protein | Species | Name | Sequence (5' to 3') |
---|---|---|---|
Cascade | SARS-CoV-2 N2 | PC | AGGAACTAATCAGACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAATTTGCCC | GGGCAAATTGTGCAATTTGCGGCCAATGTTTGTAATCAGTTCCTTGTCTGATTAGTTCCT |
Cascade | SARS-CoV-2 N2 | ds-template2 | TTGAAGGAACTGATTACAAACATTGGCCGCAAATTGCATTCCTGCTGAACTGAGCCACCTCA | TGAGGTGGCTCAGTTCAGCAGGAATGCAATTTGCGGCCAATGTTTGTAATCAGTTCCTTCAA |
Cascade | SARS-CoV-2 N2 | ds-template3 | TTGAAGGAACTGATTACAAACATTGGCCGCAAATTGCAGCCCTGTACAATGCTGCTCCTCA | TGAGGAGCAGCATTGTACAGGGCTGCAATTTGCGGCCAATGTTTGTAATCAGTTCCTTCAA |
Cas12a | EMX1 | ds-template0 | CTTATCAGACTGATGTTGATTTGTGGTTGCCCACCCTAGTCATTCAATGACTAG | CTAGTCATTGAATGACTAGGGTGGGCAACCACAAATCAACATCAGTCTGATAAG |
Cas12a | EMX1 | ds-template1 | ACATTGTCTGCTGGGTTTCTTTGTGGTTGCCCACCCTAGTCATTCAATGACTAG | CTAGTCATTGAATGACTAGGGTGGGCAACCACAAAGAAACCCAGCAGACAATGT |
Cas12a | EMX1 | ds-template2 | TGGCTCAGTTCAGCAGGAATTTGTGGTTGCCCACCCTAGTCATTCAATGACTAG | CTAGTCATTGAATGACTAGGGTGGGCAACCACAAATTCCTGCTGAACTGAGCCA |
Cas12a | EMX1 | ds-template2 | TGGCTCAGTTCAGCAGGAATTTGTGGTTGCCCACCCTAGTCATTCAATGACTAG | CTAGTCATTGAATGACTAGGGTGGGCAACCACAAATTCCTGCTGAACTGAGCCA |
Cas12a | EMX1 | ds-template3 | AGCAGCATTGTACAGGGCTTTTGTGGTTGCCCACCCTAGTCATTCAATGACTAG | CTAGTCATTGAATGACTAGGGTGGGCAACCACAAAAGCCCTGTACAATGCTGCT |
Cas12a | SARS-CoV-2 N2 | PC | GCCGCAAATTGCACAATTTGCCCCCAGCGCTTCAGCGTTCTTCGGAATGTCGCAGCTTGG | CCAAGCTGCGACATTCCGAAGAACGCTGAAGCGCTGGGGGCAAATTGTGCAATTTGCGGC |
Cas12a | SARS-CoV-2 N2 | ds-template0 | CTTATCAGACTGATGTTGATTTGCCCCCAGCGCTTCAGCGTTCCAATGACTAG | CTAGTCATTGGAACGCTGAAGCGCTGGGGGCAAATCAACATCAGTCTGATAAG |
Cas12a | SARS-CoV-2 N2 | ds-template1 | ACATTGTCTGCTGGGTTTCTTTGCCCCCAGCGCTTCAGCGTTCCAATGACTAG | CTAGTCATTGGAACGCTGAAGCGCTGGGGGCAAAGAAACCCAGCAGACAATGT |
Cas12a | SARS-CoV-2 N2 | ds-template2 | TGGCTCAGTTCAGCAGGAATTTGCCCCCAGCGCTTCAGCGTTCCAATGACTAG | CTAGTCATTGGAACGCTGAAGCGCTGGGGGCAAATTCCTGCTGAACTGAGCCA |
Cas12a | SARS-CoV-2 N2 | ds-template3 | AGCAGCATTGTACAGGGCTTTTGCCCCCAGCGCTTCAGCGTTCCAATGACTAG | CTAGTCATTGGAACGCTGAAGCGCTGGGGGCAAAAGCCCTGTACAATGCTGCT |
Table 3. DNA Target and Primer Sequences
In Table 1. and Table 2., complementary binding crRNA and DNA sequences are underlined.
Root Mean Square Deviation (RMSD) represents the difference between two structures, the target structure and the initial structure. If the RMSD converges to a constant value, it can be considered that the system has reached equilibrium. After completing the full 2 ns MD simulation, the RMSD is calculated using MDAnalysis 10.
\begin{equation} \begin{aligned} \text{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \mathbf{r}_i(t) - \mathbf{r}_i(0) \right)^2} \notag\\ \end{aligned} \end{equation}Then, we conducted the MD simulation using the combinations in Table 1, and the obtained RMSD are shown in Figure 1, 2, and 3.
Figure 1.
Figure 2.
Figure 3.
As seen in Figure 1., the Cascade-crRNA-DNA complex does not show a flat line, particularly regarding PC, during the 2 ns simulation, indicating that the complex has not reached equilibrium. In contrast, many of the Cas12a-crRNA-DNA complexes exhibit a flat line within 2 ns, suggesting that they have reached equilibrium.
When analyzing larger structures, the time required for simulation is long. Cascade is a complex of multiple proteins, and due to its significantly larger structure compared to Cas12a, the 2 ns simulation time was insufficient for it to reach equilibrium. In the following analysis, we will focus on the Cas12a complexes, where many combinations show RMSD convergence, i.e. equilibration, within the 2 ns simulation time, to analyze the stability of the complex at equilibrium.
Root Mean Square Fluctuation (RMSF) calculates the deviation from the average structure and represents fluctuations from the equilibrium state. It allows us to identify fluctuating regions or residues, which can be used to evaluate flexibility and consider functionality. Here, RMSF is calculated to assess the stability of the complex in its equilibrium state. The RMSF of the i-th residue is defined by the following equation. When plotting the RMSF for different DNA sequences, the graph becomes difficult to interpret due to the large amount of noise, causing overlap between the lines. Therefore, a Gaussian filter was applied to remove the noise, making it easier to compare RMSF across different DNA sequences.
\begin{equation} \begin{aligned} \text{RMSF}_i = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( \mathbf{r}_i(t) - \langle \mathbf{r}_i \rangle \right)^2} \notag\\ \end{aligned} \end{equation}Figure 4.
Figure 5.
To see the interactions between the protein and crRNA, we examined the time evolution of the number of hydrogen bonds formed between them 11. The criteria for a hydrogen bonds were defined as \(donor-acceptor\ distance < 3\text{\AA},\ donor-hydrogen-acceptor\ angle > 150^{\circ}\). The donor was considered to be the protein, and the acceptor was the nucleic acid. Since the number of hydrogen bonds also exhibits significant noise, a Gaussian filter was applied to remove the noise, making it easier to compare the number of hydrogen bonds across different DNA sequences.
Figure 6.
Figure 7.
A smaller RMSF value indicates smaller fluctuations from the equilibrium state, meaning the complex is more stable. Additionally, the more hydrogen bonds formed between Cas12a and crRNA, the stronger the interaction between them, leading to increased stability of the complex. Here, we evaluate DNA sequences where the complex stability is higher as superior.
For the combination of Cas12a and EMX1 crRNA, as shown in Figure 4., it was confirmed that the RMSF of DNA ds-template1 and ds-template2 is relatively large. Furthermore, Figure 6. shows that the number of hydrogen bonds is relatively low when the DNA is ds-template0 or ds-template2. These results suggest that the complex stability is highest when the DNA is ds-template3. In our Dry Lab, we predicted that when 3step-SDA is connected with dsDNA elongation, the signal amplification efficiency, as well as the robustness of the amplification efficiency in terms of time, enzyme, and S/N ratio, would be the highest (Dry lab_Model_Amplification_comparison). By conducting MD simulations and demonstrating that the stability of the Cas12a-crRNA-DNA complex is highest when the DNA is ds-template3, we were able to propose from a different approach that linking 3step-SDA with dsDNA elongation is the preferable amplification method.
On the other hand, as shown in Figure 5. and Figure 7., for the combination of Cas12a and SARS-CoV-2 N2 crRNA, there are no clear differences in terms of RMSF or the number of hydrogen bonds among ds-template0, ds-template1, ds-template2, and ds-template3. This indicates that there is no significant difference in complex stability among the four DNA sequences. Therefore, we concluded that the 3step-SDA, predicted to be superior in terms of amplification efficiency and robustness, is also preferable when using SARS-CoV-2 N2 crRNA.
In this discussion, we implicitly assumed that the more stable the equilibrium state of the Cas12a-crRNA-DNA complex, the more favorable it is for DNA detection. However, this assumption lacks sufficient evidence, and further investigation is needed to determine whether it is indeed correct. Additionally, there are residues that exhibit notably high RMSF values (for instance, around residue 600 for ds-template2, and around residue 1100 for both ds-template1 and ds-template2 in Figure 4.). Residues with high RMSF values may be involved in binding interactions. Further research is needed to explore the relationship between the differences in DNA and the residues with relatively high RMSF values, and this could potentially lead to the design of crRNA that reduces RMSF.
We create an MD simulation manual to help future iGEM teams run MD simulations smoothly. The manual describes how to use GROMACS 3 and GENESIS 12, 13, which we used.
The manual is as follows,
Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., Ronneberger, O., Willmore, L., Ballard, A. J., Bambrick, J., Bodenstein, S. W., Evans, D. A., Hung, C.-C., O’Neill, M., Reiman, D., Tunyasuvunakool, K., Wu, Z., Žemgulytė, A., Arvaniti, E., Beattie, C., … Jumper, J. M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630493-500. https://doi.org/10.1038/s41586-024-07487-w
Abraham, M., Alekseenko, A., Basov, V., Bergh, C., Briand, E., Brown, A., Doijade, M., Fiorin, G., Fleischmann, S., Gorelov, S., Gouaillardet, G., Gray, A., Irrgang, M. E., Jalalypour, F., Jordan, J., Kutzner, C., Lemkul, J. A., Lundborg, M., Merz, P., … Lindahl, E. (2024). GROMACS 2024.3 Source code (2024.3). Zenodo. https://doi.org/10.5281/zenodo.13456374
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E. Lindahl (2015). GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX, 119-25. https://doi.org/10.1016/j.softx.2015.06.001
Páll, S., Abraham, M.J., Kutzner, C., Hess, B., Lindahl, E. (2015). Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS.In S. Markidis & E. Laure (Eds.), Solving Software Challenges for Exascale, 87593-27. https://doi.org/10.1007/978-3-319-15976-8_1
S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R. Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess, and E. Lindahl. (2013). GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit.Bioinformatics, 29845-854. https://doi.org/10.1093/bioinformatics/btt055
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl. (2008). GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation.J. Chem. Theory Comput., 4(3), 435-447. https://doi.org/10.1021/ct700301q
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C. Berendsen. (2005). GROMACS: Fast, Flexible and Free.J. Comp. Chem., 26 1701-1719. https://doi.org/10.1002/jcc.20291
E. Lindahl and B. Hess and D. van der Spoel. (2001). GROMACS 3.0: A package for molecular simulation and trajectory analysis.J. Mol. Mod., 7 306-317. https://doi.org/10.1007/s008940100045
H. J. C. Berendsen, D. van der Spoel and R. van Drunen. (1995). GROMACS: A message-passing parallel molecular dynamics implementation.Comp. Phys. Comm., 91 43-56. https://doi.org/10.1016/0010-4655(95)00042-E
MDAnalysis 4.2.4. Calculating root mean square quantities. https://docs.mdanalysis.org/1.1.1/documentation_pages/analysis/rms.html
Smith, P., Ziolek, R. M., Gazzarrini, E., Owen, D. M., & Lorenz, C. D. (2019). On the interaction of hyaluronic acid with synovial fluid lipid membranes.Physical Chemistry Chemical Physics, 19. https://doi.org/10.1039/C9CP01532A
Kobayashi, C., Jung, J., Matsunaga, Y., Mori, T., Ando, T., Tamura, K., ... & Sugita, Y. (2017). GENESIS 1.1: A hybrid‐parallel molecular dynamics simulator with enhanced sampling algorithms on multiple computational platforms. Journal of Computational Chemistry, 38, 2193-2206. http://dx.doi.org/10.1002/jcc.24874
Jung, J., Mori, T., Kobayashi, C., Matsunaga, Y., Yoda, T., Feig, M., & Sugita, Y. (2015). GENESIS: a hybrid‐parallel and multi‐scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations. Wiley Interdisciplinary Reviews: Computational Molecular Science, 5(4), 310-323. https://doi.org/10.1002/wcms.1220