In this year’s project, we try to make reasonable design of riboscaffolds by ourselves. Due to the current experimental conditions, we could not know what exactly happens to our riboscaffolds in the cell. Its folding process, binding with proteins and ligand, and the allosteric transition are of interests, since these are considerations for a reasonable design. Luckily, we have simulation tools for molecular modeling which can provide hints about what may happen theoretically in the intracellular environment. Although molecular modeling is often considered as professional work in which knowledge of different disciplines such as biology, physics, chemistry, mathematics and computer science as well as previous experience all contribute to good results, our team is willing to take a step in this field and get preliminary results.
RNA folding and 3D Structures
Introduction
For most small RNA molecules, secondary structure is enough for predicting their possible functions since their size limits their ability to fold into complex structures. However, our riboscaffold is a large one of 100~160 bp and has several arms. As in addition, our design of kissing loop is on the level of tertiary structure. So it is helpful if we can simulate RNA folding in silico.
Methods
NAST/C2A [10] is a set of Python tools that can generate full-atomic 3D RNA structures from secondary structure information. NAST generates coarse grained 3D structures from secondary structure information, and C2A adds the full-atomic details to these coarse-grained models. PyOpenMM [11] was required, VMD [12] and PyMOL [13] was used to view the results and trajectories.
Beginning with sequences and secondary structures, NAST generates 3D structures in two different ways. The first is to start an MD simulation from an unfolded circle state. The second is a more sophisticate approach that starts at a random position in the sequence, and adds residues to each end at a random plausible position. Details about the sequences and design strategy can be found here.
Results
The following two movies about the folding process of D0 explain the two different ways of 3D structure prediction provided by NAST.
Movie1. Start an MD simulation from an unfolded circle state.
Movie2. Starts at a random position in the sequence, and adds residues to each end at a random plausible position.
According to the document of NAST, the second method (random unfold starting structure) is more sophisticate, so in the following movies, we use this method to predict the folding of the riboscaffolds we designed.
Figure 1. Clover1 3D structure Movie3. Clover1 Folding.
Figure 2 Colver2-1 structure Movie4. One possible folding process of Clover2.
In this conformation, the interaction between theophylline aptamer and MS2 aptamer is not obvious, probably because they extend from the same loop, which is too small to be flexible.
Figure 3 Colver2-1 structure Movie5. Another possible folding process of Clover2.
In this conformation, the interaction between theophylline aptamer can be seen during the folding process, but as the MS2 aptamer extends, the internal rigidity pretends the torsion in the structure.
Figure 4 Colver3 structure. Movie6. Clover3 folding.
In this movie, we can see the interaction between theophylline aptamer and MS2 aptamer is more obvious, which begins from the formation of MS2 and remains in the “mature” Clover3.
Figure 5 Clover3 mutant. Movie7. Clover3 mutant folding
To demonstrate that the interaction in Clover3 is caused by our design, we simulate the 3D structure of a mutant of Clover 3, in which the theophylline aptamer remains unmutant so it cannot base pair with the sequence of MS2.
Docking
Introduction
Docking is a method which predicts the preferred orientation and binding affinity of one molecule to a second when bound to each other to form a stable complex. Docking is frequently used to predict the binding orientation of small molecule drug candidates to their protein targets in order to in turn predict the affinity and activity of the small molecule. Hence docking plays an important role in the rational design of drugs.
In our cases, riboscaffolds are designed as targets of small ligands, so we can dock the ligand to the riboscaffold to find the binding site and evaluate the binding state.
Methods
Although most docking algorithms and tools, as well as tutorials and documents are designed for drug-protein docking, we managed to find several approaches to conduct docking on RNA aptamers[1,2], which is a heating ground in the research of RNA targeting drugs. Among the popular docking tools, AutoDock package [3] has demonstrated its ability to dock compounds to RNA molecules [1]. In our approach, we docked theophylline to our riboscaffolds (with the predicted 3D structure) to search for the binding sites and construct ligand-riboscaffold complex for dynamic analysis.
Step1: Preparation of riboscaffold
The coordinate files predicted by Nast, refined by C2A, balanced and optimized by OpenMM Zephyr were used for docking. But they can still have potential problems that need to be corrected. The preparation was done in AutoDock Tools, including merging the non-polar hydrogen atoms on RNA molecules and computing Gasteiger charges. Then, the PDB files were converted to PDBQT files used in AutoDock.
Step2: Preparation of theophylline.
The ligand theophylline was obtained from chEBI database (www.ebi.ac.uk/chebi/) as MOL file and converted to PDB coordinate file using OpenBable [4]. Preparation was performed also in AutoDock Tools, and because the molecule of theophylline is quite rigid, there is no need to specify rotatable bonds and active torsions, just used the default settings and generated PDBQT file.
Step3: Run docking with AutoDock.
Once the receptor and ligand are prepared, we can use the AutoDock 4.2 package. The grid box for site searching was created on different arms on our riboscaffolds according to their dimensions with the default 0.375 angstrom for spacing between the grid points. For each riboscaffold, 20 docking experiments were performed using the Lamarckian genetic algorithm conformational search, with the population size of 150, one million energy evaluations, and a maximum of 27 000 generations per run.
Results
Figure6. This figure shows the docking result of the three arms (theophylline aptamer, MS2 aptamer and PP7 aptamer). Ligands are in white.
There are more binding sites of theophylline on the theophylline aptamer than on the other two aptamers, which indicates the present of theophylline is highly probable to have some effects on the theophylline aptamer. And the effects caused by theophylline is mainly through its interation with theophylline aptamer, which is consistent with our experimental results.
The docking results were used to generate riboscaffold-theophylline complex for molecular dynamics analysis of the whole system.
Molecular Dynamics
Introduction
Molecular dynamics (MD) is a computer simulation of physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a period of time, giving a view of the motion of the atoms by numerically solving the Newton's equations. The results of molecular dynamics simulations includes the trajectory of molecules and atoms in the system and other parameters such as energy potential, temperature and pressure fluctuations, number of hydrogen bonds of the system, which may be used to find the properties of such complex systems analytically.
Methods
Here in our project, we use GROMACS 4.5 [5] to run MD and analyze the state of theophylline-riboscaffold complex. GROMACS is a versatile package to perform molecular dynamics on macromolecules like proteins, lipids and nucleic acids. It is fast, has multiple choices for force field, and comes with a large selection of flexible tools for trajectory analysis.
Although GROMACS provides step by step tutorials for beginners, they are all for proteins or protein-ligand complex. Our riboscaffold-theophylline system is similar to protein-ligand complex to some degree, but they have their own characteristics. We, as beginners, are trying to develop a protocol for the MD simulation of ligand sensible RNA devices based on canonical tutorials for protein-ligand complex [6, 7]. It may be coarse now, and we will modify it and add more details after more experiments, we hope it is helpful for people who are also beginners and interested in RNA molecular dynamics.
Step1: Separate the RNA and ligand from a complex file.
We used the complex file generated by AutoDock and separate them into two PDB files with the coordinate of atoms conserved.
Step2: Prepare the protein topology with pdb2gmx.
Choose AMBER force fields because they are more suitable for RNA MD. In our cases, we chose AMBER99SB.
Step3: Prepare the ligand topology using external tools.
Proper treatment of ligands is one of the most challenging tasks in molecular simulation. The ligand should be treated with the force field that is consistent with the receptor. Unlike the tutorials for protein-ligand complex, in which the PRODRG web server with GROMOS force field is used, we prepared our ligand with Antechamber [8] and tleap [9], free tools in the AmberTools package. The output files are a top and a gro file for the ligand. The top file should be converted into itp file for reusability and included in the receptor top file.
Step4: Solvation
At this point, it is necessary to define the unit cell and fill it with water, just as described in other tutorials for any other MD simulation.
Step5: Adding ions
A .tpr file must be generated for the input. Therefore, we will setup energy minimization, which means running grompp before genion. Energy minimization in this step is different form that in the following step, here we just want to know the charges of our system. Our system has a net charge of -157e, so we add 157 Na+ to simulate the status in cytoplasm.
Step6: Energy Minimization
Having added ions, we need to rerun energy minimization. In the .mdp file, remember to set energygrps to the right names of the elements in the system. It is recommended to checked the [molecule] part of the .top file to make sure all kinds of molecules in the system are correctly included.
Step7: Equilibration
Similar to the protein-ligand complex, in the stage of equilibration, we should first restrain our ligand (but it is not fixed!) around the binding site during equilibration, control the temperature coupling, and run two phases of equilibration, NVT (canonical ensemble, moles N, volume V and temperature T are conserved.) and NPT (isothermal–isobaric ensemble, in which moles N, pressure P and temperature T are conserved.)
Step8: Production MD
After two phases of equilibration, the system has reached the desired temperature and pressure. So we will run MD again and get the products.
Results
Figure 7. The potential energy of the system.
The potential energy equilibrates about an average of ~ –3274000 kJ/mol.
Figure 8. Temperature fluctuation.
Figure 9. Pressure fluctuation.
The temperature and pressure plot shows the normal oscillation behavior. The average temperature is about 310 K, which is what we desired.
Figure 10. Hydrogen bonds.
The number of hydrogen bonds ranges mainly between 3 and 4. This figure shows the interaction is rather stable.
Figure 11. RMSD of RNA.
Figure 12. RMSD of theophylline.
The two figures about RMSD indicates that the RNA, as a receptor, equilibrates rather quickly. But the small ligand, theophylline, equilibrates slower, with a larger range of fluctuation (about 0.02, but in RNA, the range is within 0.05) in conformation. This is normal in ligand-receptor systems.
In conclusion, the preliminary analysis results show that the simulation system generated by our workflow acts similar to other ligand-receptor systems under the environment of molecular dynamics. Actually, with this method, we can conduct more analysis on the level of molecules and even atoms. We will continue our way to explore this magic microworld!
References
[1] Yaroslav Chushak, Morley O. Stone. In silico selection of RNA aptamers. Nucl. Acids Res. (2009) 37(12).( http://nar.oxfordjournals.org/content/37/12/e87.short)
[2] Florent Barbault. RNA as drug target: docking studies.
[3] Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S. and Olson, A. J. (2009) Autodock4 and AutoDockTools4: automated docking with selective receptor flexiblity. J. Computational Chemistry, 16: 2785-91.
[4] N M O'Boyle, M Banck, C A James, C Morley, T Vandermeersch, and G R Hutchison. "Open Babel: An open chemical toolbox." J. Cheminf. (2011), 3, 33.
[5] Hess, et al. (2008) GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 4: 435-447.
[6] John E. Kerrigan. GROMACS Tutorial for Drug – Enzyme Complex.
[7] Justin Lemkul. GROMACS Tutorial: Protein-Ligand Complex. (http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/complex/index.html)
[8] Wang, J., Wang, W., Kollman P. A.; Case, D. A. "Automatic atom type and bond type perception in molecular mechanical calculations". Journal of Molecular Graphics and Modelling , 25, 2006, 247260.
[9] Christian E. A. F. Schafmeister, Wilson S. Ross and Vladimir Romanovski, , University of California, San Francisco (1995).
[10] Jonikas MA, Radmer RJ, Laederach A, Das R, Pearlman S, Herschlag D, Altman RB. Coarse-grained modeling of large RNA molecules with knowledge-based potentials and structural filters. RNA. 2009 Feb;15(2):189-99. PMID: 19144906 (2009).
[11] M. S. Friedrichs, P. Eastman, V. Vaidyanathan, M. Houston, S. LeGrand, A. L. Beberg, D. L. Ensign, C. M. Bruns, V. S. Pande. “Accelerating Molecular Dynamic Simulation on Graphics Processing Units.” J. Comp. Chem., 30(6), 864-872. (2009) (https://simtk.org/home/pyopenmm)
[12] Humphrey, W., Dalke, A. and Schulten, K., "VMD - Visual Molecular Dynamics", J. Molec. Graphics, 1996, vol. 14, pp. 33-38. (http://www.ks.uiuc.edu/Research/vmd/)
[13] The PyMOL Molecular Graphics System, Version 1.5.0.4 Schr?dinger, LLC.