Team:TU Darmstadt/Modeling
From 2012.igem.org
Modeling
Homology Modeling
Although our proteins are functionally described in the literature and within the iGEM competition, no structures are available in the Protein Data Bank (PDB). Protein structures are indispensable for further work and visualization. We therefore used YASARA Structure [1] to calculate three-dimensional structures of the proteins used in our iGEM project.
Workflow
How our YASARA script calculates a homology model [7]:
- The sequence is PSI-BLASTed against UniProt [2].
- A position-specific scoring matrix (PSSM) is calculated from related sequences.
- The PSSM is used to search the PDB for potential modeling templates.
- The templates are ranked by alignment score and structural quality [3].
- Additional information is derived for template and target (prediction of secondary structure, structure-based alignment correction using SSALN scoring matrices) [4].
- A graph of the side-chain rotamer network is built, and dead-end elimination is used to find an initial rotamer solution in the context of a simple repulsive energy function [5].
- The loop network is optimized by sampling a large number of different orientations.
- Side-chain rotamers are fine-tuned, taking into account electrostatic and knowledge-based packing interactions as well as solvation effects.
- An unrestrained high-resolution refinement with explicit solvent molecules is run, using the latest knowledge-based force fields [6].
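The PSSM step in the workflow above can be illustrated with a small Python sketch. It is a toy version and not what PSI-BLAST or YASARA actually runs: it assumes a gap-free alignment, a uniform background frequency of 1/20 and a flat pseudocount, and the names `build_pssm` and `score` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_seqs, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds score.

    Assumptions (for illustration only): gap-free sequences of equal length,
    uniform background frequencies, and a flat pseudocount per residue type.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    n_seqs = len(aligned_seqs)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        column = {}
        for aa in AMINO_ACIDS:
            # Smoothed frequency of this residue in the column
            freq = (counts[aa] + pseudocount) / (
                n_seqs + pseudocount * len(AMINO_ACIDS)
            )
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, seq):
    """Sum the column scores of a sequence aligned to the PSSM without gaps."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


pssm = build_pssm(["ACD", "ACE", "ACD"])
print(score(pssm, "ACD") > score(pssm, "WWW"))
```

A sequence that resembles the aligned family scores higher than an unrelated one, which is what makes the matrix useful for a template search.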
Application
All of these steps are performed for every template used in the modeling approach. For our project we set the maximum number of templates to 20. Every derived structure is evaluated using average per-residue quality Z-scores. Finally, a hybrid model is built from the best regions of all predictions. This procedure makes the prediction more accurate and thus more realistic.
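The idea of ranking models by their average per-residue Z-score and combining the best regions can be sketched as follows. How YASARA actually assembles its hybrid model is not described here, so this is only an illustration under stated assumptions: higher Z-scores are taken to mean better quality, regions are fixed-size windows, and the names `rank_models` and `pick_hybrid_regions` are hypothetical.

```python
def mean_z(zscores):
    """Average of a list of per-residue quality Z-scores."""
    return sum(zscores) / len(zscores)


def rank_models(models):
    """Sort model names by mean Z-score, best first.

    models: dict mapping model name -> list of per-residue Z-scores
    (assumption: higher means better quality).
    """
    return sorted(models, key=lambda name: mean_z(models[name]), reverse=True)


def pick_hybrid_regions(models, window=5):
    """For each fixed-size window of residues, pick the best-scoring model.

    Returns a list of (start, end, model_name) with end exclusive.
    """
    names = list(models)
    length = len(models[names[0]])
    if any(len(models[name]) != length for name in names):
        raise ValueError("all models must cover the same residues")
    regions = []
    for start in range(0, length, window):
        end = min(start + window, length)
        best = max(names, key=lambda name: mean_z(models[name][start:end]))
        regions.append((start, end, best))
    return regions


# Made-up scores for two hypothetical models of an 8-residue protein
models = {
    "m1": [-1, -1, -1, -1, 0, 0, 0, 0],
    "m2": [0, 0, 0, 0, -2, -2, -2, -2],
}
print(rank_models(models))
print(pick_hybrid_regions(models, window=4))
```

In this toy input the model with the best overall score is not the best one in every region, which is the motivation for building a hybrid.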
Results
PnB-Esterase
AroY
TphA1
TphA2
TphA3
TphA2
References
[1] E. Krieger, G. Koraimann, and G. Vriend, “Increasing the precision of comparative models with YASARA NOVA--a self-parameterizing force field,” Proteins, vol. 47, no. 3, pp. 393–402, 2002.
[2] S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman, “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Res, vol. 25, no. 17, pp. 3389–3402, Sep. 1997.
[3] R. W. Hooft, G. Vriend, C. Sander, and E. E. Abola, “Errors in protein structures,” Nature, vol. 381, no. 6580, p. 272, 1996.
[4] D. T. Jones, “Protein secondary structure prediction based on position-specific scoring matrices,” Journal of Molecular Biology, vol. 292, no. 2, pp. 195–202, 1999.
[5] A. A. Canutescu, A. A. Shelenkov, and R. L. Dunbrack, “A graph-theory algorithm for rapid protein side-chain prediction,” Protein Science, vol. 12, no. 9, pp. 2001–2014, 2003.
[6] E. Krieger, K. Joo, J. Lee, J. Lee, S. Raman, J. Thompson, M. Tyka, D. Baker, and K. Karplus, “Improving physical realism, stereochemistry, and side-chain accuracy in homology modeling: Four approaches that performed well in CASP8,” Proteins, vol. 77, suppl. 9, pp. 114–122, 2009.
[7] http://www.yasara.org/homologymodeling.htm
Information Theory
Docking Simulations
Gaussian network model
Theory
Nearly all biologically important processes, such as enzyme catalysis, ligand binding and allosteric regulation, occur on long time scales (microseconds to milliseconds). A Gaussian network model (GNM) is a coarse-grained representation of a protein as a network of balls and springs. In our approach, each residue is represented by a ball placed at its CA atom [1]. While molecular dynamics (MD) simulations are computationally expensive, a GNM calculation takes only a few seconds.
Computation
The dynamics of the structure in the GNM are described by the topology of contacts, which is encoded in the Kirchhoff matrix Γ (Gamma). In this network of N interacting sites, the elements of Γ are computed from the pairwise distances Rij, where Rij is the distance between sites i and j.
We used Γ as the intra-protein CA contact matrix. Its (pseudo-)inverse describes the correlations between fluctuations within the protein's native state. The diagonal of the matrix is replaced by the sum of contacts of one CA atom within the whole protein. We calculated the normal modes of the protein by a singular value decomposition (SVD) of Γ. Slow modes describe functionally relevant residues within a biomolecule [2]. Fast modes, in contrast, represent uncorrelated motions without significant changes in the structure.
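The formula for the matrix elements is not reproduced in the text of this page. For orientation, the form commonly given in the GNM literature for the Kirchhoff matrix (written Γ here) is shown below. The cutoff distance r_c is an assumption, since no value is stated on this page; values around 7 Å are often used for CA-based networks.

```latex
\Gamma_{ij} =
\begin{cases}
  -1 & \text{if } i \neq j \text{ and } R_{ij} \le r_c \\
  0 & \text{if } i \neq j \text{ and } R_{ij} > r_c \\
  -\sum_{k \neq i} \Gamma_{ik} & \text{if } i = j
\end{cases}
\qquad
\langle \Delta R_i \cdot \Delta R_j \rangle \propto \left[ \Gamma^{-1} \right]_{ij}
```

Each diagonal element therefore counts the contacts of site i, and because every row sums to zero, Γ has a zero eigenvalue, so the inverse is taken as a pseudo-inverse over the non-zero modes.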
A recent examination of the X-ray crystallographic B-factors of over 100 proteins showed that the GNM closely reproduces the experimental data [3].
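A minimal sketch of the GNM calculation in Python with NumPy is given below. It is not the BioPhysConnectoR code, and the function names and the 7 Å cutoff are assumptions made for illustration. The sketch uses a symmetric eigendecomposition in place of the SVD; for the symmetric, positive semi-definite Kirchhoff matrix the two give the same modes. The diagonal of the pseudo-inverse yields relative per-residue fluctuations, which is the quantity that is typically compared with B-factors.

```python
import numpy as np


def kirchhoff_matrix(coords, cutoff=7.0):
    """Build the Kirchhoff (contact) matrix from CA coordinates in Angstrom.

    The cutoff of 7.0 is an assumed value, not one taken from this page.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    gamma = -(dist <= cutoff).astype(float)  # -1 for every contact
    np.fill_diagonal(gamma, 0.0)  # drop self-contacts
    # Diagonal = number of contacts of each site, so rows sum to zero
    np.fill_diagonal(gamma, -gamma.sum(axis=1))
    return gamma


def gnm_modes(gamma, tol=1e-8):
    """Return eigenvalues and eigenvectors of the non-zero modes.

    Modes are sorted from slow (small eigenvalue) to fast (large eigenvalue).
    """
    eigvals, eigvecs = np.linalg.eigh(gamma)
    keep = eigvals > tol  # discard the zero mode(s)
    return eigvals[keep], eigvecs[:, keep]


def relative_fluctuations(eigvals, eigvecs):
    """Diagonal of the pseudo-inverse: sum over modes of v_ik^2 / lambda_k."""
    return (eigvecs**2 / eigvals).sum(axis=1)


# Toy example: five sites on a straight line, 3.8 Angstrom apart
coords = [[3.8 * i, 0.0, 0.0] for i in range(5)]
gamma = kirchhoff_matrix(coords)
eigvals, eigvecs = gnm_modes(gamma)
print(relative_fluctuations(eigvals, eigvecs))
```

In the toy chain the terminal sites have fewer contacts and fluctuate more than the central one, the same qualitative behaviour seen for flexible termini in B-factor profiles.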
Application to our Proteins
We computed the GNM in R [4] by using the BioPhysConnectoR [5] library.
- pnB-Esterase
- Fusarium solani cutinase
References
[1] A. R. Atilgan, S. R. Durell, R. L. Jernigan, M. C. Demirel, O. Keskin, and I. Bahar, “Anisotropy of fluctuation dynamics of proteins with an elastic network model,” Biophys J, vol. 80, no. 1, pp. 505–515, Jan. 2001.
[2] C. Chennubhotla, A. J. Rader, L.-W. Yang, and I. Bahar, “Elastic network models for understanding biomolecular machinery: from enzymes to supramolecular assemblies,” Physical Biology, vol. 2, no. 4, pp. S173–S180, 2005.
[3] I. Bahar and A. J. Rader, “Coarse-grained normal mode analysis in structural biology,” Current Opinion in Structural Biology, vol. 15, no. 5, pp. 586–592, 2005.
[4] R Development Core Team, “R: A Language and Environment for Statistical Computing,” R Foundation for Statistical Computing, Vienna, Austria, 2008.
[5] F. Hoffgaard, P. Weil, and K. Hamacher, “BioPhysConnectoR: Connecting sequence information and biophysical models,” BMC Bioinformatics, vol. 11, p. 199, 2010.
Molecular Dynamics