Team:TU Darmstadt/Modeling

From 2012.igem.org

(Difference between revisions)
(Normalisation)
 
(30 intermediate revisions not shown)
Line 27: Line 27:
    <li><a href="/Team:TU_Darmstadt/Safety" title="Safety">Safety</a></li>
    <li><a href="/Team:TU_Darmstadt/Safety" title="Safety">Safety</a></li>
    <li><a href="/Team:TU_Darmstadt/Downloads" title="Downloads">Downloads</a></li></ul></li>
    <li><a href="/Team:TU_Darmstadt/Downloads" title="Downloads">Downloads</a></li></ul></li>
-
<li><a href="/Team:TU_Darmstadt/Human_Practice" title="Human Practice">Human Practice</a><ul>
+
<li><a href="/Team:TU_Darmstadt/Human_Practice" title="Human Practice">Human Practice</a></li>   
-
    <li><a href="/Team:TU_Darmstadt/Human_Practice/Panel_Discussion " title="Panel Discussion">Panel Discussion</a></li>
+
-
    <li><a href="/Team:TU_Darmstadt/Human_Practice/Symposia" title="Symposia">Symposia</a></li>
+
-
    <li><a href="/Team:TU_Darmstadt/Human_Practice/Classes" title="Classes">Classes</a></li></ul></li>   
+
<li><a href="/Team:TU_Darmstadt/Sponsors" title="Sponsors">Sponsors</a><ul>
<li><a href="/Team:TU_Darmstadt/Sponsors" title="Sponsors">Sponsors</a><ul>
    <li><a href="/Team:TU_Darmstadt/Sponsors" title="Sponsors">Overview</a></li>
    <li><a href="/Team:TU_Darmstadt/Sponsors" title="Sponsors">Overview</a></li>
Line 40: Line 37:
</html>
</html>
-
{|
 
-
!align="center"|[[Team:TU_Darmstadt|Home |]]
 
-
!align="center"|[[Team:TU_Darmstadt/Team|Team | ]]
 
-
!align="center"|[https://igem.org/Team.cgi?year=2012&team_name=TU_Darmstadt Official Team Profile | ]
 
-
!align="center"|[[Team:TU_Darmstadt/Project|Project | ]]
 
-
!align="center"|[[Team:TU_Darmstadt/Parts|Parts Submitted to the Registry | ]]
 
-
!align="center"|[[Team:TU_Darmstadt/Modeling|Modeling | ]]
 
-
!align="center"|[[Team:TU_Darmstadt/Notebook|Notebook | ]]
 
-
!align="center"|[[Team:TU_Darmstadt/Safety|Safety | ]]
 
-
!align="center"|[[Team:TU_Darmstadt/Attributions|Attributions]]
 
-
|}
 
-
<!-- *** What falls between these lines is the Alert Box!  You can remove it from your pages once you have read and understood the alert *** -->
+
{| align="left"
 +
| [[File:ho_modeling_ic.gif|250px|link=https://2012.igem.org/Team:TU_Darmstadt/Modeling_Homologie_Modeling ]]
 +
|}__TOC__
-
<html>
+
{| align="right"
-
<div id="box" style="width: 700px; margin-left: 137px; padding: 5px; border: 3px solid #000; background-color: #fe2b33;">
+
| [[File:MD_ic.gif|250px|link=https://2012.igem.org/Team:TU_Darmstadt/Modeling_MD]]
-
<div id="template" style="text-align: center; font-weight: bold; font-size: large; color: #f6f6f6; padding: 5px;">
+
|}__TOC__
-
This is a template page. READ THESE INSTRUCTIONS.
+
-
</div>
+
-
<div id="instructions" style="text-align: center; font-weight: normal; font-size: small; color: #f6f6f6; padding: 5px;">
+
-
You are provided with this team page template with which to start the iGEM season.  You may choose to personalize it to fit your team but keep the same "look." Or you may choose to take your team wiki to a different level and design your own wiki.  You can find some examples <a href="https://2008.igem.org/Help:Template/Examples">HERE</a>.
+
-
</div>
+
-
<div id="warning" style="text-align: center; font-weight: bold; font-size: small; color: #f6f6f6; padding: 5px;">
+
-
You <strong>MUST</strong>  have all of the pages listed in the menu below with the names specified.  PLEASE keep all of your pages within your teams namespace. 
+
-
</div>
+
-
</div>
+
-
</html>
+
-
 
+
-
<!-- *** End of the alert box *** -->
+
-
 
+
-
 
+
-
If you choose to include a '''Modeling''' page, please write about your modeling adventures here.  This is not necessary but it may be a nice list to include.
+
-
 
+
-
== Modeling ==
+
-
==Homology Modeling==
+
-
Although our proteins have been functionally described in the literature and during the iGEM competition, no structures are available in the Protein Data Bank (PDB). Protein structures are indispensable for further work and visualization, so we used YASARA Structure [1] to calculate three-dimensional structures of the proteins used in our iGEM project.
+
-
 
+
-
===Workflow===
+
-
Our YASARA script calculates a homology model as follows [7]:
+
-
[[File:Aln+pnB.png|Alignment with a homology model|right|500px]]
+
-
# The target sequence is PSI-BLASTed against UniProt [2].
+
-
# A position-specific scoring matrix (PSSM) is calculated from the related sequences.
+
-
# The PSSM is used to search the PDB for potential modeling templates.
+
-
# The templates are ranked by alignment score and structural quality [3].
+
-
# Additional information is derived for template and target (secondary structure prediction, structure-based alignment correction using SSALN scoring matrices) [4].
+
-
# A graph of the side-chain rotamer network is built, and dead-end elimination is used to find an initial rotamer solution in the context of a simple repulsive energy function [5].
+
-
# The loop network is optimized using a large number of different orientations.
+
-
# Side-chain rotamers are fine-tuned considering electrostatic and knowledge-based packing interactions as well as solvation effects.
+
-
# An unrestrained high-resolution refinement with explicit solvent molecules is run, using the latest knowledge-based force fields [6].
+
-
===Application===
+
-
All of these steps are performed for every template used in the modeling approach. For our project we set the maximum number of templates to 20. Every derived structure is evaluated using its average per-residue quality Z-score. Finally, a hybrid model is built that contains the best regions of all predictions. This procedure makes the predictions more accurate and thus more realistic.
+
-
===Results===
+
-
 
+
-
====PnB-Esterase====
+
-
====AroY====
+
-
====TphA1====
+
-
====TphA2====
+
-
====TphA3====
+
-
====TphA2====
+
-
 
+
-
===References===
+
-
[1] E. Krieger, G. Koraimann, and G. Vriend, “Increasing the precision of comparative models with YASARA NOVA--a self-parameterizing force field.,” Proteins, vol. 47, no. 3, pp. 393–402, 2002.
+
-
 
+
-
[2] S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman, “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.,” Nucleic Acids Res, vol. 25, no. 17, pp. 3389–3402, Sep. 1997.
+
-
 
+
-
[3] R. W. Hooft, G. Vriend, C. Sander, and E. E. Abola, “Errors in protein structures.,” Nature, vol. 381, no. 6580. Nature Publishing Group, p. 272, 1996.
+
-
 
+
-
[4] D. T. Jones, “Protein secondary structure prediction based on position-specific scoring matrices,” Journal of Molecular Biology, vol. 292, no. 2, pp. 195–202, 1999.
+
-
 
+
-
[5] A. A. Canutescu, A. A. Shelenkov, and R. L. Dunbrack, “A graph-theory algorithm for rapid protein side-chain prediction.,” Protein Science, vol. 12, no. 9, pp. 2001–2014, 2003.
+
-
 
+
-
[6] E. Krieger, K. Joo, J. Lee, J. Lee, S. Raman, J. Thompson, M. Tyka, D. Baker, and K. Karplus, “Improving physical realism, stereochemistry, and side-chain accuracy in homology modeling: Four approaches that performed well in CASP8.,” Proteins, vol. 77 Suppl 9, no. June, pp. 114–122, 2009.
+
-
 
+
-
[7] http://www.yasara.org/homologymodeling.htm
+
-
 
+
-
==Information Theoretical Analysis==
+
-
===Information Theory===
+
-
====Entropy====
+
-
Claude Shannon introduced a measure of the uncertainty of a random variable X. This measure is called Shannon's entropy H [1] and is expressed in bits if the logarithm to base 2 is used. Here p(x) denotes the probability mass function of X.
+
-
 
+
-
[[File:DKL.png|Entropy|center|300px]]
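The entropy definition can be illustrated with a minimal Python sketch. It assumes that p(x) is estimated from the relative frequencies of the observed symbols; the function name <code>shannon_entropy</code> and the toy inputs are hypothetical and not part of our pipeline.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy H(X) in bits.

    Assumption: p(x) is estimated as the relative frequency of each
    symbol in the observed sequence.
    """
    counts = Counter(symbols)
    total = len(symbols)
    # H = sum_x p(x) * log2(1 / p(x)), written with counts to avoid -0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Four equally likely symbols carry 2 bits; a constant column carries none.
h_uniform = shannon_entropy("ACGT")
h_constant = shannon_entropy("AAAA")
```

A column with four equally frequent symbols reaches the maximum of 2 bits for a four-letter alphabet, while a fully conserved column has zero entropy.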
+
-
 
+
-
====Mutual Information====
+
-
In information theory, mutual information (MI) measures the correlation between two random variables X and Y. H(X) and H(Y) are the Shannon entropies of X and Y, and H(X, Y) is their joint (two-point) entropy. The MI quantifies the amount of information gained about X by knowing Y, and vice versa.
+
-
[[File:MI.png|Mutual Information|center|300px]]
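As a small illustration, the following Python sketch computes the MI of two paired observation series from the three entropies. It assumes the standard identity MI(X;Y) = H(X) + H(Y) - H(X,Y) and frequency-based probability estimates; the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(observations):
    """Shannon entropy in bits, estimated from relative frequencies."""
    counts = Counter(observations)
    n = len(observations)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def mutual_information(xs, ys):
    """MI(X;Y) = H(X) + H(Y) - H(X,Y) in bits for paired observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    joint = list(zip(xs, ys))  # each pair (x, y) is one joint observation
    return entropy(xs) + entropy(ys) - entropy(joint)


# Perfectly coupled series share 1 bit; independent series share none.
mi_dependent = mutual_information("AACC", "GGTT")
mi_independent = mutual_information("AACC", "GTGT")
```

In the first toy pair, knowing X determines Y completely, so the MI equals the entropy of either variable. In the second pair, every combination occurs equally often and the MI is zero.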
+
-
===Application of MI to Sequence Alignments===
+
-
It is well known that the MI can be used to measure co-evolution signals in multiple sequence alignments (MSAs) [2] [3]. An MSA compares three or more sequences and is used to investigate the functional or evolutionary homology of amino acid or nucleotide sequences. The MI of an MSA can be computed with the following equation, which is derived from the Kullback-Leibler divergence (DKL):
+
-
 
+
-
[[File:MI_DKL.png|center|350px|DKL]]
+
-
 
+
-
where p(x) and p(y) are the relative frequencies of the symbols in columns X and Y of the MSA, p(x, y) is the joint frequency with which the symbols xi and yj occur together, and Q is the set of symbols of the corresponding alphabet (DNA or protein). The result of these calculations is a symmetric matrix M that contains the MI values for every pair of columns in the MSA. Two columns that depend on each other show a high MI value.
+
-
 
+
-
[[File:Ali2MI.png|700px|center]]
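The construction of the matrix M can be sketched in a few lines of Python. This is a minimal illustration, not our production code: it assumes the Kullback-Leibler form MI = Σ p(x, y) log2( p(x, y) / (p(x) p(y)) ), treats every character (including gaps) as an ordinary symbol, and uses a hypothetical four-sequence toy alignment.

```python
import math
from collections import Counter


def column_mi(col_x, col_y):
    """MI of two alignment columns in bits (Kullback-Leibler form)."""
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) / (p(x) p(y)) = (c/n) / ((cx/n) * (cy/n)) = c*n / (cx*cy)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def mi_matrix(msa):
    """Symmetric matrix M with the MI of every pair of MSA columns."""
    columns = list(zip(*msa))  # transpose: rows are sequences
    length = len(columns)
    matrix = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(i, length):
            matrix[i][j] = matrix[j][i] = column_mi(columns[i], columns[j])
    return matrix


# Hypothetical toy alignment: columns 0 and 1 co-vary, column 2 does not.
toy_msa = ["AGC", "AGT", "CTC", "CTT"]
M = mi_matrix(toy_msa)
```

In the toy alignment the first two columns always change together and therefore share 1 bit, whereas the third column is independent of both. The diagonal entries equal the entropy of the respective column.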
+
-
 
+
-
===Normalisation===
+
-
A standard score (Z-score) indicates by how many standard deviations a value differs from the mean of its distribution. MI-dependent Z-scores can be calculated with a shuffle-null model, in which the symbols within each MSA column are shuffled so that all dependencies between column pairs are eliminated. The expectation value of the shuffle-null model is denoted by E(Mi j) and the corresponding variance by Var(Mi j) [4].
+
-
 
+
-
[[File:z_score.png|center|250px|Z_score]]
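The shuffle-null model can be approximated numerically. The Python sketch below assumes the usual form Z = (M − E(M)) / sqrt(Var(M)) and estimates E(M) and Var(M) by Monte Carlo shuffling of one column; the cited method [4] may obtain these quantities differently. The function names, the number of shuffles and the fixed seed are hypothetical choices for illustration.

```python
import math
import random
from collections import Counter


def column_mi(col_x, col_y):
    """MI of two alignment columns in bits (Kullback-Leibler form)."""
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    return sum(
        (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def mi_z_score(col_x, col_y, n_shuffles=200, seed=0):
    """Z-score of the observed MI against a shuffle-null model.

    Assumption: E(M) and Var(M) are estimated from n_shuffles random
    permutations of col_y, which destroys any dependency on col_x
    while keeping the symbol frequencies of both columns unchanged.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = column_mi(col_x, col_y)
    shuffled = list(col_y)
    null_values = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null_values.append(column_mi(col_x, shuffled))
    mean = sum(null_values) / n_shuffles
    variance = sum((m - mean) ** 2 for m in null_values) / n_shuffles
    if variance == 0.0:
        return 0.0  # no spread in the null model, Z-score is undefined
    return (observed - mean) / math.sqrt(variance)


# Hypothetical toy columns of 20 sequences that co-vary perfectly.
col_a = "AAAAACCCCC" * 2
col_b = "GGGGGTTTTT" * 2
z = mi_z_score(col_a, col_b)
```

Because shuffling preserves the composition of each column, the null distribution captures the MI expected from finite-sample effects alone, and a strongly positive Z-score indicates a dependency beyond that background.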
+
-
 
+
-
==Docking Simulations==
+
-
== Gaussian network model ==
+
-
===Theory===
+
-
Nearly all biologically important processes, such as enzyme catalysis, ligand binding and allosteric regulation, occur on long time scales (microseconds to milliseconds). A Gaussian network model (GNM) is a coarse-grained representation of a protein as a network of balls and springs. In our approach, each ball corresponds to the Cα atom of one residue [1]. While molecular dynamics (MD) simulations are computationally expensive, a GNM calculation takes only a few seconds.
+
-
 
+
-
===Computation===
+
-
The dynamics of the structure in the GNM are described by the topology of contacts, which is encoded in the Kirchhoff matrix Γ (Gamma). In this network of N interacting sites, the elements of Γ are computed as:
+
-
 
+
-
[[File:GNM.formel.png|500px|center]]
+
-
 
+
-
where Rij is the distance between sites i and j. The Kirchhoff matrix Γ (Gamma) serves as the contact matrix of the Cα atoms, and its inverse describes the correlations between fluctuations in the protein's native state. Each diagonal element of the matrix is set to the number of contacts that the corresponding Cα atom makes within the whole protein. We calculated the normal modes of the protein by singular value decomposition (SVD). Slow modes highlight functionally relevant residues within a biomolecule [2]. Fast modes, in contrast, represent uncorrelated motions without significant changes in the structure.
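The construction of the Kirchhoff matrix can be sketched in Python as follows. This is an illustration only and not the BioPhysConnectoR implementation: it assumes the common GNM convention that two Cα atoms are in contact when their distance does not exceed a cutoff, and the cutoff of 7 Å, the function name and the toy coordinates are hypothetical choices.

```python
import math


def kirchhoff_matrix(ca_coords, cutoff=7.0):
    """Kirchhoff (contact) matrix of a GNM.

    Assumptions: off-diagonal elements are -1 if the distance R_ij is
    within the cutoff (in Angstrom) and 0 otherwise; each diagonal
    element is the number of contacts of that C-alpha atom.
    """
    n = len(ca_coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
    for i in range(n):
        # Diagonal = negative sum of the off-diagonal row entries
        gamma[i][i] = -sum(gamma[i][j] for j in range(n) if j != i)
    return gamma


# Hypothetical toy chain: four C-alpha atoms spaced 3.8 Angstrom apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(toy_coords)
```

For the toy chain only direct neighbours are in contact, so the matrix is tridiagonal and every row sums to zero. The normal modes would then follow from an eigenvalue or singular value decomposition of this matrix with a linear algebra library.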
+
-
 
+
-
[[File:GNM.gif|700px|center|Structure to GNM]]
+
-
 
+
-
A recent examination of the X-ray crystallographic B-factors of over 100 proteins showed that the GNM closely reproduces the experimental data [3].
+
-
===Application to our Proteins===
+
-
We computed the GNM in [https://2012.igem.org/Team:TU_Darmstadt/Protocols/R_Programming R] [4] using the BioPhysConnectoR [5] library.
+
-
 
+
-
* pnB-Esterase
+
-
* Fusarium solani cutinase
+
-
 
+
-
===References===
+
-
 
+
-
[1] A. R. Atilgan, S. R. Durell, R. L. Jernigan, M. C. Demirel, O. Keskin, and I. Bahar, “Anisotropy of fluctuation dynamics of proteins with an elastic network model.,” Biophys J, vol. 80, no. 1, pp. 505–515, Jan. 2001.
+
-
 
+
-
[2] C. Chennubhotla, A. J. Rader, L.-W. Yang, and I. Bahar, “Elastic network models for understanding biomolecular machinery: from enzymes to supramolecular assemblies.,” Physical Biology, vol. 2, no. 4, pp. S173–S180, 2005.
+
-
 
+
-
[3] I. Bahar and A. J. Rader, “Coarse-grained normal mode analysis in structural biology.,” Current Opinion in Structural Biology, vol. 15, no. 5, pp. 586–592, 2005.
+
-
 
+
-
[4] R. D. C. Team, “R: A Language and Environment for Statistical Computing.” Vienna, Austria, 2008.
+
-
 
+
-
[5] F. Hoffgaard, P. Weil, and K. Hamacher, “BioPhysConnectoR: Connecting sequence information and biophysical models.,” BMC Bioinformatics, vol. 11, p. 199, 2010.
+
-
 
+
-
==Molecular Dynamics==
+
 +
{| align="center"
 +
| [[File:GNM_ic.gif|200px|link=https://2012.igem.org/Team:TU_Darmstadt/Modeling_GNM]]
 +
|}__TOC__
 +
{| align="right"
 +
| [[File:docking_ic.gif|250px|link=https://2012.igem.org/Team:TU_Darmstadt/Modeling_Docking]]
 +
|}__TOC__
-
Svens sandbox...
+
{| align="left"
 +
| [[File:IT_ic.gif|250px|link=https://2012.igem.org/Team:TU_Darmstadt/Modeling_IT ]]
 +
|}__TOC__

Latest revision as of 12:47, 26 September 2012