It is extremely useful to be able to predict the three-dimensional ("tertiary") structure of a protein when we know only its linear sequence of amino acids (the "primary" structure). Determining the three-dimensional structure by experiment is very tedious. The protein must be made and purified; its tertiary structure is then solved either by XRD (X-ray diffraction), which requires crystallizing the protein first, or by NMR (nuclear magnetic resonance). This takes a very long time. It is important to know the tertiary structure, because it tells us how the protein works in the body or in nature. If the shape of a protein is changed, sometimes only by a little, it may not "work" at all.
We know the primary structures of all the proteins in the human body, but we are still far from knowing all of their tertiary structures. There is therefore a lot of knowledge that we still need to acquire.
If we could determine these tertiary structures by computer, we could learn much faster. It would also allow us to "design" proteins "from scratch" that have never existed before.
Even for a computer, this problem is often too complex, so we use simpler models. One simplification is to assume that the protein folds into a cubic shape. Another is to assume that there are only two, or maybe five, different amino acids, rather than the twenty found in nature. Many proteins have hundreds of amino acids in total, so we can begin with imaginary proteins that have only 27, which is exactly enough to fill a 3 x 3 x 3 cubic lattice with one amino acid per site.
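To give a feel for how much these simplifications shrink the problem, the short sketch below counts the possible sequences of 27 amino acids for alphabets of two, five, and twenty letters. The function name `sequence_space` is only illustrative, and the count covers sequences alone, not the ways each sequence could fold.

```python
def sequence_space(alphabet_size: int, length: int = 27) -> int:
    """Number of distinct sequences of `length` letters over the alphabet."""
    return alphabet_size ** length


if __name__ == "__main__":
    # 27 residues: one per site of a 3 x 3 x 3 lattice.
    for label, size in [("two-letter code", 2),
                        ("five-letter code", 5),
                        ("twenty natural amino acids", 20)]:
        print(f"{label:>27}: {sequence_space(size):.3e} sequences")
```

Even the two-letter code leaves more than a hundred million sequences of length 27, while the full twenty-letter alphabet gives a number with 36 digits.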
Figure 1 - The ten "best" sequences for the 3 x 3 x 3 lattice with the five-letter code, as determined by minimizing the Z score. Sequence number 1 (at the top) has the lowest Z score, and Z increases going from top to bottom and from left to right.
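The caption does not say how the Z score is computed. A common definition in protein design, assumed here rather than taken from this page, compares the energy of a sequence in its target structure with the mean and standard deviation of its energies in alternative structures. A minimal sketch under that assumption (the name `z_score` and the choice of population standard deviation are illustrative):

```python
from statistics import mean, pstdev


def z_score(target_energy: float, alternative_energies: list[float]) -> float:
    """Assumed definition: (E_target - mean(E_alt)) / std(E_alt).

    A more negative value means the target structure lies further
    below the typical energy of the alternative structures.
    """
    spread = pstdev(alternative_energies)
    if spread == 0:
        raise ValueError("alternative energies must not all be equal")
    return (target_energy - mean(alternative_energies)) / spread
```

Under this definition, "minimizing the Z score" means looking for sequences whose target structure is as far below the alternatives as possible, measured in units of their spread.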
There are 20 amino acids which form proteins, but we can group these amino acids by class. Effectively, we "reduce" the number of amino acids, which simplifies the computational problem. The simplest such scheme divides the amino acids into two classes: polar (P) and hydrophobic (H).
This division is meaningful, since proteins tend to fold so that the polar amino acids are on the outside of the folded protein, that is, in contact with the water that acts as solvent. The non-polar (hydrophobic) amino acids move to the inside of the folded protein, away from the solvent.
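As an illustration of the two-class reduction, the sketch below rewrites a sequence given in the standard one-letter amino acid codes as a string of H and P. Which residues count as hydrophobic differs between published schemes; the set used here (A, V, L, I, M, F, W, C) is an assumption made for this example, not the assignment used on this page.

```python
# Assumed hydrophobic set; other HP assignments place some residues differently.
HYDROPHOBIC = frozenset("AVLIMFWC")
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")  # the 20 one-letter codes


def to_hp(sequence: str) -> str:
    """Reduce a one-letter amino acid sequence to the two-letter HP code."""
    reduced = []
    for residue in sequence.upper():
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown amino acid code: {residue!r}")
        reduced.append("H" if residue in HYDROPHOBIC else "P")
    return "".join(reduced)


if __name__ == "__main__":
    print(to_hp("MKVLA"))  # M, V, L, A are hydrophobic here; K is polar
```

With this assignment, `to_hp("MKVLA")` gives `"HPHHH"`. A five-class scheme could be written the same way with a dictionary that maps each residue to its class letter.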
This HP (two-class) system, while useful, is nevertheless limited. Another scheme (CHMB) divides the amino acids into five classes.
© 2003-20120 by Lawrence T. Sein. All rights reserved.
Send questions to: lseinjr@hotmail.com