Protein–protein Interface Hot Spots

LITERATURE REVIEW
Protein-protein interactions
Proteins cooperate with other proteins to perform biological functions. They physically interact with their partners through their interfaces, which are held together by weak, non-covalent bonds. Understanding how proteins interact is crucial for predicting unknown protein-protein interactions (PPIs) and can aid drug discovery. To discriminate the interface region from the rest of the surface, a myriad of studies have analyzed characteristics of the interface region such as hydrophobicity, solvent accessibility, shape and electrostatic complementarity, evolutionary conservation, flexibility, residue propensities and hydrogen bonding [1–3].
Hot Spots
Studies on protein interfaces have revealed that the free energy contributions of interfacial residues to binding are not uniformly distributed. A small subset of interfacial residues, named hot spots, accounts for the majority of the binding free energy [4, 5]. Alanine scanning mutagenesis is an experimental method based on the premise that a residue is a hot spot if mutating it to alanine causes a significant change in the binding free energy (∆∆G). Hot spot information from experimental studies is deposited in several databases, such as the Alanine Scanning Energetics database (ASEdb) [6] and the Binding Interface Database (BID) [7]. Unfortunately, experimental determination of hot spots is time-consuming, labor-intensive and expensive. Therefore, there is a clear need for computational methods to identify hot spots [8].
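The quantity measured in alanine scanning can be written out explicitly. The text does not state a sign convention, so the form below is an assumption that follows common usage: the binding free energy of the alanine mutant minus that of the wild-type complex.

```latex
\Delta\Delta G = \Delta G_{\mathrm{bind}}^{\mathrm{mut}} - \Delta G_{\mathrm{bind}}^{\mathrm{wt}}
```

Under this assumed convention, a positive ∆∆G means that the mutation weakens binding, and a large positive value marks a candidate hot spot.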
Characteristics of Hot Spots
Bogan and Thorn reported that hot spots are enriched in tryptophan (Trp), arginine (Arg) and tyrosine (Tyr). On the other hand, residues such as leucine (Leu), methionine (Met), serine (Ser), threonine (Thr) and valine (Val) are disfavored. Hot spots are surrounded by a ring of residues that are energetically less important and occlude bulk solvent from the hot spots (the “O-ring” theory) [6]. The “double water exclusion” theory refines the O-ring theory and proposes that hot spots themselves are water-free [9].
Keskin et al. showed that computational hot spots are not homogeneously distributed along protein interfaces; rather, they are clustered within locally tightly packed regions called “hot regions” [10]. Cho et al. reached a similar conclusion by showing that hot spots have significantly higher atomic packing density [11]. del Sol and O'Meara and Keskin et al. showed that hot spot residues correlate with structurally conserved residues, so that conserved and buried residues are either experimental hot spots or in direct contact with an experimental hot spot [10, 12].
Kozakov et al. have recently demonstrated that hot spots are clustered in specific regions that are distinguishable by their concave topology, and that these regions are patterned with hydrophobic and polar residues [13].
Understanding the microenvironment of hot spots is also important. Ye et al. demonstrated that alanine (A), aspartic acid (D), glycine (G), histidine (H), isoleucine (I), asparagine (N), serine (S) and tyrosine (Y) are more likely to occur close to hot spots than to non-hot spots.
A hot spot is generally defined as a residue that causes a significant change in the binding free energy when it is mutated to alanine. However, there is no consensus in the literature on what change counts as significant. The most recent studies on hot spot prediction used 1.0 kcal/mol [14, 15] or, more commonly, 2.0 kcal/mol [16–23] as the threshold to differentiate hot spots from non-hot spots, whereas Ofran and Rost chose a threshold of 2.5 kcal/mol to define hot spots and 0 kcal/mol to define non-hot spots [24]. Tuncbag et al. defined hot spots as interface residues with ∆∆G higher than 2.0 kcal/mol and non-hot spots as those with ∆∆G lower than 0.4 kcal/mol. Chen et al. and Xia et al. used the same definition as Tuncbag et al. [25, 26], while Cho et al. used two different cutoff values, 1.0 and 2.0 kcal/mol, in their study [11].
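The effect of these competing definitions can be illustrated with a minimal sketch. The function name `label_residue`, its parameters and the treatment of values exactly at a cutoff are assumptions made for illustration; none of the cited studies is claimed to implement labeling this way.

```python
from typing import Optional


def label_residue(ddg: float,
                  hot_cutoff: float = 2.0,
                  non_hot_cutoff: Optional[float] = None) -> Optional[str]:
    """Label one alanine-scanning measurement (ddg in kcal/mol).

    With a single cutoff, every residue is either a hot spot or a
    non-hot spot. With two cutoffs (for example 2.0 and 0.4), residues
    falling between them are left unlabeled (None).
    Assumption: a value exactly at hot_cutoff counts as a hot spot.
    """
    if non_hot_cutoff is None:
        non_hot_cutoff = hot_cutoff  # single-threshold definition
    if ddg >= hot_cutoff:
        return "hot spot"
    if ddg < non_hot_cutoff:
        return "non-hot spot"
    return None  # in the gap between the two cutoffs


# The same hypothetical measurement (1.2 kcal/mol) under three definitions
single_2 = label_residue(1.2, hot_cutoff=2.0)
single_1 = label_residue(1.2, hot_cutoff=1.0)
two_cutoffs = label_residue(1.2, hot_cutoff=2.0, non_hot_cutoff=0.4)
```

A residue with ∆∆G of 1.2 kcal/mol is a non-hot spot under the 2.0 kcal/mol threshold, a hot spot under the 1.0 kcal/mol threshold, and is excluded from the data set under the two-cutoff definition, which is why results from studies using different thresholds are hard to compare directly.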
Hot Spot Prediction
In recent years, several computational methods have been developed to predict protein–protein interface hot spots. Some of these methods are energy-based, such as Robetta [27] and the FOLDEF function [28], while others are based on molecular dynamics simulations [29, 30].
Ofran and Rost developed a knowledge-based method (ISIS) based on neural networks with features extracted from the sequence environment and evolutionary profile, and used it to predict hot spots [24, 31]. Darnell et al. combined a decision tree approach based on atomic contacts, physicochemical properties, and shape specificity with computational alanine scanning [17]. Grosdidier and Fernandez-Recio developed a method that predicts interface hot spots using protein-docking tools without knowledge of the protein complex [32]. Recently, Wang et al. presented a structure-based computational approach that determines hot spots by docking a compound known to inhibit the specific protein-protein interaction [16].
The feature-based method of Guney et al. identified hot spots using solvent accessible surface areas, residue conservation and residue propensity [33], and Tuncbag et al. presented an empirical feature-based method that combines solvent accessibility and statistical pairwise residue potentials [34, 35].
Lise et al. combined the strengths of machine learning and energy-based methods by using basic energy terms as input features for machine learning models such as Support Vector Machines (SVMs) and Gaussian Processes [19, 36]. In their recent study, they integrated their approach with two additional SVM classifiers to overcome limitations on predictions involving arginine or glutamic acid residues [19].
Machine learning approaches have also been developed for hot spot prediction. Cho et al. proposed a hot spot prediction method based on protein structure, sequence and molecular interaction that used decision trees for feature selection and SVMs for classification [11]. Xia et al. also employed SVM classifiers with features such as protrusion index and solvent accessibility to predict hot spots [25]. Assi et al. applied Bayesian networks with energetic, structure-based and sequence-based features to predict hot spots [20]. Chen et al. generated a sequence-based SVM model that utilized physicochemical features, the Position Specific Scoring Matrix (PSSM), evolutionary conservation score, and sequence entropy [18]. Zhu and Mitchell built two knowledge-based hot spot prediction methods (KFC2a and KFC2b) based on different feature combinations using SVMs [21]. Wang et al. proposed a random forest model that took hybrid features defined by each residue itself and its interacting residues on the opposite chain [37]. Chen et al. predicted hot spots using the IBk algorithm as a classifier with only physicochemical characteristics of residues [26]. Furthermore, Ye et al. constructed an SVM model using network features and microenvironment features [38].
Ozbek et al. applied the Gaussian Network Model (GNM) to both unbound and complex structures to predict hot spots [39].

Cell-cell Interaction in Embryo Development

The formation of the vulva depends upon a second round of cell-cell interaction. The anchor cell (located in the gonad) and six precursor cells (located in the skin adjacent to the gonad) are involved in this interaction. The precursor cells are collectively called Pn.p cells and are named P3.p to P8.p. The fate of each of these cells is determined by its position relative to the anchor cell. The developmental pathways of these cells are presented in Fig. 11.8.
During the third larval stage, the lin-3 gene is activated in the anchor cell and produces the signal protein LIN-3, which is related to vertebrate epidermal growth factor (EGF). The precursor cells express a receptor encoded by the let-23 gene, which is homologous to the vertebrate EGF receptor. The binding of LIN-3 to the LET-23 receptor triggers a series of intracellular events that determines whether a precursor cell will form the primary vulval precursor cell or secondary vulval cells. In let-23 mutants the signal is not received, the Pn.p cells cannot respond, and the vulva is not formed.
Usually the P6.p cell, the closest cell to the anchor cell, receives the strongest signal initiated by LIN-3 binding to LET-23. This signal activates expression of the Vulvaless (Vul) gene (named for its mutant phenotype) in P6.p, which then divides three times to produce vulva cells. The two neighbouring cells, P5.p and P7.p, receive a weaker signal and divide asymmetrically to form additional vulva cells.
Thereafter, a third level of cell-cell interaction occurs, in which the primary vulval cell P6.p sends a signal that activates the lin-12 gene in the P5.p and P7.p cells. This signal prevents these cells from adopting the division pattern of the primary cell. Thus, cells in which both Vul and lin-12 are active cannot become primary vulva cells. On the other hand, the P3.p, P4.p, and P8.p cells do not receive any signal from the anchor cell, but express the Multivulva (Muv) gene. The Muv gene product represses expression of the Vul gene, and these cells develop as skin cells. Thus, three levels of cell-cell interaction are involved in the developmental pathway leading to vulva formation in the nematode C. elegans.
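The position-dependent logic described above can be summarized as a toy rule-based model. This is only an illustrative sketch: the function `assign_fates`, its parameters and the fate labels are hypothetical names, and treating every cell as a skin cell when let-23 is non-functional is an assumption (the text states only that no vulva forms).

```python
# Toy model: each Pn.p cell's fate depends on its distance from the cell
# that lies closest to the anchor cell (P6.p in the usual case).
PRECURSORS = ["P3.p", "P4.p", "P5.p", "P6.p", "P7.p", "P8.p"]


def assign_fates(closest_to_anchor="P6.p", let23_functional=True):
    """Return a dict mapping each Pn.p cell to 'primary', 'secondary' or 'skin'."""
    if not let23_functional:
        # Assumption: without the LET-23 receptor no cell responds to LIN-3,
        # so every precursor is treated as adopting the skin fate.
        return {cell: "skin" for cell in PRECURSORS}
    center = PRECURSORS.index(closest_to_anchor)
    fates = {}
    for i, cell in enumerate(PRECURSORS):
        distance = abs(i - center)
        if distance == 0:
            fates[cell] = "primary"    # strongest LIN-3 signal
        elif distance == 1:
            fates[cell] = "secondary"  # weaker signal, lin-12 activated by the primary cell
        else:
            fates[cell] = "skin"       # no signal; Muv represses Vul
    return fates


wild_type = assign_fates()
receptor_mutant = assign_fates(let23_functional=False)
```

In the wild-type call, P6.p is primary, P5.p and P7.p are secondary, and P3.p, P4.p and P8.p become skin cells, matching the pattern described in the text. A real embryo responds to graded signal strength rather than to integer distances, so the discrete rule is a simplification.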
Cell-cell interaction is an important phenomenon in the development of the embryo in eukaryotic organisms. Animals use a number of signalling pathways to regulate development after organogenesis. Signal networks establish anterior-posterior polarity and the body axis, coordinate pattern formation, and direct the differentiation of tissues and organs.
One of the most widely studied cell-cell interaction systems is the Notch signalling pathway, named after the Drosophila mutants that were used to identify its components. The Notch gene encodes a transmembrane signal receptor (Fig. 11.9). The signal itself is a transmembrane protein called “Delta”, encoded by the gene Delta. The Notch signal system works only between adjacent cells. First, the Delta protein binds to the Notch receptor, which triggers cleavage of the cytoplasmic tail of the Notch protein; the cleaved tail then moves to the nucleus, where it binds to a protein encoded by the Su(H) (Suppressor of Hairless) gene. Following this, a set of genes is activated that controls a specific developmental pathway directing cell fate.
One of the roles of the Notch signal system is to specify the fate of equivalent cells in a population. Thus Notch signalling may direct two neighbouring, developmentally equivalent cells towards different developmental pathways. Four members of the Notch family (Notch 1 to Notch 4) have been identified in humans. Several human developmental disorders have been related to mutations in these genes. These include Alagille syndrome (AGS), spondylocostal dysostosis (SD), and lymphoblastic leukemia.
Stem cells are undifferentiated cells that are capable of differentiating into different types of specialized cells. Stem cells come from two main sources: embryos at the blastocyst stage of embryological development (embryonic stem cells) and adult tissues (adult stem cells). These cells are generally characterized by their potential to differentiate into different cell types, for example muscle, blood, skin and bone.
A human embryo in the blastocyst phase of development (4-5 days old) is an excellent source of embryonic stem cells. Sexual reproduction begins with the formation of a single-cell zygote through the fusion of a sperm with an egg. A series of mitotic divisions of the zygote then produces a cell mass of approximately 12-16 cells (the morula), which develops into the blastocyst before it is implanted in the uterus (4-6 days old). The blastocyst consists of an inner cell mass (embryoblast) and an outer cell mass (trophoblast). The trophoblast becomes part of the placenta, and the cells of the embryoblast differentiate into all the structures of the adult organism. The embryoblast is the source of embryonic stem cells, which are pluripotent. During normal pregnancy, the embryonic stage continues until the end of the tenth week of gestation.
When embryonic stem cells are extracted from the blastocyst and placed in a culture medium (a nutrient-rich broth) in culture vessels, they divide and replicate but fail to differentiate. This happens because the stimulation needed for differentiation, which is present in vivo, is lacking in vitro. However, they retain their ability to differentiate into the different types of cells in the human body.
Adult or somatic stem cells are present throughout the body within different types of tissue, even after embryonic development. Tissues such as bone marrow, blood, blood vessels, brain, skeletal muscle, skin and liver are good sources of adult stem cells. These cells remain in a resting state for years until activated by disease or tissue injury. Adult stem cells can divide and self-renew, which enables them to regenerate an entire organ. Earlier it was believed that adult stem cells could differentiate only into the cell types of their originating tissue or organ, but some recent evidence suggests that they can differentiate into other cell types as well.
Embryonic stem cells are easier to grow in vitro than adult stem cells. For culturing, stem cells are extracted either from adult tissues or from dividing zygotes. Once isolated, they can be grown in culture dishes containing culture broth under controlled conditions. The nutrient broth allows them to divide and replicate but prevents them from specializing or differentiating further. Once the stem cells are proliferating successfully, they are subcultured on fresh medium to enhance the growth rate. The collection of healthy, dividing, undifferentiated stem cells obtained after the first subculture is called a stem cell line. Once established, these stem cell lines can be stimulated to differentiate into specialized cells, a process known as directed differentiation. Based on their potential to differentiate into other types of cells, stem cells are classified into the following categories.
Totipotent: cells that are able to differentiate into all possible cell types. Example: the few cells produced by the initial divisions of the zygote.
Pluripotent: cells that are able to differentiate into almost all cell types. Example: embryonic stem cells, which can give rise to cells of the endodermal, mesodermal, and ectodermal germ layers.
Multipotent: cells that are able to differentiate into a closely related family of cells. Example: hematopoietic stem cells, which have the potential to form red and white blood cells and platelets.
Oligopotent: cells that are able to differentiate into only a few cell types. Example: lymphoid and myeloid stem cells.
Unipotent: cells that are able to produce only cells of their own type, but have the property of self-renewal. Example: adult mouse stem cell.
To identify stem cells, it is important to establish that they are undifferentiated and capable of self-renewal. These two properties are normally checked through laboratory tests. Bone marrow or hematopoietic stem cells (HSCs) are tested by transplanting them into an individual from which the HSCs have been removed. The production of new blood and immune cells in that individual indicates the self-renewal potency of the stem cells. A clonogenic assay (a laboratory procedure) is also used to test the potency of stem cells. Routine examination of the chromosomes can also be done to check whether the cells are healthy and undifferentiated. Sometimes spontaneous or induced differentiation of embryonic stem cells under cell culture conditions indicates their pluripotent nature. Other tests include injecting stem cells into an immunosuppressed mouse and observing it for the formation of a teratoma, which is a benign tumour containing a mixture of differentiated cells.
Applications of Stem Cells
It is important to note that every cell and tissue in the body develops and differentiates from the few initial stem cells that form during the early stages of embryological development. Therefore, embryonic stem cells can be induced to differentiate into any other type of cell. Because of this regenerative potential, researchers have used stem cells to regenerate damaged tissues and organs under the right conditions. Usually, damaged organs are replaced by healthy donated organs, but demand far exceeds supply. A particular type of tissue or organ could potentially be developed from stem cells if they are directed to differentiate in a certain way. For example, stem cells that lie just beneath the skin have been used to regenerate new skin tissue, which was then grafted successfully onto burn victims.
Another potential application is the replacement of cells and tissue to treat brain diseases such as Parkinson’s and Alzheimer’s. If the damaged tissue can be replenished by specialized tissue derived from stem cells, such diseases may become treatable.
In the near future it may be possible to transplant healthy heart cells, developed in a laboratory from stem cells, into patients with heart disease, thereby repopulating the heart with healthy tissue. Similarly, it may be possible to replace damaged pancreatic cells with insulin-producing cells derived from stem cells to treat type 1 diabetic patients.
Adult hematopoietic stem cells found in bone marrow and blood have been used to treat diseases such as leukemia, sickle cell anemia and other immunodeficiencies. All types of blood cells (erythrocytes as well as leukocytes) can be developed from HSCs. However, it is difficult to isolate hematopoietic stem cells from the bone marrow. Alternatively, hematopoietic cells are also found in the umbilical cord and placenta, from which they can be isolated easily. In recognition of this potential, umbilical cord blood banks have been established to store these cells for future use.
Therapeutic cloning, or somatic cell nuclear transfer (SCNT), involves transferring the genetic material of a somatic cell (say, a skin cell) into an unfertilized egg cell, replacing the egg's own genetic material, in order to develop patient-specific stem cells. Since sperm are not involved in this procedure, fertilization does not occur. A foetus is also not involved, because the group of cells from which the stem cells are obtained is not implanted in the uterus.
Stem cells developed through SCNT have greater potential for therapeutic applications. The chances of rejection by the patient’s body are lower because their genetic makeup is identical to the patient’s. Through SCNT, disease-specific cell lines can be developed and used for in vitro studies to understand the mechanism of disease development and the mode of action of drugs that may be used to treat these diseases.
Stem cell research is also useful for understanding human development after formation of the fertilized zygote. Undifferentiated stem cells eventually differentiate, partly because particular genes are turned on or off. Thus, research on stem cells may help to clarify the role that specific genes play in determining how specialized cells and tissues are formed.
Stem cell research is also being pursued to develop new drugs. Healthy human tissues developed from stem cells can be used to evaluate the effects of new drugs instead of human volunteers.
Table 11.1. Segmentation gene loci in Drosophila
“Gap” genes
“Pair-rule” genes: even-skipped, fushi tarazu, odd-paired, odd-skipped, sloppy-paired
“Segment polarity” genes: cubitus interruptus
Figure Captions
Fig. 11.1. Early stages of embryonic development in Drosophila. A cascade of gene activation sets up the Drosophila body plan. The maternal-effect genes, such as bicoid and nanos, are active during oogenesis. The products of these genes are found in the egg at the time of fertilization and form morphogen gradients. These proteins function as transcription factors that regulate the expression of the gap genes. The gap genes are responsible for differentiation along the anterior-posterior axis of the embryo. Proteins encoded by the gap genes also function as transcription factors and regulate the expression of the pair-rule genes. The pair-rule genes are responsible for the differentiation of pairs of segments in the embryo. Transcription factors encoded by the pair-rule genes regulate the expression of the segment polarity genes. The expression of the segment polarity genes leads to the development of the anterior-posterior axis of each segment. The gap genes, pair-rule genes, and segment polarity genes are collectively involved in segment patterning; hence they are known as segmentation genes.
Fig. 11.2. The hierarchy of genes involved in establishing the segmented body plan in Drosophila. Gene products from the maternal genes regulate the expression of the first three groups of zygotic genes (gap, pair-rule, and segment polarity, collectively called the segmentation genes), which in turn control the expression of the homeotic genes.
Fig. 11.3. Progressive restriction of cell fate during development in Drosophila.
Fig. 11.4. Overlapping regions containing two different gene products can generate new patterns of gene expression. Transcription factors A and B are both present in region 3, where their domains of expression overlap. If both transcription factors must bind to the promoter of a target gene to trigger expression, the gene will be active only in cells containing both factors (most likely in the zone of overlap). There will be no transcription in regions 1 and 2, where only one of the factors is present.
Fig. 11.5. Cell arrangement in the floral meristem. (a) The four concentric rings, or whorls, labeled 1-4, influenced by genes A, B, and C in the manner shown, give rise to the sepals, petals, stamens and carpels, respectively. (b) The arrangement of these organs in the mature flower.
Fig. 11.6. A truncated cell lineage chart for C. elegans, showing early divisions and the tissues and organs that eventually result. Each vertical line represents a cell division, and horizontal lines connect the two cells produced.
Fig. 11.7. An adult Caenorhabditis elegans hermaphrodite.
Fig. 11.8. Cell lineage determination in C. elegans vulva formation.
Fig. 11.9. Components of the Notch signalling pathway in Drosophila.