IDENTIFICATION OF SIGNIFICANT PATHWAYS RESPONSIBLE FOR AUTISM THROUGH MOLECULAR NETWORK ANALYSIS
YRSN Tejaswini *1, 2, Y. Anurupa Devi 3, Yog Raj Ahuja 1, Udayshanker Araga 5 and Q. Hasan 1, 4
Vasavi Medical and Research Centre 1, Khairatabad, Hyderabad, Andhra Pradesh, India
Jawaharlal Nehru Technological University 2, Hyderabad, Andhra Pradesh, India
School of Health sciences, Karunya University 3, Hyderabad, Andhra Pradesh, India
Kamineni Hospitals 4, LB Nagar, Hyderabad, Andhra Pradesh, India
St. Martins Engineering College 5, Hyderabad, Andhra Pradesh, India
ABSTRACT: Autism is a spectrum of developmental disorders characterized by impairments in social interaction and communication, often accompanied by stereotypical or repetitive behaviours. There are numerous hypotheses regarding the etiology and pathology of autism, but the actual mechanism of the disorder is still unknown. Although a number of rare mutations and dosage abnormalities are specific to autism, these explain no more than 10% of all cases, which makes the problem more complex. In this regard, the shift from a narrow focus on individual candidate genes towards a broader view of affected protein networks and associated biological pathways has gained a significant role. We have used a network biology approach to identify important molecules and pathways that play a significant role in autism, through a molecular interaction map (MIP) containing 248 nodes linked by 892 edges. Our studies elucidate the relationship between the topological properties of the MIP and the role played by molecules in biological systems. Further, by applying graph theory, hub proteins were obtained and the pathways in which they are involved were analyzed. Our results showed links between many signalling pathways, forming 22% in combination with other pathways such as adherens junction. These insights provide useful clues for understanding how, and to what extent, each pathway contributes to the pathophysiology of this heterogeneous disorder.
Keywords: Autism, Molecular network, Significant pathways, Graph theory
INTRODUCTION: Autism is a developmental disorder characterized by impairments in social interaction and communication, often accompanied by stereotypical or repetitive behaviour 1. The condition manifests within the first 3 years of life and, under broader diagnostic criteria, has a prevalence of 60-70 per 10,000 children according to the most recent estimates 2. Although a number of rare mutations and dosage abnormalities are specific to autism, these explain no more than 10% of all cases 3.
There are numerous hypotheses regarding the etiology and pathology of autism, but the actual mechanism of the disorder is still unknown. Since pharmacologic research targeting the interfering symptom domains associated with autism has shown low or medium efficacy, the increasing rates of this disorder every year 4 demand alternative ways of mining clues that could finally provide definite molecular targets.
While the involvement of single gene mutations in individual autism cases cannot be excluded, the concept of a complex genetic model with multiple genes contributing to disease susceptibility remains highly plausible 5. In this regard, the shift from a narrow focus on individual candidate genes towards a broader view of affected gene networks and associated biological pathways has gained a significant role.
Biological networks are being used as a means to decipher the key controllers inside complex networks. These essential nodes, or hubs, may serve as candidate drug targets for developing novel therapies for various diseases. In this approach, diseases can be seen as emerging from a complex network of underlying molecular activity influenced by genes and environment. Indeed, complex networks are a natural way of representing any data with complicated dependency relationships.
Methodology: The genes responsible for autism were collected from the GeneCards database (www.genecards.org), which provides a comprehensive, authoritative compendium of annotative information on the genes responsible for a particular disease, based on a text mining algorithm 6. A specific set of 60 genes scoring above the limit (score > 0.3) and belonging to various categories, such as protein-coding and RNA genes, was selected. The molecular interaction map was built using Cytoscape 7 and its MiMI plugin 8, based on the query genes + nearest neighbours algorithm.
The obtained interactions were cross-validated using the STRING database, which collects and annotates functional interactions 9. ClusterONE (Clustering with Overlapping Neighborhood Expansion), a graph clustering algorithm, was used for detecting protein complexes in protein-protein interaction networks with associated confidence values. The generated modules were then refined based on density (≥ 0.7) and P-value (≤ 0.00). As a further refinement, cytoHubba was used to obtain the best 10 biomolecules based on topology-based scoring methods, which include Degree 10, BottleNeck 11, Edge Percolated Component (EPC) 12, MNC, DMNC and the Double Screening Scheme (DSS). The biomolecules returned in common by at least 5 of these algorithms were extracted and their pathways identified, in order to relate the various pathways to the disease mechanism.
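The "query genes + nearest neighbours" step can be illustrated with a minimal sketch in plain Python. This is not the MiMI plug-in's implementation: the function name `expand_with_neighbours`, the toy gene names and the decision to keep every interaction whose two endpoints end up in the map are assumptions made only for illustration.

```python
def expand_with_neighbours(query_genes, interactions):
    """Return (nodes, edges) for the query genes plus their direct interactors.

    interactions: iterable of (gene_a, gene_b) pairs, treated as undirected.
    """
    query = set(query_genes)
    nodes = set(query)
    # First pass: add every gene that directly interacts with a query gene.
    for a, b in interactions:
        if a in query:
            nodes.add(b)
        if b in query:
            nodes.add(a)
    # Second pass: keep each interaction whose endpoints are both in the map
    # (an assumption; a tool may instead keep only edges touching a query gene).
    edges = {frozenset((a, b)) for a, b in interactions
             if a in nodes and b in nodes and a != b}
    return nodes, edges


# Hypothetical toy interaction list (not taken from MiMI or STRING).
toy_interactions = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_D"),
    ("GENE_D", "GENE_E"),
]
nodes, edges = expand_with_neighbours(["GENE_B"], toy_interactions)
```

With `GENE_B` as the only query gene, the map contains `GENE_A`, `GENE_B` and `GENE_C`; the interaction between `GENE_C` and `GENE_D` is dropped because `GENE_D` is two steps away from the query.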
RESULTS AND DISCUSSION: Autism, as a heterogeneous syndrome, is characterized by impairments in three core domains: social interaction, language and range of interests. Several studies have led to the identification of many autism susceptibility genes, and there has been an increased appreciation of the contribution of de novo and inherited copy number variation to understanding the pathology of the disease. The idea that many genes associated with common pathways are involved was first put forward by Abrahams and Geschwind 13.
Although three decades of research on autism involving twin and family studies support a significant genetic contribution to its pathology, this does not necessarily imply a particular model of genetic transmission or the involvement of a particular gene causing the disorder. On the other hand, the last decade of research has revealed significant genetic heterogeneity, which served as the foundation for the “many genes, common pathway” theory.
Promising strategies such as systems biology approaches are poised to provide additional insights into diseases of this kind, in which heterogeneity, both genetic and phenotypic, is emerging as a dominant theme. One such approach is to model a virtual biological network in which all susceptibility genes are nodes connected to each other through edges. Because the interactions of neighbouring genes with the susceptibility genes can also be important in determining the diseased state, it is better to include those genes that interact directly with the susceptibility genes when building the network.
In this case, it is believed that mutations in the causative gene alter its interactions with neighbouring genes that are required to perform specific biological functions, and that the effects of these alterations become compounded when binding partners downstream of the causative agent are also mutated or otherwise dysfunctional. After overall network analysis of the molecular interaction map, as outlined in the methodology, the proteins implicated in autism were obtained along with their pathways (Table 6). All of these proteins are involved in neurodevelopment and many have roles in synaptic function. These proteins can be schematically divided into at least eight distinct ensembles depending on their involvement in:
- Signalling pathways,
- Actin cytoskeleton dynamics,
- Response to hypoxia,
- Focal adhesion, or
- Regulation of transcription.
Cell-matrix adhesions play essential roles in important biological processes including cell motility, cell proliferation, cell differentiation, regulation of gene expression and cell survival. At the cell-extracellular matrix contact points, specialized structures are formed and termed focal adhesions, where bundles of actin filaments are anchored to transmembrane receptors of the integrin family through a multi-molecular complex of junctional plaque proteins. Some of the constituents of focal adhesions participate in the structural link between membrane receptors and the actin cytoskeleton, while others are signalling molecules, including different protein kinases and phosphatases, their substrates, and various adapter proteins.
Integrin signaling is dependent upon the non-receptor tyrosine kinase activities of the FAK and src proteins as well as the adaptor protein functions of FAK, src and Shc to initiate downstream signaling events. These signalling events culminate in reorganization of the actin cytoskeleton; a prerequisite for changes in cell shape and motility, and gene expression. Similar morphological alterations and modulation of gene expression are initiated by the binding of growth factors to their respective receptors, emphasizing the considerable crosstalk between adhesion- and growth factor-mediated signalling.
The core areas affected in autism involve rapid and coherent integration of information from multiple, higher-level association areas 14. Accordingly, the predominant genetic model supposes multigenic inheritance of common polymorphisms contributing to autism risk in multiplex families 13, leading to disruption of normal function. Such functions could be easily perturbed by minor, but relatively widespread, disruptions in a set of pathways. To mine these details, we constructed a molecular interaction map for the 60 selected genes (refined by score), taking into account their neighbouring interacting molecules, through the MiMI plug-in of Cytoscape.
In total, 248 molecules (nodes) and 892 interactions (edges) were obtained. The molecular interaction map can be treated as a mathematical graph, permitting analysis with graph-theoretical algorithms. Molecules such as genes, proteins and transcription factors are denoted as nodes in the graph, and the interactions between them are called edges. This MIP is a scale-free network whose connectivity follows a power-law distribution (Figure 1).
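One rough way to check for power-law behaviour is to tabulate the degree distribution P(k) and fit a straight line to log P(k) against log k; a clearly negative slope is consistent with a scale-free topology. The sketch below assumes an edge list of node pairs, uses a hypothetical toy graph, and relies on an ordinary least-squares fit, which is a crude estimator and not necessarily the procedure used for the map described here.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each degree k to the fraction of nodes that have that degree."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree.values())
    n = len(degree)
    return {k: c / n for k, c in counts.items()}


def loglog_slope(dist):
    """Least-squares slope of log P(k) against log k."""
    xs = [math.log(k) for k in dist]
    ys = [math.log(p) for p in dist.values()]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy graph: one hub "H" and four peripheral nodes.
toy_edges = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"), ("a", "b")]
dist = degree_distribution(toy_edges)
slope = loglog_slope(dist)  # negative when high-degree nodes are rare
```

For a real network, the fitted slope estimates the negative of the power-law exponent; a toy graph this small only demonstrates the mechanics.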
FIGURE 1: MOLECULAR INTERACTION MAP CONTAINING QUERY GENES AND THEIR INTERACTIONS
Network Analysis: We represented the molecular interaction map as an undirected graph M (N, E), consisting of a set of nodes N and a set of edges E. The size of the graph is given by the number of its nodes. The degree of a node is the number of interactions it has with other nodes. The network obtained through MiMI was analyzed to identify the wide range of pathways that directly or indirectly play a role in autism. The analysis of all the genes obtained through MiMI gave almost 57 different pathways. Some of the genes recurred in pathways such as neurodegenerative diseases and the protein kinase cascade, but the majority of molecules in the interaction map showed a high frequency of focal adhesion, signaling, melanogenesis and cell cycle pathways.
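As a small illustration of this representation, the sketch below stores an undirected graph M (N, E) as adjacency sets and derives its size and node degrees. The toy edge list is hypothetical. The last line applies the identity "mean degree = 2|E| / |N|" to the reported 248 nodes and 892 edges, under the assumption that each interaction is counted once and self-loops are absent.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Adjacency sets for an undirected graph given as (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in this sketch
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


# Hypothetical toy edge list.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = build_adjacency(toy_edges)

size = len(adj)                                        # number of nodes |N|
degree = {node: len(nbrs) for node, nbrs in adj.items()}
mean_degree = sum(degree.values()) / size              # equals 2|E| / |N|

# Mean degree implied by the reported map (248 nodes, 892 edges).
reported_mean_degree = 2 * 892 / 248                   # about 7.19
```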
Figure 2 shows the important pathways present in the molecular interaction map on the X-axis and the frequency with which these pathways occur for the respective biomolecules on the Y-axis. Figures 3 and 4 represent the two highly connected modules obtained after refinement by merging the cohesive subgroups from the molecular interaction map based on the quality of the cluster, measured as the in-weight divided by the sum of the in-weight and the out-weight. The rationale behind this measure is that a good cluster contains many heavyweight edges within the cluster itself and is connected to the rest of the network only by a few lightweight edges. The P-value indicates the validity of the cluster.
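The quality measure described above, in-weight / (in-weight + out-weight), can be written down directly. The following is a minimal sketch, not ClusterONE's own code: the function names and the toy weighted edges are hypothetical, and density is taken here as the fraction of possible member pairs that are connected, which is an assumption about how the density threshold is defined.

```python
def cluster_quality(cluster, weighted_edges):
    """In-weight divided by (in-weight + out-weight) for a set of nodes."""
    members = set(cluster)
    in_weight = 0.0
    out_weight = 0.0
    for u, v, w in weighted_edges:
        ends_inside = (u in members) + (v in members)
        if ends_inside == 2:
            in_weight += w      # edge lies entirely inside the cluster
        elif ends_inside == 1:
            out_weight += w     # edge crosses the cluster boundary
    total = in_weight + out_weight
    return in_weight / total if total > 0 else 0.0


def cluster_density(cluster, weighted_edges):
    """Fraction of possible member pairs that are joined by an edge."""
    members = set(cluster)
    n = len(members)
    if n < 2:
        return 0.0
    inside = sum(1 for u, v, _ in weighted_edges
                 if u in members and v in members)
    return inside / (n * (n - 1) / 2)


# Hypothetical toy network: a tight triangle with one weak outgoing edge.
toy_edges = [
    ("a", "b", 1.0), ("a", "c", 1.0), ("b", "c", 1.0),
    ("c", "d", 0.5), ("d", "e", 1.0),
]
quality = cluster_quality({"a", "b", "c"}, toy_edges)   # 3.0 / 3.5
density = cluster_density({"a", "b", "c"}, toy_edges)   # 3 of 3 pairs
```

The triangle scores close to 1 because almost all of the weight touching it stays inside it, which is the behaviour the rationale above describes.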
FIGURE 2: MAJOR PATHWAYS PRESENT IN 248 NODES OF THE MOLECULAR INTERACTION MAP
The NetworkAnalyzer plug-in 15 was used to calculate the topological properties of each module individually, which are tabulated in Table 1. Figure 3 and Figure 4 represent the two modules formed from the molecular interaction network using the ClusterONE plug-in of Cytoscape.
FIGURE 3: MODULE 1 (34 NODES) OBTAINED FROM MOLECULAR INTERACTION NETWORK USING CLUSTERONE PLUG-IN OF CYTOSCAPE
FIGURE 4: MODULE 2 (30 NODES) OBTAINED FROM MOLECULAR INTERACTION NETWORK USING CLUSTERONE PLUG-IN OF CYTOSCAPE
Once the two modules were selected, the topological parameters of each node in the modules were calculated to identify the relative importance of the nodes based on graph theory, treating each module as an undirected network. Tables 2 and 3 present the topological parameters of the nodes in Module 1 and Module 2.
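Two of these per-node parameters can be sketched in plain Python. The definitions used here are the common ones and are assumed, not confirmed, to match what the plug-in reports: closeness is the reciprocal of the average shortest-path length from a node to every node it can reach, and the clustering coefficient is the fraction of a node's neighbour pairs that are themselves connected. The adjacency data is a hypothetical toy module.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, node):
    """Reciprocal of the mean distance to all reachable nodes."""
    dist = shortest_path_lengths(adj, node)
    others = [d for n, d in dist.items() if n != node]
    return len(others) / sum(others) if others else 0.0


def clustering_coefficient(adj, node):
    """Connected neighbour pairs divided by all possible neighbour pairs."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy module given as adjacency sets.
toy_adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
params = {n: (len(toy_adj[n]), closeness(toy_adj, n),
              clustering_coefficient(toy_adj, n)) for n in toy_adj}
```

In this toy module, node `a` has the highest degree and closeness but a low clustering coefficient, the typical profile of a hub that links otherwise unconnected neighbours.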
TABLE 1: THE TOPOLOGICAL PROPERTIES OF EACH MODULE OBTAINED THROUGH NETWORKANALYZER PLUG-IN
|Parameter||Module 1||Module 2|
|Characteristic path length||1.748||1.795|
|Average number of neighbours||9.697||8.8|
|Number of nodes||34||30|
TABLE 2: THE TOPOLOGICAL PARAMETERS OF INDIVIDUAL NODES IN MODULE 1
|Name||Topological coefficient||Closeness||Neighbourhood connectivity||Clustering coefficient||Degree||Radiality||Stress|
TABLE 3: THE TOPOLOGICAL PARAMETERS OF INDIVIDUAL NODES IN MODULE 2
|Name||Topological coefficient||Closeness||Neighbourhood connectivity||Clustering coefficient||Degree||Radiality||Stress|
To depict the relationship between the topological parameters visually, regression graphs were generated. In Figure 5, the first graph represents the linear relationship between Closeness (X-axis) and Degree (Y-axis), and the second graph represents the linear relationship between Clustering coefficient (X-axis) and Topological coefficient (Y-axis) for Module 1.
Likewise, in Figure 6, the first graph represents the linear relationship between Closeness (X-axis) and Degree (Y-axis), and the second graph represents the linear relationship between Cluster Closeness (X-axis) and Betweenness (Y-axis) for Module 2. The R² values obtained indicate the consistency of the results.
FIGURE 5: LINEAR RELATIONSHIP BETWEEN THE TOPOLOGICAL PARAMETERS FOR MODULE1
FIGURE 6: LINEAR RELATIONSHIP BETWEEN THE TOPOLOGICAL PARAMETERS FOR MODULE2
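The kind of regression summarized in Figures 5 and 6 can be reproduced with an ordinary least-squares fit and its R² value. The sketch below is illustrative only: the closeness and degree values are hypothetical placeholders, not the per-node values of either module.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, with R squared."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical per-node values (closeness on X, degree on Y).
toy_closeness = [0.5, 0.6, 0.7, 0.8]
toy_degree = [4, 6, 9, 10]
slope, intercept, r_squared = linear_fit(toy_closeness, toy_degree)
```

An R² close to 1 means that one parameter is almost a linear function of the other across the nodes of the module.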
Further, the two modules were analysed using the cytoHubba plugin (http://hub.iis.sinica.edu.tw/cytoHubba/) to explore the key regulatory nodes in the network. The top 10 hubs (i.e. highly connected nodes) were identified using all the algorithms (Degree, EPC, MNC, DMNC, BottleNeck and DSS) and are displayed in Tables 4 and 5. The proteins identified in common by at least 5 algorithms were selected to identify the major pathways in which they are involved.
Finally, the hub proteins selected in common by at least 5 algorithms in both modules were extracted and their pathways identified (Table 6). Most of the hub proteins obtained are involved in signaling, while the rest are involved in focal adhesion, glioma and regulation of transcription.
TABLE 4: THE TOP 10 HUB PROTEINS IDENTIFIED USING ALL THE ALGORITHMS OF CYTOHUBBA PLUG-IN FOR MODULE 1
|Algorithm||Top 10 Hub Proteins|
|DSS||CBL, GAB1, GRB2, PDGFRB, PIK3R1, PLCG1, PTK2, PTPN11, PXN, SHC1|
|Degree||CBL, EGFR, GRB2, MET, PIK3R1, PLCG1, PTK2, PTPN11, SHC1, SRC|
|BottleNeck||EGFR, GRB2, MET, PIK3R1, PLCG1, PTK2, PTPN11, SRC, TLN1, VCL|
|EPC||CBL, EGFR, GRB2, MET, PDGFRB, PIK3R1, PLCG1, PTPN11, SHC1, SRC|
|MNC||CBL, EGFR, GRB2, MET, PIK3R1, PLCG1, PTK2, PTPN11, SHC1, SRC|
|DMNC||CBL, GAB1, GRB2, INPP5D, INPPL1, KDR, SH3KBP1, SHC1, TRIP6, VAV1|
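The "at least 5 algorithms" selection for Module 1 can be reproduced from the rows of Table 4 with a simple vote count. The sketch below copies the six top-10 lists from the table; the function name `consensus_hubs` is hypothetical and the counting rule (one vote per algorithm per protein) is an assumption about how the selection was made.

```python
from collections import Counter

# Top-10 lists for Module 1, copied from Table 4.
top10_module1 = {
    "DSS": "CBL, GAB1, GRB2, PDGFRB, PIK3R1, PLCG1, PTK2, PTPN11, PXN, SHC1",
    "Degree": "CBL, EGFR, GRB2, MET, PIK3R1, PLCG1, PTK2, PTPN11, SHC1, SRC",
    "BottleNeck": "EGFR, GRB2, MET, PIK3R1, PLCG1, PTK2, PTPN11, SRC, TLN1, VCL",
    "EPC": "CBL, EGFR, GRB2, MET, PDGFRB, PIK3R1, PLCG1, PTPN11, SHC1, SRC",
    "MNC": "CBL, EGFR, GRB2, MET, PIK3R1, PLCG1, PTK2, PTPN11, SHC1, SRC",
    "DMNC": "CBL, GAB1, GRB2, INPP5D, INPPL1, KDR, SH3KBP1, SHC1, TRIP6, VAV1",
}


def consensus_hubs(rankings, min_algorithms):
    """Proteins that appear in the top list of at least min_algorithms methods."""
    votes = Counter()
    for proteins in rankings.values():
        # A set ensures each algorithm contributes at most one vote per protein.
        votes.update({p.strip() for p in proteins.split(",")})
    return sorted(p for p, n in votes.items() if n >= min_algorithms)


hubs = consensus_hubs(top10_module1, min_algorithms=5)
# hubs == ['CBL', 'GRB2', 'PIK3R1', 'PLCG1', 'PTPN11', 'SHC1']
```

Only GRB2 appears in all six lists; the other five proteins each miss exactly one algorithm.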
TABLE 5: THE TOP 10 HUB PROTEINS IDENTIFIED USING ALL THE ALGORITHMS OF CYTOHUBBA PLUG-IN FOR MODULE 2
|Algorithm||Top 10 Hub Proteins|
|DSS||AR, BRCA1, CCND1, CREBBP, CTNNB1, EP300, HNF4A, POLR2A, TP53, UBE2I|
|Degree||AR, BRCA1, CCND1, CREBBP, CTNNB1, EP300, ESR1, HNF4A, NR3C1, TP53|
|BottleNeck||AR, BRCA1, CALM1, CREBBP, EP300, ESR1, HNF4A, NCOR1, NR3C1, TP53|
|EPC||AR, BRCA1, CCND1, CREBBP, CTNNB1, EP300, ESR1, HNF4A, NR3C1, TP53|
|MNC||AR, BRCA1, CCND1, CREBBP, CTNNB1, EP300, ESR1, HNF4A, NR3C1, TP53|
|DMNC||AKT1, AR, BRCA1, CCND1, CREBBP, CTNNB1, SIN3A, STAT3, TFF1, TP53|
TABLE 6: PATHWAYS FOR HUB PROTEINS IDENTIFIED BY AT LEAST 5 ALGORITHMS OF CYTOHUBBA PLUG-IN
|Hub protein||Chromosome||Pathways|
|CBL||11||ErbB signaling pathway; Jak-STAT signaling pathway|
|SHC1||1||ErbB signaling pathway; Focal adhesion|
|GRB2||17||Focal adhesion; Gap junction; Jak-STAT signaling pathway; Glioma|
|PLCG1||20||ErbB signaling pathway; Glioma|
|PTPN11||12||Jak-STAT signaling pathway|
|PIK3R1||5||ErbB signaling pathway; Regulation of actin cytoskeleton; Jak-STAT signaling pathway; Glioma; Melanogenesis; Focal adhesion|
|AR||X||cell-cell signaling; regulation of transcription|
|CREBBP||16||response to hypoxia; signal transduction; regulation of transcription|
|EP300||22||cell cycle; regulation of transcription; signal transduction|
|CTNNB1||3||Wnt receptor signaling pathway|
|HNF4A||20||blood coagulation; transcription|
FIGURE 7: PERCENTAGE OF OCCURRENCE OF PATHWAYS IN HUB PROTEINS IDENTIFIED BY AT LEAST 5 ALGORITHMS OF CYTOHUBBA PLUG-IN (HIGHLY OCCURRED PATHWAYS ARE LABELED)
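A pathway-occurrence percentage of the kind plotted in Figure 7 can be derived by counting how often each pathway appears across the hub proteins. The sketch below uses only the six Module 1 rows of Table 6 for brevity, and it normalizes by the total number of protein-pathway assignments; how the percentages in the figure were normalized is not stated, so that choice is an assumption and the numbers are not expected to match the figure.

```python
from collections import Counter

# Module 1 hub proteins and their pathways, copied from Table 6.
hub_pathways = {
    "CBL": ["ErbB signaling pathway", "Jak-STAT signaling pathway"],
    "SHC1": ["ErbB signaling pathway", "Focal adhesion"],
    "GRB2": ["Focal adhesion", "Gap junction",
             "Jak-STAT signaling pathway", "Glioma"],
    "PLCG1": ["ErbB signaling pathway", "Glioma"],
    "PTPN11": ["Jak-STAT signaling pathway"],
    "PIK3R1": ["ErbB signaling pathway", "Regulation of actin cytoskeleton",
               "Jak-STAT signaling pathway", "Glioma", "Melanogenesis",
               "Focal adhesion"],
}


def pathway_percentages(mapping):
    """Share of all protein-pathway assignments taken by each pathway."""
    counts = Counter()
    for pathways in mapping.values():
        counts.update(set(pathways))  # count a pathway once per protein
    total = sum(counts.values())
    return {p: 100.0 * c / total for p, c in counts.items()}


percentages = pathway_percentages(hub_pathways)
ranked = sorted(percentages.items(), key=lambda item: (-item[1], item[0]))
```

Under this normalization, the ErbB and Jak-STAT signaling pathways each account for 4 of the 17 assignments among the Module 1 hubs, ahead of focal adhesion and glioma with 3 each.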
CONCLUSION: In an effort to identify the important pathways involved in the mechanism of autism, the sets of highly connected biomolecules (modules) were identified from the molecular interaction map, and the topological parameters of each individual node within the modules (Module 1 and Module 2) were calculated.
Further, the best 10 nodes in each module were obtained using all the algorithms of cytoHubba, and the hub proteins occurring in at least 5 algorithms were extracted to identify the pathways in which they are involved. Our analysis revealed that the ErbB signaling pathway and the Jak-STAT signaling pathway are the major pathways in autism.
In addition, focal adhesion, glioma and regulation of transcription are also involved in the pathology of the disease.
How to cite this article:
Tejaswini YRSN, Devi AY, Ahuja YR, Araga U and Hasan Q: Identification of Significant Pathways Responsible for Autism through Molecular Network Analysis. Int J Pharm Sci Res. 3(12); 4989-4996.
REFERENCES:
- Ka-Yuet Liu, Marissa King and Peter S. Bearman. Social Influence and the Autism Epidemic. AJS. 2010 March; 115(5): 1387–1434.
- Fombonne E, Quirke S, Hagen A (2009) Prevalence and interpretation of recent trends in rates of pervasive developmental disorders. Mcgill J Med 12: 73.
- Turner T, Pihur V, Chakravarti A (2011) Quantifying and Modeling Birth Order Effects in Autism. PLoS ONE 6(10): e26418.
- Kao HT, Buka SL, Kelsey KT, Gruber DF, Porton B (2010) The Correlation between Rates of Cancer and Autism: An Exploratory Ecological Investigation. PLoS ONE 5(2): e9372.
- Zhao X, Leotta A, Kustanovich V, LaJonchere C, Geschwind DH, et al. (2007) A unified genetic theory for sporadic and inherited autism. Proceedings of the National Academy of Sciences of the United States of America 104: 12831–12836.
- Safran M, Dalah I, Alexander J, Rosen N, Iny Stein T, Shmoish M, Nativ N, Bahir I, Doniger T, Krug H, Sirota-Madi A, Olender T, Golan Y, Stelzer G, Harel A and Lancet D. GeneCards Version 3: the human gene integrator. Database 2010.
- Paul Shannon, Andrew Markiel, Owen Ozier, Nitin S. Baliga, Jonathan T. Wang, Daniel Ramage, Nada Amin, Benno Schwikowski and Trey Ideker. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003 13: 2498-2504.
- Magesh Jayapandian, Adriane Chapman, V. Glenn Tarcea, Cong Yu, Aaron Elkiss, Angela Ianni, Bin Liu, Arnab Nandi, Carlos Santos, Philip Andrews, Brian Athey, David States and H. V. Jagadish. Michigan Molecular Interactions (MiMI): putting the jigsaw puzzle together. Nucleic Acids Research, 2007, Vol. 35, Database issue.
- Szklarczyk, D. Franceschini, A. Kuhn, M. von Mering, C. et al., The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 2011 January; 39(Database issue): D561–D568.
- Jeong, H., Mason, S.P., Barabási, A.L. and Oltvai, Z.N. (2001) Lethality and centrality in protein networks. Nature, 411, 41–42.
- Yu, H., Kim, P.M., Sprecher, E., Trifonov, V. and Gerstein, M. (2007) The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS Comput. Biol., 3, e59.
- Chin, C.-S. and Manoj, P.S. (2003) Global snapshot of a protein interaction network: a percolation based approach. Bioinformatics, 19, 2413–2419.
- Abrahams BS, Geschwind DH. Advances in autism genetics: on the threshold of a new neurobiology. Nat Rev Genet. 2008 May;9(5):341-55.
- Geschwind DH, Levitt P. Autism spectrum disorders: developmental disconnection syndromes. Curr Opin Neurobiology. 2007 Feb;17(1):103-11.
- Assenov, Y., Ramirez, F., Schelhorn, S.E., Lengauer, T., Albrecht, M. Computing topological parameters of biological networks. Bioinformatics, 24(2):282-284, 2008.
03 July, 2012
29 October, 2012
29 November, 2012
01 December, 2012