Elsevier

Methods

Volume 69, Issue 3, 1 October 2014, Pages 247-256
DiffNet: Automatic differential functional summarization of dE-MAP networks

https://doi.org/10.1016/j.ymeth.2014.06.012

Highlights

  • A problem model for summarizing differential networks is described.

  • Summary finds key functional modules that respond to condition change.

  • A solution that solves the problem model is proposed (DiffNet).

  • We report the functional responses of the yeast network after MMS treatment.

Abstract

The study of genetic interaction networks that respond to changing conditions is an emerging research problem. Recently, Bandyopadhyay et al. (2010) proposed a technique to construct a differential network (dE-MAP network) from two static gene interaction networks in order to map the interaction differences between them under environment or condition change (e.g., a DNA-damaging agent). This differential network is then manually analyzed to conclude that DNA repair is differentially affected by the condition change. Unfortunately, manually constructing a differential functional summary that captures all pertinent functional responses in a dE-MAP network is time-consuming, laborious and error-prone, which impedes large-scale analysis. To this end, we propose DiffNet, a novel data-driven algorithm that leverages Gene Ontology (GO) annotations to automatically summarize a dE-MAP network to obtain a high-level map of functional responses due to condition change.

We tested DiffNet on the dynamic interaction networks following MMS treatment and demonstrated the superiority of our approach in generating differential functional summaries compared to state-of-the-art graph clustering methods. We studied the effects of parameters in DiffNet in controlling the quality of the summary. We also performed a case study that illustrates its utility.

Introduction

High-throughput mapping of genetic interaction networks of a set of genes is an important and emergent research problem [5]. The networks constructed with these methods, however, only represent a static “snapshot” of the genetic interaction map under a particular context or condition. Recent studies have shown that genetic interaction maps are in fact dynamic and context-dependent [18]. Consequently, there is a growing interest in studying the system-wide responses of interaction networks following environmental or condition change [15], [10]. For instance, one may be interested in elucidating the genetic interaction differences between cancer cells and normal cells. Specifically, some interactions may appear or disappear in the disease state, the intensity of some interactions may be alleviated or aggravated in the disease state compared with the healthy condition, and others may remain strong irrespective of the state.

One representative method that has been recently proposed for mapping the genetic interaction responses following environment change is the dE-MAP approach [2]. In this method, two static gene interaction networks [5], one for each condition, are first obtained using the epistatic miniarray profile (E-MAP) approach [17], which constructs a quantitative genetic interaction landscape of Saccharomyces cerevisiae by first identifying a set of genes of interest. Double mutant strains of all pairwise combinations of genes from this set are then grown and their colony sizes measured. A genetic interaction occurs between a pair of mutant genes when the observed colony growth rate is greater or lesser than expected from their respective single mutant strains. When the growth rate is greater than expected, the interaction is deemed positive (alleviating); when it is lesser, it is deemed negative (aggravating). Using the two static E-MAP networks, a differential network (dE-MAP network) is then computed that maps the interaction differences between the two static networks. For example, in [2], S. cerevisiae E-MAP networks are obtained for cells grown under two conditions: (a) cells which are treated with methyl methanesulfonate (MMS), a well-known DNA-damaging agent, and (b) cells which are untreated. Large-scale genetic interaction networks among 418 yeast genes are quantitatively extracted using the E-MAP method under the MMS-treated condition (stressed) and the untreated condition (unstressed), and the differential network that maps the genetic interaction changes due to MMS challenge is computed. Fig. 1 depicts an example of a differential network (partial view) that is obtained from two static E-MAP networks under the MMS-treated and untreated conditions.
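As a rough illustration of the steps described above, the following Python sketch labels an interaction by comparing observed with expected growth and derives a differential network from two static score maps. It assumes a plain score difference between conditions; the dE-MAP approach [2] may compute the difference with another statistic. All names (`classify_interaction`, `differential_network`, `pair`, the `threshold` parameter) and the toy scores are hypothetical.

```python
from typing import Dict, FrozenSet

Edge = FrozenSet[str]          # unordered gene pair
Network = Dict[Edge, float]    # gene pair -> interaction score


def classify_interaction(observed: float, expected: float) -> str:
    """Label a double mutant by comparing observed with expected growth."""
    if observed > expected:
        return "positive (alleviating)"
    if observed < expected:
        return "negative (aggravating)"
    return "no interaction"


def differential_network(treated: Network, untreated: Network,
                         threshold: float = 0.0) -> Network:
    """Return treated-minus-untreated scores for every gene pair.

    A pair missing from one network is given score 0.0 there (an
    assumption of this sketch). Differences whose magnitude does not
    exceed `threshold` are dropped.
    """
    diff: Network = {}
    for edge in treated.keys() | untreated.keys():
        delta = treated.get(edge, 0.0) - untreated.get(edge, 0.0)
        if abs(delta) > threshold:
            diff[edge] = delta
    return diff


def pair(a: str, b: str) -> Edge:
    """Build an unordered gene pair."""
    return frozenset((a, b))


# Toy scores, invented for illustration only.
untreated = {pair("geneA", "geneB"): 1.0, pair("geneB", "geneC"): -2.0}
treated = {pair("geneA", "geneB"): 1.0, pair("geneB", "geneC"): 0.5,
           pair("geneA", "geneC"): -3.0}

diff = differential_network(treated, untreated)
```

In this toy example the geneA–geneB interaction is unchanged and so does not appear in `diff`, while the other two pairs carry a positive and a negative differential weight respectively.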

Naturally, it is important to analyze this differential network to investigate the system-wide impact of the DNA-damaging agent on the functional roles of various components. Consequently, the authors obtained physical protein–protein interactions corresponding to these genes and performed graph clustering to find protein complexes enriched with differential interactions. The functional identity of each cluster is then manually determined. Particularly, the authors concluded that these complexes tend to be stable across conditions and differential interactions largely lie between complexes, rather than within complexes. Unfortunately, modules constructed in this manner poorly represent the functional responses of the differential network. Hence, to find a functional response, the authors manually selected a subset of 31 genes associated with DNA repair to test for differential interaction enrichment, concluding that DNA repair is a pertinent functional response following MMS treatment. However, it is time-consuming, laborious and error-prone to perform large-scale analysis of dE-MAP interactomes to map all pertinent functional responses. In this paper, we propose a novel technique called DiffNet that addresses this impediment by automatically constructing a high quality differential summary of two E-MAP networks under environmental change. Fig. 2 highlights some of these functional modules that are differentially affected by the DNA-damaging agent.

At first glance, the aforementioned failure of traditional graph clustering techniques to capture differential summaries in their modules may seem surprising. However, as we shall see in Section 4, these techniques are largely designed for static networks and are less suitable for differential networks that contain both positive and negative weights. Furthermore, since most methods rely solely on the topology of the network, there is also no guarantee that each cluster corresponds well to a representative biological function response. Indeed, as remarked earlier, in [2] the functional identity of each cluster following graph clustering is manually determined, and the authors failed to assign a function to a significant number of these clusters.

In fact, algorithms that perform genome-wide functional analysis of gene responses under multiple conditions have been proposed in the literature [19], [20], [9]. Particularly, these approaches perform functional analysis based on the expression levels of genes. In contrast, in our problem we focus on genome-wide functional analysis of the gene interactions and their responses.

Given the differential network generated from dE-MAP interactions, DiffNet greedily constructs a differential summary comprising a set of skewed and coherent functional subgraphs, representing significant functional responses following environment or condition change. Specifically, it leverages Gene Ontology (GO) annotations to identify these functional subgraphs, each of which represents a group of interactions corresponding to a specific biological function. A key characteristic of these functional subgraphs is that the interactions together respond significantly in one direction, either positively or negatively, to the condition change. That is, unlike standard graph clustering methods, DiffNet is specifically designed to handle differential interactions, which can be positively or negatively weighted. Fig. 3 illustrates the idea of the DiffNet algorithm. We shall elaborate on it in the next section.

Section snippets

Summary of proposed method

DiffNet is a novel data-driven algorithm that automatically summarizes a dE-MAP network to obtain a high-level map of functional responses due to condition change.

  • Input: A dE-MAP network.

  • Output: A high-level summary of functional responses (both positive and negative responses) due to condition change.

  • Tools used in the proposed method: Scala.

  • Databases, if any, used in the proposed method: Gene Ontology Annotations dataset (GOA).

Constructing differential networks

The set of genes of interest together with their genetic interactions can be modeled as a gene–gene interaction network, denoted by G=(V,E,w), where V is a set of genes selected for E-MAP study, E denotes the pairwise interactions between genes, and w is a function that assigns each pairwise interaction e ∈ E a weight that represents its interaction strength. In E-MAP studies, w(e) of e ∈ E is given by its genetic interaction score, the S-score [17]. A positive S-score indicates the degree of

Results

The DiffNet algorithm is implemented in Scala. We now present experimental results of the performance of DiffNet. The experiments were conducted on a 1.66 GHz Intel Core 2 Duo T5450 machine with 3 GB memory. Unless specified otherwise, we set k=45 and α=5.0.

Conclusions

We propose DiffNet, a novel data-driven algorithm that automatically constructs summaries of differential functional responses of gene interaction networks under environment or condition change. Specifically, it leverages a combination of GO annotation information and the underlying interaction data to greedily identify a set of functional subgraphs that are highly skewed and coherent, representing significant functional responses due to condition change. Our empirical study with a real-world network

References (21)

  • B.E. Kremer et al., Cell (2007)
  • M. Schuldiner et al., Methods (2006)
  • G.D. Bader et al., BMC Bioinformatics (2003)
  • S. Bandyopadhyay et al., Science (2010)
  • E.I. Boyle et al., Bioinformatics (2004)
  • V. Chvatal, Math. Oper. Res. (1979)
  • S.R. Collins et al., Genome Biol. (2006)
  • A.J. Enright et al., Nucleic Acids Res. (2002)
  • P. Fabrizio et al., Science (2001)
  • B.J. Frey et al., Science (2007)
There are more references available in the full text version of this article.
