Inferring synthetic lethal interactions from mutual exclusivity of genetic events in cancer

Background Synthetic lethality (SL) refers to the genetic interaction between two or more genes where only their co-alteration (e.g. by mutations, amplifications or deletions) results in cell death. In recent years, SL has emerged as an attractive therapeutic strategy against cancer: by targeting the SL partners of altered genes in cancer cells, these cells can be selectively killed while sparing the normal cells. Consequently, a number of studies have attempted prediction of SL interactions in human, a majority by extrapolating SL interactions inferred through large-scale screens in model organisms. However, these predicted SL interactions either do not hold in human cells or do not include genes that are (frequently) altered in human cancers, and are therefore not attractive in the context of cancer therapy. Results Here, we develop a computational approach to infer SL interactions directly from frequently altered genes in human cancers. It is based on the observation that pairs of genes that are altered in a (significantly) mutually exclusive manner in cancers are likely to constitute lethal combinations. Using genomic copy-number and gene-expression data from four cancers, breast, prostate, ovarian and uterine (total 3980 samples) from The Cancer Genome Atlas, we identify 718 genes that are frequently amplified or upregulated, and are likely to be synthetic lethal with six key DNA-damage response (DDR) genes in these cancers. By comparing with published data on gene essentiality (~16000 genes) from ten DDR-deficient cancer cell lines, we show that our identified genes are enriched among the top quartile of essential genes in these cell lines, implying that our inferred genes are highly likely to be (synthetic) lethal upon knockdown in these cell lines. Among the inferred targets are tousled-like kinase 2 (TLK2) and the deubiquitinating enzyme ubiquitin-specific-processing protease 7 (USP7) whose overexpression correlates with poor survival in cancers. 
Conclusion Mutual exclusivity between frequently occurring genetic events identifies synthetic lethal combinations in cancers. These identified genes are essential in cell lines, and are potential candidates for targeted cancer therapy. Availability: http://bioinformatics.org.au/tools-data/underMutExSL Reviewers This article was reviewed by Dr Michael Galperin, Dr Sebastian Maurer-Stroh and Professor Sanghyuk Lee. Electronic supplementary material The online version of this article (doi:10.1186/s13062-015-0086-1) contains supplementary material, which is available to authorized users.


Background
Cells have evolved to ensure their viability. Although typically associated with survival (i.e. the ability to maintain homeostasis but not necessarily cell division), cell viability can be defined more broadly to encompass the ability to grow and proliferate. Processes within the cell ensure that it is sufficiently protected against deleterious genetic events (e.g. mutations, amplifications and deletions) that impact cell viability, but when these events are unavoidable the cell commits to apoptosis or programmed cell death.
Genetic events can modify this control on cell viability, resulting in viability being enhanced (e.g. in cancer) or compromised (e.g. during cell senescence and death). This is effected by the (over-)activation or inactivation of genes responsible for cell viability through gain-of-function or loss-of-function genetic events, respectively.
When two or more of these genetic events occur simultaneously, they can considerably impact the viability of cells. Synthetic lethality (SL), first defined by Bridges in 1922 [1], refers to one such combination between two genetic events (typically affecting two different genes) in which their co-occurrence results in severe loss of viability or death of the cell, although the cell remains viable when only one of the events occurs [2,3].
SL has gained considerable attention over the last few years due to its value in understanding the essentiality of genes or their combinations [4,5], and more recently due to its promise as a therapeutic strategy for selective targeting of cancer cells [6,7]. Cancer cells are genetically different from normal cells and harbour genetic events in specific genes that enhance their viability. Therefore, by identifying and targeting (i.e. inducing a genetic event in) the synthetic-lethal partner of these genes, selective killing of cancer cells can be achieved while sparing the normal cells. SL-based therapies exploit these genetic differences in a way that is often not possible with conventional chemotherapy, which is frequently cytotoxic to normal as well as cancer cells [8].
A pioneering breakthrough in SL-based cancer therapy showed that inhibition of poly(ADP-ribose) polymerase (PARP) in cancer cells that harbour loss-of-function events in the breast-cancer susceptibility genes BRCA1 and BRCA2 is dramatically lethal to these cells [9,10] (reviewed in [11]). Germline losses in BRCA1/BRCA2 are highly penetrant, conferring 60-80% risk of breast and 30-40% risk of ovarian cancers. These losses account for about 10-25% of hereditary breast and ovarian cancers [11,12].
Following the promise of BRCA-PARP, several studies have explored (computational) identification of SL interactions that could be efficacious in treating cancer. This began with seminal [13][14][15] and follow-up works [16][17][18] that studied "cross-talk" between pathways in model organisms including yeast, worm and fruit fly to characterise genetic interactions.
From these studies emerged a between-pathway model [13,14] according to which loss of function in only one pathway does not greatly affect cell viability, but the further inactivation of a second parallel or compensatory pathway results in cell death. This model characterised synthetic lethal interactions as genetic interactions between these compensatory pathways.
More-recent studies [19][20][21] have attempted extrapolation of SL interactions from model organisms (e.g. yeast, http://drygin.ccbr.utoronto.ca/ [22]) using protein-sequence homology to infer interactions in human cells, e.g. BRCA2-RAD52 [23], SMARCB1-PSMA4, ASPSCR1-PSMC2 [19] and between FEN1 and SMC3, RNF20, BLM, MRE11A, STAG3, CDC4 and CHTF18 [20,21]. Classification-based approaches [24][25][26][27] that employ a support vector machine trained with features from model organisms have also been used to predict new SL interactions in human, with the expectation that SL interactions follow similar organisational principles in human and model organisms. Recently Zhang et al. [28] proposed that single- and double-knockdown of proteins within known pathways could be computationally simulated to estimate interactions that are lethal; AKT with BID, CASP9 and WEE1 were among the top SL interactions identified in human. Others [29][30][31][32][33] have employed combined experimental and computational approaches by performing knockdown of combinatorial pairs of genes using large-scale siRNA screens across cell lines and in vivo models (e.g. http://www.genomernai.org/ [30]). These approaches have been more successful than the solely computational ones, resulting in identification of actionable SL-based targets, including GATA2 and CDC6 as SL partners of KRAS [29]. However, these approaches are considerably more expensive, and many of the essential genes so identified turn out to be either restricted to only these cell-line models or infrequently overexpressed in cancers.
Despite these attempts, interactions extrapolated from lower-order model organisms fail to hold up in human cells and are less appealing in the context of cancer therapy. This is because the model systems, despite sharing some homologous proteins with human, have considerably different and simpler cellular and functional organisation [34,35]. While core cellular processes including cell-cycle and DNA-damage repair are broadly conserved, human cells express novel proteins, isoforms and/or paralogs with partially overlapping functions that buffer the loss of one another [34][35][36][37], with the consequence that lethality inducible by targeting only one of them is not conserved. For example, human cells have three AKTs (AKT1, AKT2 and AKT3) with partially overlapping functions [38] whereas yeast has only one AKT. Similarly, SL interactions predicted from 'static' pathway maps do not reflect the actual scenario in cancer cells; these cells undergo significant pathway rewiring to enhance their viability [39,40]. Finally, these SL interactions do not include cancer genes or genes that are frequently altered in cancers. In particular, from the examples above, ASPSCR1, BLM, SMARCB1 and MRE11A put together are altered (homozygous deletion) in < 10% of most cancers as per The Cancer Genome Atlas (TCGA) cBioPortal cohort [41,42], and therefore the proportion of cancers benefiting from targeting their SL partners is very small.
Here, we develop a computational approach that takes the above factors into account by directly inferring SL interactions from frequently altered genes in cancers. We show that specific combinations of genes that display mutual exclusivity for genetic events are likely to constitute lethal combinations, and therefore that targeting these genes in conjunction could kill cancer cells. To demonstrate this, we consider six key DNA-damage response (DDR) genes that are frequently altered across four cancers (breast, prostate, ovarian and uterine) and, using genomic copy-number and gene-expression data from TCGA [41,42], we identify genes that are altered in a (significantly) mutually exclusive manner with these six DDR genes. By comparing with data from genome-wide (~16000 genes) essentiality screens across ten DDR-deficient cancer cell lines [31,32], we show that our identified genes are enriched among the top quartile of essential genes in these cell lines, implying that our inferred genes are likely to be (synthetic) lethal upon knockdown in these cell lines.

Methods
Suppose that in a given large set of viable (cancer) cells, a pair of genes exhibits mutual exclusivity with respect to a genetic event, i.e. each gene individually is affected by the genetic event in a large proportion of cells but both genes are simultaneously affected in few or none. We hypothesize that the observed viability of these cells is dependent on, or a consequence of, the mutual exclusivity between the two genes: cells are not viable if the genetic event were to affect both genes simultaneously, and therefore we observe that few if any viable cells in our population carry such an event in both genes (in other words, the mutually exclusive combinations constitute the (clonally) selected combinations amenable to cell survival). Consequently, we infer that the two genes are synthetic lethal with each other.

Mutual exclusivity between genetic events and inferring SL combinations
Suppose that we are given an arbitrarily large set of viable (cancer) cells S. Let A and B be a pair of genes affected by a genetic event E in these cells S.

Computing significant mutually exclusive gene combinations
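Let X be the random variable that counts the number of cells that show co-occurrence of a genetic event for the pair of genes (A, B). Let S_A and S_B denote the subsets of S in which the event affects A and B, respectively, and let S_AB = S_A ∩ S_B denote the subset in which it affects both. We estimate the statistical significance of the mutual exclusivity between A and B based on the probability of observing at most |S_AB| cells with co-occurrence of the event among |S_B| cells drawn from S, of which |S_A| cells carry the event in A. We estimate this probability as

$$P[X \le |S_{AB}|] = 1 - P[X > |S_{AB}|], \qquad (1)$$

where P[X > |S_AB|] is computed using the hypergeometric probability mass function for X = k:

$$P[X = k] = \frac{\binom{|S_A|}{k}\binom{|S| - |S_A|}{|S_B| - k}}{\binom{|S|}{|S_B|}}, \qquad P[X > |S_{AB}|] = \sum_{k = |S_{AB}| + 1}^{\min(|S_A|, |S_B|)} P[X = k].$$

This "1 - hypergeometric test" p-value (Equation 1) is used to infer SL pairs (at p < 0.05), and the inferred pairs are ranked in order of their p-values.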
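The test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `mutual_exclusivity_pvalue` and the set-based inputs are hypothetical, and the lower tail P[X ≤ |S_AB|] is summed directly, which is mathematically equal to 1 - P[X > |S_AB|] but avoids floating-point cancellation when the p-value is very small.

```python
from math import comb


def mutual_exclusivity_pvalue(cells, altered_a, altered_b):
    """Return P[X <= |S_AB|] under a hypergeometric null (hypothetical helper).

    cells     -- set of all sample identifiers (S)
    altered_a -- samples in which the event affects gene A (S_A)
    altered_b -- samples in which the event affects gene B (S_B)
    """
    n_total = len(cells)
    s_a = altered_a & cells
    s_b = altered_b & cells
    n_a, n_b = len(s_a), len(s_b)
    n_ab = len(s_a & s_b)  # observed co-occurrence |S_AB|

    # Smallest overlap that the marginal counts allow.
    k_min = max(0, n_a + n_b - n_total)

    # Sum the hypergeometric probability mass over k = k_min .. |S_AB|.
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(k_min, n_ab + 1)
    )
    return tail / comb(n_total, n_b)


# Toy example: 10 cells, A altered in five, B altered in the other five.
toy_cells = set(range(10))
p_toy = mutual_exclusivity_pvalue(toy_cells, set(range(5)), set(range(5, 10)))
print(p_toy)  # 1/252, about 0.004
```

A small p-value here means that the observed overlap is lower than expected by chance, which is the direction of interest for mutual exclusivity.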

Datasets
We gathered genomic copy-number and gene-expression datasets from four sporadic cancers, breast [43], prostate [44], ovarian [45] and uterine [46], from TCGA via cBioPortal (http://www.cbioportal.org/index.do) [41,42] and TCGA Firehose (http://gdac.broadinstitute.org/), comprising a total of 3980 samples (Table 1). We consider four distinct genetic events: two kinds of genomic events (viz. gene copy-number amplifications and deletions), and two kinds of expression-level events (viz. gene up- and downregulation). We expect that the changes in expression levels should encompass the effects of other kinds of events not directly considered here, e.g. mutations, chromatin changes and methylation.
These copy-number and expression events are inferred from GISTIC-normalized [47] values available via cBioPortal [41,42] and TCGA Firehose. The copy-number value for each gene reflects the deviation in its number of copies from normal and is normalized to a range of [-2, 2] where negative values represent deletions and positive values represent amplifications. We consider only high-level amplifications and deletions (typically homozygous deletions) having copy-number values +2 and -2, respectively. Likewise, the expression for each gene is z-score normalized, and here we consider genes that are highly upregulated or downregulated, given by z-scores at least two standard deviations on either side of the mean (as per [41,42]). For more details on how these GISTIC-normalized values are computed, refer to [47].
To validate our predictions (genes B) we employed genome-wide (~16000 genes) essentiality data from siRNA-mediated knockdown screens across ten cancer cell lines that harbour a deficiency (mutation, deletion or downregulation) in at least one of the genes A (Table 2) [31,32]. These essentiality data are in the form of GARP (Gene Activity Rank Profile) scores for each gene and are approximately in the range [-10, +5], with a lower value in a cell line indicating higher essentiality for the gene in that cell line.
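The event definitions above amount to simple thresholds on the two normalized values. The following sketch shows one way to express them; the function name `call_events`, its parameter names and the event labels are assumptions made for illustration and do not come from cBioPortal or Firehose.

```python
def call_events(copy_number, expression_z, cn_cutoff=2, z_cutoff=2.0):
    """Label the genetic events for one gene in one sample (hypothetical helper).

    copy_number  -- GISTIC-style discrete copy-number value in [-2, 2]
    expression_z -- z-score-normalized expression value
    """
    events = set()

    # Only high-level copy-number changes (+2 / -2) are counted.
    if copy_number >= cn_cutoff:
        events.add("amplification")
    elif copy_number <= -cn_cutoff:
        events.add("deletion")

    # Expression must lie at least two standard deviations from the mean.
    if expression_z >= z_cutoff:
        events.add("upregulation")
    elif expression_z <= -z_cutoff:
        events.add("downregulation")

    return events


print(call_events(2, 2.5))   # amplification and upregulation
print(call_events(-1, 0.4))  # no event: a low-level loss is ignored
```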

Identifying mutually exclusive combinations involving frequently altered genes in cancers: a case study using six DNA-damage response genes
Here we consider mutual exclusivity with deletion and downregulation events affecting the following six genes (as genes A): ATM, BRCA1, BRCA2, CDH1, PTEN and TP53. These are tumour-suppressor genes that are central to or regulate DNA-damage response (DDR) functions, that is, genes that play important roles in maintaining the genomic integrity of the cell and control cell proliferation [11]. These genes are deleted or downregulated across all the four cancers considered here, and their loss is a significant driver event in these cancers (TCGA, 2011; TCGA, 2012; TCGA, 2013; TCGA, 2014). To predict SL interactions, we identify genes B that are either amplified/upregulated or deleted/downregulated in a mutually exclusive manner to these genes A. Specifically, we identify two kinds of mutually exclusive combinations (SL interactions): (i) deletion/downregulation of gene A with amplification/upregulation of gene B; and (ii) deletion/downregulation of gene A with deletion/downregulation of gene B.

Gene A deletion or downregulation with gene B amplification or upregulation
We identified a total of 842 SL interactions involving 718 genes B at p < 0.01. Figure 1a shows the distribution of these interactions with respect to the different genes A. BRCA2 dominates the number of SL interactions followed by CDH1, PTEN and TP53, whereas ATM and BRCA1 participate in very few SL interactions. While these proportions are to an extent influenced by the actual fraction of cases in which these genes A are deleted/downregulated (PTEN (34%), CDH1 (12.9%), BRCA2 (9.8%) and TP53 (9.7%) are altered in many more cases than ATM (5.7%) and BRCA1 (4.8%) across all cancers), this still indicates that overall (i.e. across the four cancers) these six genes are involved to different extents in their synthetic lethality with amplification/upregulation of genes B. Moreover, there were few (<5%) overlaps between genes B partnered with different genes A, indicating considerable diversity in the SL landscape (Additional file 1). However, different genes A dominate the SL interactions within the individual cancers, e.g. CDH1 (99.5%) for breast, PTEN (78.4%) and BRCA2 (16.8%) for prostate, and BRCA1 (17.9%) and TP53 (16.2%) for ovarian cancers at p < 0.01, as shown in Figure 1b. These results indicate that SL interactions could be highly context-dependent (here, the type of cancer) with deletion/downregulation in different genes A dominating SL interactions within different cancers. Note that these analyses are based on uncorrected p-values (Equation 1) and are meant only to give a sense of the distribution of SL interactions; for the validation of genes B (below) we use the relative rankings of their p-values.

Gene A deletion or downregulation with gene B deletion or downregulation
The number of SL interactions in which both genes A and B are deleted/downregulated was considerably smaller than in the previous case: a total of 143 interactions involving 117 genes B were identified across all the four cancers at p < 0.05. As above, BRCA2, PTEN and CDH1 dominate these SL interactions (Figure 2a) when all four cancers are taken together, whereas different genes dominate within the individual cancers, e.g. CDH1 (91%) and PTEN (9%) for breast, PTEN (88%) and BRCA2 (9%) for prostate, and CDH1 (58%) and BRCA1 (38%) for ovarian cancers.

Computational validation using data from cell-line essentiality screens
We expect that targeting genes B in conjunction with genes A could induce lethality. To validate this, we analysed the GARP essentiality scores of genes B in cell lines deficient in genes A. We chose ten cell lines (Table 2) that harbour a deficiency in at least one of the genes A [31,32]. The left-hand side plots of Figure 3 compare the ranges of GARP essentialities of our predicted genes B with that of the entire set of ~16000 profiled genes in these cell lines. While it is difficult to directly compare the two ranges because of the difference in the number of genes in them, for the majority of cell lines the genes B at the 25th percentile had lower GARP scores than the corresponding genes from the entire profiled set. In particular, our predicted genes B were enriched significantly (χ² test, p < 10⁻⁵) within the top quartile of essential genes (approximately the top 5000) from these ~16000 genes.
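An enrichment test of this kind can be illustrated with a Pearson χ² test on a 2×2 table (predicted vs. not predicted, top quartile vs. not). The text does not give the contingency layout or the counts, so the table shape, the function name `chi_square_2x2` and the numbers in the usage example are all assumptions; the sketch uses no continuity correction and relies on the fact that, for one degree of freedom, the upper-tail probability equals erfc(sqrt(χ²/2)).

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for a 2x2 table (hypothetical helper).

                      top quartile   not top quartile
    predicted genes        a                b
    other genes            c                d

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Purely hypothetical counts, for illustration only.
stat, p = chi_square_2x2(30, 10, 20, 40)
print(round(stat, 3), p)
```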
We ranked our predicted genes B in increasing order of their mutual-exclusivity significance to generate a mutual-exclusivity (ME) ranking. Then, for each gene B that was among the top 5000 we assigned a gene-essentiality (GE) rank as '5000 - rank of gene B in the essentiality screen' (a reverse ranking). We then plot GE rank vs ME rank for all genes B for each cell line according to the gene-A deficiency it harbours. For example, since the cell line HCC1143 harbours a deficiency in TP53 (Table 2), we plot GE rank vs ME rank for all genes B that are predicted as mutually exclusive with TP53 using the GARP score data for HCC1143. Doing so using the amplified/upregulated genes B resulted in the right-hand side plots shown in Figure 3. For all cell lines harbouring deficiencies in ATM, BRCA1, BRCA2, PTEN and TP53 we see a downward trend, indicating a strong agreement between the rankings based on mutual exclusivity and the GARP essentialities of our predicted genes B in gene A-deficient cell lines (no data are available for cell lines with CDH1 deficiency). This analysis indicates that the genes B that are mutually exclusive with the loss of A in tumours are also essential in gene A-deficient cell lines, and therefore supports our hypothesis that the observed mutual exclusivity is very likely a mechanism to avoid cell lethality (thereby enhancing the essentiality of B). As these genes B are also (frequently) amplified/upregulated in the cancers, they could be attractive as targets in cancer therapy.
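The two rankings can be sketched as follows. This is an illustration under assumptions rather than the procedure used in the study: it takes ME rank 1 to be the smallest p-value and essentiality rank 1 to be the lowest GARP score, and the function names `me_ranks` and `ge_ranks` and the toy gene labels are hypothetical.

```python
def me_ranks(pvalues):
    """Map each gene B to its ME rank (1 = smallest p-value; assumed direction)."""
    ordered = sorted(pvalues, key=pvalues.get)
    return {gene: i for i, gene in enumerate(ordered, start=1)}


def ge_ranks(garp_scores, predicted, top_n=5000):
    """Map predicted genes to the reverse rank 'top_n - rank in the screen'.

    garp_scores -- GARP score of every profiled gene in one cell line
    predicted   -- genes B predicted for the gene A deficient in that line
    Only genes whose screen rank is within the top_n are kept.
    """
    # Lower GARP score means higher essentiality, so sort ascending.
    ordered = sorted(garp_scores, key=garp_scores.get)
    screen_rank = {gene: i for i, gene in enumerate(ordered, start=1)}
    return {
        gene: top_n - screen_rank[gene]
        for gene in predicted
        if gene in screen_rank and screen_rank[gene] <= top_n
    }


# Toy data with made-up gene labels and scores.
garp = {"g1": -8.0, "g2": -3.0, "g3": 1.0, "g4": 2.0}
pvals = {"g1": 0.001, "g3": 0.04, "g4": 0.045}
print(me_ranks(pvals))
print(ge_ranks(garp, pvals, top_n=3))
```

Under these conventions a gene that ranks high on both criteria has a small ME rank and a large GE rank, so agreement between the two rankings shows up as a downward trend when GE rank is plotted against ME rank.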
Figure 4 shows similar plots using the deleted/downregulated genes B; however, since these genes are far fewer the plots show data for fewer gene ranks, the most being for PTEN.
To understand whether the lethality observed for genes B is specific to A-deficient cell lines, we analysed the differential essentiality of B in the cell lines relative to the MCF7 cell line (due to lack of suitable data on normal cell lines, we chose MCF7, a typical luminal line with no known DDR defect, as our control for the comparison). We observed a significant difference between the mean essentialities for B between the DDR-deficient and MCF7 cell lines (Figure 5a). Similar results were observed using data [32] from two HCT116-derived isogenic cell lines, one PTEN-/- and the other with wild-type PTEN (Figure 5b). This analysis indicated that the essentiality of B was highly specific to cell lines harbouring gene A deficiency, and hence B is synthetic lethal in the context of deficiencies in DDR genes.

Case studies of identified SL partners B
We expect amplification/upregulation of our predicted genes B to confer poor survival, and to validate this we plot the Kaplan-Meier curves for these genes using survival data from cancer patients (KMPlotter: http://www.kmplot.com/) [48]. Figure 6 shows examples for breast cancer (plots for ovarian cancer in Additional file 1).
Several interesting genes are identified here, e.g. TLK2, which encodes the serine/threonine tousled-like kinase and is closely associated with the repair of DNA double-strand breaks (DSBs) and with the regulation of chromatin assembly during S-phase [49]. TLK2 is amplified/upregulated in 26% of sporadic breast cancer cases, and its amplification/upregulation is mutually exclusive to BRCA2 deletion/downregulation. TLK2 overexpression correlates with significantly poor survival (p = 0.00072) in these patients. In particular, GOBO-based analysis [50] indicates that TLK2 is overexpressed in 37% of luminal (estrogen receptor (ER)-positive) breast tumours, and grade 3-stratified multivariate analysis indicates a hazard ratio of 2.25 (p = 10⁻⁵) and poor survival (p < 0.01) in patients with these tumours (Additional file 1). Interestingly, TLK2 overexpression can co-occur with PTEN loss or when PIK3CA, a key driver of ER-positive/luminal tumours, is not overexpressed. This also agrees with the high expression of TLK2 in the luminal cell lines MCF7, MDA-MB-361 and SUM52PE, which do not show high expression of PIK3CA (Additional file 1). Therefore, it is possible that TLK2 acts as a context-dependent driver of ER-positive/luminal tumours in the absence of PIK3CA expression.
The ubiquitin-specific peptidase USP7, which is a deubiquitinating enzyme, is amplified/overexpressed in ~40% of breast tumours in TCGA. USP7 is known to deubiquitinate target proteins including TP53 and PTEN. Overexpression of USP7 correlates with poor survival specifically in TP53-mutant patients (Additional file 1).
EXOSC4, which encodes the EXOSC4 subunit of the RNA exosome complex that is important for RNA processing and degradation, is upregulated in 24% of breast tumours, and confers poor survival (p = 0.012).
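For readers unfamiliar with how such curves are built, the Kaplan-Meier (product-limit) estimate can be sketched in a few lines. This is a generic textbook estimator, not KMPlotter's code; the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  -- follow-up time of each patient
    events -- 1 if the event (e.g. death) was observed at that time,
              0 if the patient was censored
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(records) and records[i][0] == t:
            deaths += records[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy cohort: events at t = 1, 3, 4 and one patient censored at t = 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
# [(1, 0.75), (3, 0.375), (4, 0.0)]
```

To compare an overexpressing group with the remaining patients, the estimator would be run once per group and the two curves plotted together.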

Discussion
Our results demonstrate that there exist pairs (A, B) of genes that are altered in a mutually exclusive manner across tumours. We hypothesize that this observed mutual exclusivity could be a mechanism to avoid cell death, and consequently these pairs (A, B) constitute synthetic lethal combinations. To test our hypothesis, we use the essentialities (GARP scores) measured for genes B across cell lines deficient in genes A (here, A includes six key DDR genes). We demonstrate that when our predicted genes B are ranked in order of their mutual exclusivity with A (as p-values), their ranks are consistent with those of their GARP scores in these A-deficient cell lines: the top-ranked genes B are also highly essential (lethal) to these cell lines. This strongly suggests that mutual exclusivity is an important mechanism to avoid lethality.
Our approach is novel because we infer SL interactions directly from tumour data and validate these using essentiality data from tumour cell lines, and is different from earlier approaches [24][25][26][27][28]. Therefore, we effectively bypass several of the limitations of these approaches, viz. inference of infrequently altered genes as SL partners, inference of genes solely from cell-line models that may not hold in tumours, and inference of SL interactions from model organisms that do not hold in human [35].
Beginning from the between-pathway model [13,14], synthetic lethality (SL) has often been associated with compensatory or parallel pathways, such that the loss of function of one of the pathways does not significantly affect cell viability whereas the loss of both pathways results in cell death (Figure 7a). Although this classical view gives an elegant explanation for SL, it presents only a partial one, mainly in terms of loss-of-function (inactivation) events. However, in general SL could also involve gain-of-function (activation) events. Moreover, in the context of cancer, this model caters mainly to tumour-suppressor genes. For example, the loss of function in two parallel DNA-damage repair (tumour suppressor) pathways can lead to a considerable accumulation of DNA damage, resulting in genomic catastrophe and triggering apoptosis in cancer cells, as in the case of BRCA-PARP [11]. However, tumour-suppressor genes in general can be difficult to target because of the (unknown) side-effects these could have on normal cells [51,52], and because these are infrequently (over-)expressed in cancers to enable their targeting. Interestingly, many of the SL interactions that are extrapolated from lower-order organisms turn out to involve tumour-suppressor genes (e.g. SMARCB1, see Background) and these are rarely altered in human cancers. We suspect that many of these highly conserved genes are much less susceptible to alterations than are newer inventions in humans, and consequently do not form attractive therapeutic targets in human cancers.
Here, we extend these pathway models [13,14] to include gain-of-function (activation) events (Figure 7). In addition to the parallel-pathway model (Figure 7a) we propose a negative feedback-loop model (Figure 7b) wherein the forward path involves a gain-of-function event (often in an oncogene) whereas the negative-feedback loop involves a loss-of-function event (often in a tumour-suppressor gene). Cell viability is enhanced by activation events in the forward path or reciprocally by inactivation events in the negative-feedback loop. We hypothesize that, in the event of loss of a DDR gene in the negative-feedback loop, the simultaneous activation of an oncogene in the forward path could be detrimental to the cell's survival by generating genomic instability. Consequently, to maintain an optimal condition for survival, cancer cells harbour only one of the two events, resulting in mutual exclusivity between these events. We suspect the PIK3CA-PTEN combination is one such case: the PI3K pathway either harbours frequent activation events in the oncogenic PIK3CA kinase (96/156 breast tumours), resulting in accelerated cell growth and proliferation, or reciprocally frequent inactivation events in the tumour suppressor PTEN (67/156), resulting in loss of negative feedback to control cell proliferation; however, we rarely see breast tumours harbouring both these events (8/156; p-value ≈ 0), possibly due to their detrimental effect on cancer cell survivability.
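As a rough check on how strongly these counts depart from chance, they can be compared with what independence between the two events would predict. The calculation below uses only the three counts quoted for PIK3CA and PTEN and the hypergeometric setting of Equation 1; the symbols E[X] and x_min are introduced here for illustration.

$$E[X] = \frac{|S_A|\,|S_B|}{|S|} = \frac{96 \times 67}{156} \approx 41.2, \qquad x_{\min} = \max(0,\; 96 + 67 - 156) = 7.$$

The observed 8 co-altered tumours are therefore about a fifth of the count expected under independence, and only one above the smallest overlap that these marginal counts allow, which is consistent with the reported p-value of approximately 0.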
Likewise, the KRAS-NF1 combination also fits into this pathway model. Simultaneous activation of the KRAS oncogene together with inactivation of the NF1 tumour suppressor could be lethal to the cell, and hence these two events rarely co-occur (p-value = 0.001).
Another example of SL is between BRCA1 and CCNE1, which, although not components of the same physical pathway, are functionally related due to their roles in the cell cycle and therefore broadly fit into our proposed model. BRCA1, being a tumour suppressor and a regulator of DNA-damage repair, has a reciprocal role to CCNE1, whose overactivation accelerates cell divisions and confers replication stress and genomic instability. CCNE1 amplification/overexpression is mutually exclusive to BRCA1 deletion/underexpression in ovarian cancers (p-value = 0.073). Consequently, loss of BRCA1 is synthetic lethal to cells harbouring CCNE1 amplifications, and this has recently been validated using inhibition of BRCA1-mediated DNA repair in ovarian cancer cell lines [53].
While the induction of cell lethality for certain combinations of genetic events seems a compelling reason for the observed lack of tumour samples containing these combinations, an alternative explanation could be that cells use mutual exclusivity as a means to achieve multiplicity in phenotypes. For example, it is possible that PIK3CA activation and PTEN inactivation are two (disjoint) paths to achieve two distinct phenotypes. One observation in support of this is that PIK3CA-activated breast tumours tend to be luminal, whereas PTEN-inactivated breast tumours tend to be basal-like [43]; however, harbouring simultaneous events in both genes is not additively advantageous to cells. A similar explanation also underlies the mutual exclusivity for KRAS and EGFR mutations seen in lung cancer [54].
Another example is CDH1-PTK2. Here, the tumour suppressor CDH1 is responsible for maintaining cell adhesion and regulating cell migration. The focal adhesion kinase PTK2 is responsible for disassembly of cell adhesions and promoting cell proliferation and migration. CDH1 inactivation is mutually exclusive to PTK2 activation in breast cancer (p-value = 0.001). During tumour development and in particular during metastasis, the inactivation of CDH1 or alternatively the activation of PTK2 could be two disjoint paths to achieve cell migration to distant sites.

Genes B as targets in cancer
Consistent with these pathway models, we expect that targeting B in the context of deletion/downregulation of A could result in cancer-cell death by either disrupting both survival pathways (Figure 7a) or by shutting off (forward) signals for cell survival (Figure 7b). In the latter case, targeting gene B irrespective of the (deletion/downregulation) status of A could result in cancer cell death, a scenario referred to as "oncogene addiction" [55,56]. Our proposed model (Figure 7b) subsumes this scenario, and hence presents a more-general strategy for targeting oncogenes under the synthetic lethality paradigm.

Conclusion
In recent years, SL has emerged as an attractive therapeutic strategy against cancer: by targeting the SL partners of altered genes, cancer cells can be selectively killed while normal cells are spared. Here we introduce a computational approach to infer SL interactions directly from genes that are frequently altered in human cancers. The approach rests on the observation that pairs of genes that are altered in a (significantly) mutually exclusive manner in cancers are likely to constitute lethal combinations. Using copy-number and gene-expression data across breast, prostate, ovarian and uterine cancers, we identify 718 genes that are upregulated or amplified in cancers and are likely to be synthetic lethal with six key DDR genes. Computational validation of our predicted genes using essentiality data from cell-line screens shows that these genes are among the top essential genes and are therefore likely to be lethal upon knockdown in these cell lines. We intend to validate some of these genes using single and double knockdown in cell-line and in vivo cancer models.

List of abbreviations
Table 2: Cell lines and the defects they harbour for DNA-damage response (DDR) genes. Genome-wide (~16000 genes) essentiality data [31,32] from ten cancer cell lines that harbour defects (mutations MUT, downregulation DOWN or homozygous deletions HOMDEL) in at least one of the DDR genes ATM, BRCA1, BRCA2, PTEN and TP53.

Figure 2: Proportions of synthetic lethal (mutually exclusive) interactions identified for each of the six DDR genes A (a) across all cancers; and (b) in the individual cancers of breast, prostate and ovarian (uterine cancer by itself has too few samples to identify any significant interactions) at three levels of significance (p < 0.05 and 0.01). Here, the SL partners B are deleted/downregulated.

Figure 3: Validation of synthetic lethal interactions against GARP essentiality scores from cell-line screens [31,32]. The left-hand plots compare the ranges of GARP scores of our predicted genes B (amplified/upregulated) with those of the entire set (~16000) of profiled genes. While it is difficult to directly compare the two ranges because of the difference in the number of genes in them, for the majority of the cell lines the gene B at the 25th percentile had a lower GARP score than the corresponding gene from the entire profiled set. By $\chi^2$ test, genes B were significantly enriched ($p < 10^{-5}$) within the top-quartile essential genes in these cell lines. The right-hand plots show GE ranks vs ME ranks for genes B in cell lines that are deficient in genes A: (a) ATM, (b) BRCA1, (c) BRCA2, (d) PTEN and (e) TP53.

Figure 4: Validation of synthetic lethal interactions against GARP essentiality scores from cell-line screens [31,32]. The left-hand plots compare the ranges of GARP scores of our predicted genes B (deleted/downregulated) with those of the entire set (~16000) of profiled genes. While it is difficult to directly compare the two ranges because of the difference in the number of genes in them, for the majority of the cell lines the gene B at the 25th percentile had a lower GARP score than the corresponding gene from the entire profiled set. By $\chi^2$ test, genes B were significantly enriched ($p < 10^{-5}$) within the top-quartile essential genes in these cell lines. The right-hand plots show GE ranks vs ME ranks for genes B in cell lines that are deficient in genes A: (a) BRCA1, (b) BRCA2, and (c) PTEN.

Figure 5: Differential essentiality of genes B (a) between nine DDR-deficient and MCF7 cell lines; and (b) between PTEN-/- and PTEN wild-type isogenic cell lines. We considered MCF7, which does not have any known DDR defect, as our control. Comparisons of GARP score means for genes B between DDR-deficient lines and MCF7 showed significant differences (ANOVA p < 0.0001) between these cell lines. Similarly, comparison of GARP scores between two isogenic HCT116-derived cell lines, one PTEN-/- and the other with wild-type PTEN, showed a significant difference (paired t-test: p < 0.0001) between the two cell lines. This analysis indicated that the essentiality/lethality of genes B is specific to DDR-deficient/PTEN-deficient cell lines, and is therefore context-dependent on DDR deficiency.

Figure 6: Snapshot of the proportion of cases and Kaplan-Meier survival plots for predicted genes B (amplified/upregulated) using survival data (untreated) from 1000 breast cancer patients, plotted using KMPlotter-breast (http://www.kmplot.com/) [48]. The patients are divided into two groups based on the overexpression (upper tertile of expression levels) and underexpression (below the upper tertile of expression levels) of these genes.

Figure 7: Two models for pathway-based targeting of synthetic lethal genes B in conjunction with deleted/downregulated genes A: (a) parallel-pathways model, where targeting B results in disruption of both survival pathways, and (b) negative feedback-loop model, where targeting B shuts off (forward) signals for cell survival.

Let $S_A$ (respectively, $S_B$) be the subset of $S$ in which A (respectively, B) is affected by E, and let $S_{AB} = S_A \cap S_B$. The mutual exclusivity between A and B with respect to E can be defined as both $|S_A|/|S_{AB}|$ and $|S_B|/|S_{AB}|$ approaching infinity as $|S|$ approaches infinity. Given this mutual exclusivity, we infer that A and B are synthetic lethal with each other. $|S_A|/|S_{AB}|$ and $|S_B|/|S_{AB}|$ both approach infinity as $|S|$ approaches infinity if, and only if, the co-occurrence of E in A and B affects cell viability, and therefore A and B are synthetic lethal with each other.
Basis for the claim: Suppose the co-occurrence of E in A and B is lethal to the cells, but E in either A or B alone is not. Then $S_{AB} = \emptyset$ or $S_{AB}$ is a very small proportion of $S$. Therefore, $|S_A|/|S_{AB}|$ and $|S_B|/|S_{AB}|$ approach infinity when $S$ is large. Conversely, $|S_A|/|S_{AB}|$ and $|S_B|/|S_{AB}|$ both approaching infinity implies that as event E in A occurs more often and E in B occurs more often, the co-occurrence of E in A and B occurs less often. But if A and B are independent, we do not expect to see this pattern. So the occurrences of E in A and B are avoiding each other in viable cells, and thus their co-occurrence affects cell viability.
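To make the definition above concrete for a finite cohort, the following Python sketch computes the two ratios and the probability of observing so few co-altered samples if A and B were altered independently. The function names, the toy sample sets and the use of a lower-tail hypergeometric probability as the significance measure are assumptions made for illustration; they are not taken from the text above.

```python
from math import comb


def exclusivity_ratios(s_a, s_b):
    """Return (|S_A|/|S_AB|, |S_B|/|S_AB|) for two sets of sample IDs.

    Both ratios are reported as infinity when no sample carries E in both genes.
    """
    n_ab = len(s_a & s_b)
    if n_ab == 0:
        return float("inf"), float("inf")
    return len(s_a) / n_ab, len(s_b) / n_ab


def depletion_p_value(n_samples, n_a, n_b, n_ab):
    """P(overlap <= n_ab) under independence (assumed hypergeometric null).

    n_samples: |S|, n_a: |S_A|, n_b: |S_B|, n_ab: |S_AB|.
    A small value means the observed overlap is unexpectedly small.
    """
    total = comb(n_samples, n_b)
    lowest = max(0, n_a + n_b - n_samples)  # smallest possible overlap
    tail = sum(
        comb(n_a, k) * comb(n_samples - n_a, n_b - k)
        for k in range(lowest, n_ab + 1)
    )
    return tail / total


# Hypothetical toy cohort of 20 samples, identified by integers 0..19.
s_a = set(range(0, 8))   # samples in which E affects A
s_b = set(range(7, 15))  # samples in which E affects B
ratios = exclusivity_ratios(s_a, s_b)
p_value = depletion_p_value(20, len(s_a), len(s_b), len(s_a & s_b))
print(ratios, p_value)
```

In a real cohort the ratios alone are hard to interpret because they depend on how often each gene is altered, which is why the sketch pairs them with a probability under an independence null.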