Total network controllability analysis discovers explainable drugs for Covid-19 treatment

Background: The active pursuit of network medicine for drug repurposing, particularly for combating Covid-19, has stimulated interest in the concept of structural controllability in cellular networks. We sought to extend this theory, focusing on the defense rather than control of the cell against viral infections. Accordingly, we extended structural controllability to total structural controllability and introduced the concept of control hubs. Perturbing any control hub may render the cell uncontrollable by exogenous stimuli like viral infections, so control hubs are ideal drug targets.

Results: We developed an efficient algorithm to identify all control hubs and applied it to the largest homogeneous network of human protein interactions, including interactions between human and SARS-CoV-2 proteins. Our method recognized 65 druggable control hubs with enriched antiviral functions. Utilizing these hubs, we categorized potential drugs into four groups: antiviral and anti-inflammatory agents, drugs acting on the central nervous system, dietary supplements, and compounds enhancing immunity. As an example of our approach's effectiveness, Fostamatinib, a drug initially developed for chronic immune thrombocytopenia, is now in clinical trials for treating Covid-19. Preclinical trial data demonstrated that Fostamatinib could reduce mortality rates, ICU stay length, and disease severity in Covid-19 patients.

Conclusions: Our findings confirm the efficacy of our novel strategy that leverages control hubs as drug targets. This approach provides insights into the molecular mechanisms of potential therapeutics for Covid-19, making it a valuable tool for interpretable drug discovery. Our new approach is general and applicable to repurposing drugs for other diseases.

Supplementary Information: The online version contains supplementary material available at 10.1186/s13062-023-00410-9.


Legends for tables S1 to S5
Result S1. Control hubs as drug targets for Covid-19 treatment

SLC10A1 and SLC10A6 are members of the family of sodium/bile acid cotransporters, also known as Na+-dependent taurocholate co-transporting polypeptides (NTCP). Besides its involvement in cholesterol homeostasis, SLC10A1, the founding member of the SLC10A family, takes part in HBV and HDV infections as a receptor for viral entry [1]. This role in viral entry suggests a potential function in SARS-CoV-2 infection. SLC10A1 is targeted by 18 drugs, 9 of which have entered clinical trials for Covid-19 treatment, and SLC10A6 is targeted by two drugs. Four of the 18 drugs targeting SLC10A1 are worth mentioning. The first is Conjugated estrogens, which also target COMT, as discussed earlier. The second is Progesterone, another type of female hormone (Progestin), which can boost innate inflammatory responses. It was effective in reducing the severity of Covid-19 in pilot clinical studies [2,3]. The third is Indomethacin, a non-steroidal anti-inflammatory drug with antitumor activity and antiviral activity against hepatitis B virus, rhabdovirus, and vesicular stomatitis virus. Importantly, it has been shown to relieve symptom severity and maintain oxygen saturation levels in Covid-19 patients [4,5]. The fourth is Cyclosporine A (CsA), an immunosuppressive drug widely used to prevent graft rejection in organ transplants. A few in vitro studies demonstrated the antiviral function of CsA against SARS-CoV-2 replication [6-8], and several clinical trials are currently underway [9].

Interestingly, SLC10A6 is targeted by two drugs that are distinct from the ones targeting SLC10A1. The first drug, Pregnenolone, is a hormone naturally produced in the human adrenal gland or from cholesterol and is a precursor to many other hormones, including Progesterone, Estrogen, and dehydroepiandrosterone (DHEA) [10]. While Pregnenolone has often been used in treating neuropsychiatric disorders, its potential function in Covid-19 is due to its emerging anti-inflammatory role in repressing the Toll-Like Receptor 4 (TLR4) signaling pathway [11], which plays an important part in initiating the innate immune response [12]. The second drug targeting SLC10A6 is Prasterone sulfate, or dehydroepiandrosterone sulfate (DHEA-S), a natural androstane steroid released from the adrenal gland and used as a labor inducer in childbirth. DHEA-S has been implicated in inflammatory diseases [10], and extremely low levels of DHEA-S have been detected in patients with septic shock due to cytokine storms [13]. Remarkably, DHEA-S and DHEA levels are inversely correlated with the abundance of serum interleukin-6 (IL-6) [14], whereas Covid-19 patients have significantly elevated levels of proinflammatory cytokines, especially IL-6 [15,16]. In short, the many drugs targeting the two members of the NTCP family aim to boost essential hormones to enhance immunity against SARS-CoV-2 infection and to suppress excessive proinflammatory cytokines that may cause organ damage.
MUC1 encodes a transmembrane protein in the mucin family that plays an essential role in forming protective mucous barriers on mucosal epithelial cell surfaces in various tissues and organs, including the lung, stomach, and pancreas [17]. An elevated abundance of MUC1 is indicative of the development of acute lung injury (ALI) and acute respiratory distress syndrome (ARDS) [18], both of which occur in Covid-19 patients [19,20]. MUC1 is targeted by Potassium nitrate, an ingredient used in toothpaste to alleviate tooth sensitivity to temperature and acids. A small clinical trial with 5 patients showed that the medicine helped raise oxygen saturation above baseline levels in Covid-19 patients [21]. While MUC1 does not seem to be a direct target of Fostamatinib, which targets ten control hubs including RIPK1 (Tables 2, S5), Fostamatinib has been shown to reduce MUC1 expression as a repurposed drug for Covid-19 [22].
TTPA is a protein that binds α-tocopherol, a form of vitamin E, and regulates vitamin E levels by transporting vitamin E between membrane vesicles and facilitating vitamin E secretion from hepatocytes to circulating lipoproteins. Clinical studies reveal that the baseline plasma levels of α-tocopherol are lower than normal in patients with ARDS [23] and that vitamin E is beneficial for relieving the burden of upper respiratory tract infections [24]. Therefore, it is not unexpected that TTPA is the target of six vitamin E supplements, which have been recommended as adjuvant therapy against SARS-CoV-2 infection [25,26].
Note that the 2-step community also hosted 612 control hubs that were not targets of the existing drugs (Table S4B). We postulated that they were ideal candidate targets for new drugs. It was encouraging that these control hubs were enriched with membrane proteins and proteins functioning on the NF-κB signaling pathway (Figure S3), so the result invited further investigation for new drug discovery.

Method S1. Node ranking methods
Several methods are available for ranking nodes based on their connectivity and network topologies. In the sequel, a network with $N$ nodes is considered. The degree centrality [27] of node $i$, $dc_i = \frac{k_i}{N-1}$, is determined by the number of neighbors $k_i$ connected to the node. The average neighbor degree [28,29] is defined as $nd_i = \frac{1}{|\mathcal{N}(i)|} \sum_{j \in \mathcal{N}(i)} k_j$, where $\mathcal{N}(i)$ are the neighbors of node $i$ and $k_j$ is the degree of node $j$. The betweenness centrality [29] is $bc_v = \sum_{s,t} \frac{\sigma_{s,t}(v)}{\sigma_{s,t}}$, where $\sigma_{s,t}(v)$ is the number of shortest paths between nodes $s$ and $t$ going through node $v$, and $\sigma_{s,t}$ is the number of all shortest paths between nodes $s$ and $t$. The load centrality is slightly different from the betweenness centrality and can be calculated according to a method from Newman [30,31]. The closeness centrality [32] is written as $cc_i = \frac{n-1}{\sum_{j} d_{i,j}}$, where $d_{i,j}$ is the length of the shortest path between nodes $i$ and $j$, and $n-1$ is the number of nodes reachable from node $i$. The eigenvector centrality [33,34] measures the importance of a node based on the centrality of its neighbors. At convergence, it can be expressed by $Ax = \lambda x$, where $A$ is the network adjacency matrix with eigenvalue $\lambda$. The clustering coefficient [35] is $cl_i = \frac{2 T_i}{k_i (k_i - 1)}$, where $T_i$ is the number of triangles including node $i$ and $k_i$ is the degree of node $i$. The k-core [36] of a network is the largest subnetwork in which every node has degree $k$ or greater. The core value of a node is the largest value $k$ for which a k-core contains the node. PageRank [37,38] is a node ranking method based on network structures and is used to measure the importance of a web page relative to the other pages. All these ranking methods are available in NetworkX [39] and coded in Python.
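The rankings listed above map onto a handful of NetworkX calls. The following is a minimal sketch rather than the authors' pipeline: the helper name `rank_nodes` and the toy graph are hypothetical, and PageRank is left out only to keep the sketch dependent on NetworkX alone.

```python
import networkx as nx


def rank_nodes(graph):
    """Return {measure name: {node: score}} for the measures in Method S1."""
    return {
        "degree": nx.degree_centrality(graph),            # k_i / (N - 1)
        "neighbor_degree": nx.average_neighbor_degree(graph),
        "betweenness": nx.betweenness_centrality(graph),  # normalized by default
        "load": nx.load_centrality(graph),
        "closeness": nx.closeness_centrality(graph),
        "eigenvector": nx.eigenvector_centrality(graph, max_iter=1000),
        "clustering": nx.clustering(graph),
        "core": nx.core_number(graph),
    }


# Hypothetical toy network: a triangle (0, 1, 2) with a pendant node 3 on node 0.
toy = nx.Graph([(0, 1), (0, 2), (1, 2), (0, 3)])
scores = rank_nodes(toy)

# Node with the highest degree centrality in the toy network.
top = max(scores["degree"], key=scores["degree"].get)
print("highest degree centrality:", top)
```

Each entry of `scores` is a dictionary keyed by node, so any measure can be sorted to obtain a ranking and compared with the others on the same network.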

Method S2. Enrichment analysis
Let $U$ be the universe and $F$ the set with a feature of interest. We were interested in the enrichment of the feature for a given set $D$. For example, consider all proteins in the human PPI network that were no more than 2 steps away from viral proteins (i.e., the universe $U$) and the subset of these proteins that were drug targets (i.e., set $F$ with the feature). We were interested in the enrichment of drug targets (i.e., the feature) for the control hubs (i.e., the given set $D$). The feature enrichment of $D$ can be computed as $D_F = D \cap F$. To assess the enrichment significance of $D_F$, a series of random samplings and a statistical test were carried out. A random sample $S$ was generated by randomly drawing $|D|$ items (proteins) from $U$. $S_F = S \cap F$ was the subset of $S$ with the feature of interest and a measure of feature enrichment for $S$. An empirical distribution of $S_F$ can be derived from multiple random samplings of $S$. This empirical distribution can be taken as a baseline enrichment of the feature for items in $U$. A z-test was then adopted to evaluate the difference between $D_F$ and the baseline $S_F$ and its significance, with both sets measured by their sizes $|D_F|$ and $|S_F|$. The z-test, modeled as a two-tailed Gaussian distribution, was conducted based on

$$z = \frac{|D_F| - \mathrm{mean}(|S_F|)}{\mathrm{SD}(|S_F|)},$$

where $\mathrm{SD}(|S_F|)$ was the standard deviation of $|S_F|$ from, say, 1,000 samples. The significance of the enrichment of $D_F$ is quantified by the p-value, which was taken from the cumulative probability table of the standard normal distribution.
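The sampling procedure and z-test described in this section can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `enrichment_z_test`, its parameters, the fixed seed, and the toy sets are all hypothetical, and the p-value is computed from the normal CDF instead of being read from a table.

```python
import random
from statistics import NormalDist, mean, stdev


def enrichment_z_test(universe, feature, given, n_samples=1000, seed=0):
    """Return (z, two-tailed p) for the enrichment of `feature` in `given`."""
    universe, feature, given = list(universe), set(feature), set(given)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = len(given & feature)  # |D_F|
    # Baseline: |S_F| for random samples S of size |D| drawn from U.
    baseline = [
        len(feature.intersection(rng.sample(universe, len(given))))
        for _ in range(n_samples)
    ]
    sd = stdev(baseline)
    if sd == 0:
        raise ValueError("baseline has zero variance; z is undefined")
    z = (observed - mean(baseline)) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed Gaussian p-value
    return z, p


# Hypothetical toy data: 10% of the universe carries the feature.
U = range(1000)
F = range(100)
z_enriched, p_enriched = enrichment_z_test(U, F, given=range(50))       # all in F
z_null, p_null = enrichment_z_test(U, F, given=range(0, 1000, 20))      # 5 of 50 in F
print(f"enriched set: z = {z_enriched:.1f}; unenriched set: z = {z_null:.2f}")
```

In the toy data the first set lies entirely inside the feature set and yields a large positive z, while the second set matches the 10% background rate and yields a z close to zero.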
The difference between an empirical normal distribution of $S_F$ and another empirical normal distribution of $S'_F$ was analyzed using Pearson's Chi-squared test, as $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$, where $k$ was the number of intervals; $E_i$ was the frequency of interval $i$ in the theoretical distribution $E$ (i.e., the frequency of the baseline distribution $S_F$); and $O_i$ was the frequency of interval $i$ in the observed distribution $O$ (e.g., the frequency of the enrichment distribution of all driver nodes). The final $\chi^2$ value was calculated, and the significance p-value was taken from the Chi-square distribution table.
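As a small worked illustration of the Chi-squared comparison, the sketch below bins two samples into $k$ intervals and evaluates the statistic. The helper names `bin_counts` and `chi_squared_statistic` and the example frequencies are hypothetical; the p-value lookup is left to a Chi-square table with $k - 1$ degrees of freedom, as in the text.

```python
def bin_counts(values, low, high, k):
    """Count how many values fall in each of k equal-width intervals on [low, high]."""
    width = (high - low) / k
    counts = [0] * k
    for v in values:
        idx = min(int((v - low) / width), k - 1)  # v == high goes in the last bin
        counts[idx] += 1
    return counts


def chi_squared_statistic(observed, expected):
    """Pearson's statistic: sum over intervals of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of intervals")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies over k = 3 intervals.
expected = [20, 20, 20]   # baseline distribution E
observed = [10, 20, 30]   # observed distribution O
chi2 = chi_squared_statistic(observed, expected)
print("chi-squared:", chi2, "with", len(expected) - 1, "degrees of freedom")
```

Intervals whose expected frequency is zero are rejected explicitly, because the statistic is undefined for them; in practice such intervals are usually merged with their neighbors before the test.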

Method S3. Identifying control hubs of complex networks
Based on structural controllability theory, for a directed network G(V, E), the matched edges of a maximum matching form the cactus structures in the network, which are the basic control structures of the network. The matched edges therefore form a set of edge-independent paths in the directed network G(V, E), which we call control paths [40]. The control paths start with driver nodes and end with tail nodes. The driver nodes (unmatched nodes) and the corresponding control paths in the network are called a control scheme. A node that always remains a middle node of a control path in all control schemes is referred to as a control hub. An eminent feature of a control hub is that it is essential for controlling the network regardless of which control scheme is applied to the network. A perturbation to any control hub may make the network uncontrollable by any control scheme. Therefore, it is critically important to protect all control hubs to maintain structural controllability, so we need to find all control hubs. This seemed to require computing all control schemes, which is #P-hard [41]. We developed an efficient algorithm that does not compute all control schemes. The process for identification of all control hubs is as follows [42]:

ALGORITHM: Identifying Control Hub
Find all alternating paths AP from all unmatched nodes based on the matching M from B(Vin, Vout, E);
8. Set M=M' obtained by expanding augmenting paths;
9. Clear H;
Find all alternating paths AP from all unmatched nodes based on the matching M from B'(Vout, Vin, E);
19. Set M=M' obtained by expanding augmenting paths;
20. Clear T;

Captions for Tables S1-S5, which are the individual tabs in a single Excel spreadsheet file that includes the raw data or results. We also put them on GitHub (https://github.com/network-control-lab/control-hubs).
Table S1: 9,092 human proteins and 64,006 interactions in Huri-Union PPI. We constructed a triple-layer network of PPIs between human and SARS-CoV-2 proteins and among human proteins, as well as interactions between drugs and human target proteins. The middle layer of the triple-layer network comes from the Huri-Union database, which consists of 9,092 proteins or nodes (in Table S1A) and 64,004 interactions or edges (in Table S1B).

Fig. S1. The enrichment of control hubs and druggable control hubs within the k-step community. Identification of the community of human proteins k steps away from SARS-CoV-2 proteins that is enriched with control hubs and druggable control hubs. A) A statistical analysis, a series of z-tests, was adopted to compare the number of control hubs within a k-step community against a random empirical distribution (see Methods). The numbers of control hubs within the communities are shown in red lines, and their respective baseline random distributions are shown in blue bars. The 2-step community was enriched with control hubs. B) Similarly, the 2-step community was enriched with druggable control hubs.

Fig. S2. Network topologies of 22 SARS-CoV-2 proteins and 65 druggable control hubs within the 2-step community. The major drugs and the number of drugs targeting the control hubs are shown in red boxes; only one drug is shown for each control hub. Detailed information is available in Table S4A.

Fig. S3. The biological-process enrichment of the 612 non-druggable control hubs within the 2-step community, revealing their collective functions during viral infection. GeneRatio is the ratio between the number of observed proteins with a specific GO term and the total number of proteins of interest.

Table S2: 169 PPIs between human and SARS-CoV-2 proteins. The 169 high-confidence virus-host interactions reported by Gordon et al. are used to establish our triple-layer network, which consists of 22 SARS-CoV-2 proteins and 169 host proteins.

Table S3: Drugs and their human protein targets. The drug-target interactions (17,780 interactions between 2,913 targets and 2,979 drugs) collected from the Drugbank database are regarded as the third layer of the triple-layer network. The drugs in this network are FDA-approved or under clinical investigation for the treatment of Covid-19.

Table S4: 677 control hubs in the human PPI network that are no more than two steps away from a SARS-CoV-2 protein. Among the 677 control hubs, 65 are druggable control hubs targeted by at least one drug (in Table S4A), and the remaining 612 control hubs (in Table S4B) are potential targets for new drugs.

Table S5: 185 candidate drugs for Covid-19 therapy and prevention. A total of 185 candidate drugs are identified from FDA-approved or investigated drugs targeting 65 druggable control hubs.