
Dysregulated ligand-receptor interactions from single cell transcriptomics.

Qi Liu^1,2, Chih-Yuan Hsu^1,2, Jia Li^1,2, Yu Shyr^1,2.

Abstract

MOTIVATION: Intercellular communication is crucial to many biological processes, such as differentiation, development, homeostasis and inflammation. Single-cell transcriptomics provides an unprecedented opportunity for studying cell-cell communication mediated by ligand-receptor interactions. Although computational methods have been developed to infer cell type-specific ligand-receptor interactions from one single-cell transcriptomics profile, there is a lack of approaches that consider ligand and receptor simultaneously to identify dysregulated interactions across conditions from multiple single-cell profiles.
RESULTS: We developed scLR, a statistical method for examining dysregulated ligand-receptor interactions between two conditions. scLR models the distribution of the product of ligand and receptor expressions and accounts for inter-sample variances and small sample sizes. scLR achieved high sensitivity and specificity in simulation studies. scLR revealed important cytokine signaling between macrophages and proliferating T cells during severe acute COVID-19 infection, and activated TGF-β signaling from alveolar type II cells in the pathogenesis of pulmonary fibrosis.
AVAILABILITY: scLR is freely available at https://github.com/cyhsuTN/scLR.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Year: 2022; PMID: 35482476; PMCID: PMC9191214; DOI: 10.1093/bioinformatics/btac294

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.931

1 Introduction

Cells constantly communicate with each other to orchestrate their behaviors, ensuring the normal functions of tissues, organs and bodies. One important mode of intercellular communication is the ligand–receptor interaction, in which ligands from a ‘sender’ cell bind to receptors in a ‘receiver’ cell and trigger a response inside that cell. Recent advances in single-cell RNA sequencing technology provide a powerful way to study ligand–receptor interactions at an unprecedented scale and depth (Almet ; Armingol ). Deciphering ligand–receptor interactions from single-cell transcriptomics has become routine for understanding biological systems (Martin ). Computational methods have been developed rapidly to infer cell-cell communication mediated by ligand–receptor interactions from single-cell transcriptomics (Browaeys ; Cabello-Aguilar ; Efremova ; Hou ; Hu ; Jin ; Noel ; Solovey and Scialdone, 2020; Zhang ). Ligand–receptor interactions are detected and quantified based on the pairwise expression of ligands and receptors between single cells or cell clusters. Some methods, such as ICELLNET (Noel ), determine the interaction strength by the product of ligand and receptor expressions. Other methods, such as CellChat (Jin ) and SingleCellSignalR (Cabello-Aguilar ), calculate the interaction by a non-linear transformation of the product of ligand and receptor expressions. The significance of cell type-specific interactions is estimated from a null distribution generated by shuffling the cluster labels of all cells. These methods have been successfully applied to a number of single-cell transcriptomics datasets, uncovering key signaling mechanisms that control cell fate and state transitions (Bassez ; Cheng ; Gong ; Hildreth ; Tian ). Existing methods focus primarily on the inference of ligand–receptor interactions between cells or cell clusters in one condition.
The identification of dysregulated communication across conditions, however, is even more important for revealing potential driving signals. iTALK and CellChat predict up- (gained) or down-regulated (lost) interactions based on differentially expressed ligands and/or receptors, where upregulated/gained interactions are defined by upregulated ligands and/or receptors and vice versa. This differential analysis compares expression in cells pooled across samples between two conditions, which fails to consider inter-sample variances within each condition. Moreover, identifying dysregulated ligand–receptor interactions from expression changes of ligands or receptors alone can produce false positives and false negatives. For example, a ligand–receptor interaction with an upregulated ligand but a downregulated receptor is probably not a strong candidate. We developed scLR to identify dysregulated ligand–receptor interactions across conditions. scLR not only models the distribution of the product of ligand and receptor expressions, but also accounts for inter-sample variances, small sample sizes and dropout events. scLR achieved high recall (sensitivity) and specificity in four simulation datasets. scLR revealed important cytokine signaling between macrophages and proliferating T cells in severe COVID-19 infection, and activated TGF-β signaling from alveolar type II cells in the pathogenesis of pulmonary fibrosis.

2 Materials and methods

2.1 Data preprocessing

The inputs of scLR are a raw gene–cell count matrix together with the cell cluster and sample information (sample ID and condition) from single-cell RNA-seq data. A sample represents a replicate, such as one patient. For example, the sample size is 20 if the single-cell RNA-seq dataset is generated from 10 tumor patients and 10 controls. scLR first sums gene counts over the cells of each cluster in each sample to estimate gene expression per cluster and per sample. scLR then normalizes the summed data using the median normalization method in DESeq2, which assumes that expression profiles have the same median value (Anders and Huber, 2010) (details in Normalization in the Supplementary File). After normalization, scLR transforms the data by $\log_2(x + 1)$ to obtain expression abundances $x^{L}_{cs}$ and $x^{R}_{cs}$ for the ligand $L$ and the receptor $R$ in cluster $c$ of sample $s$, where $c = 1, \dots, C$ and $s = 1, \dots, S$. $C$ is the number of clusters and $S$ is the number of samples.
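The preprocessing steps above can be sketched in a few lines. This is a minimal illustration rather than the scLR implementation (which is an R package): the function names, the toy counts and the choice to compute size factors across all cluster-by-sample columns at once are assumptions made for the example. The size factor follows the median-of-ratios idea of Anders and Huber (2010).

```python
import math
from statistics import median

def pseudobulk(counts, cell_cluster, cell_sample):
    """Sum raw counts over the cells of each (cluster, sample) group.

    counts: dict gene -> list of per-cell counts (hypothetical input layout).
    Returns dict (cluster, sample) -> dict gene -> summed count.
    """
    groups = {}
    for gene, values in counts.items():
        for value, cluster, sample in zip(values, cell_cluster, cell_sample):
            profile = groups.setdefault((cluster, sample), {})
            profile[gene] = profile.get(gene, 0) + value
    return groups

def size_factors(profiles):
    """Median-of-ratios size factor per column.

    Only genes with non-zero counts in every column are used, so at least
    one such gene must exist.
    """
    columns = list(profiles)
    genes = [g for g in profiles[columns[0]]
             if all(profiles[c][g] > 0 for c in columns)]
    log_geo_mean = {g: sum(math.log(profiles[c][g]) for c in columns) / len(columns)
                    for g in genes}
    return {c: math.exp(median(math.log(profiles[c][g]) - log_geo_mean[g]
                               for g in genes))
            for c in columns}

def log_abundance(profiles):
    """Normalize summed counts by size factor, then apply log2(x + 1)."""
    factors = size_factors(profiles)
    return {c: {g: math.log2(v / factors[c] + 1) for g, v in genes.items()}
            for c, genes in profiles.items()}

# Toy data: four cells from one cluster, two samples; sample s2 is sequenced
# twice as deeply as s1 (all numbers are made up).
counts = {"TGFB1": [1, 3, 2, 6], "ITGB6": [2, 2, 4, 4]}
cell_cluster = ["AT2", "AT2", "AT2", "AT2"]
cell_sample = ["s1", "s1", "s2", "s2"]

profiles = pseudobulk(counts, cell_cluster, cell_sample)
factors = size_factors(profiles)
abundances = log_abundance(profiles)
```

In this toy case the two samples differ only in depth, so after normalization their abundances coincide.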

2.2 The distribution of the product of ligand and receptor expressions

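We assume that $x^{L}_{cs}$ and $x^{R}_{c's}$ follow a bivariate normal distribution with correlation $\rho$, where $x^{L}_{cs}$ and $x^{R}_{c's}$ denote the expression of ligand $L$ and receptor $R$ in clusters $c$ and $c'$ of sample $s$, respectively. To simplify the equations, we use $X$ and $Y$ to represent $x^{L}_{cs}$ and $x^{R}_{c's}$:
$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix} \right). \tag{1}$$
$XY$, the product of $X$ and $Y$, can be expressed as
$$XY = \frac{\sigma_X\sigma_Y}{2}\left[(1+\rho)\,Q_1 - (1-\rho)\,Q_2\right], \tag{2}$$
where $Q_1 = \frac{(X/\sigma_X + Y/\sigma_Y)^2}{2(1+\rho)}$ and $Q_2 = \frac{(X/\sigma_X - Y/\sigma_Y)^2}{2(1-\rho)}$. $Q_1$ and $Q_2$ are independent and follow non-central chi-square distributions with 1 degree of freedom and non-centrality parameters $\lambda_1 = \frac{(\mu_X/\sigma_X + \mu_Y/\sigma_Y)^2}{2(1+\rho)}$ and $\lambda_2 = \frac{(\mu_X/\sigma_X - \mu_Y/\sigma_Y)^2}{2(1-\rho)}$, respectively. As shown in (2), the distribution of $XY$ is equal to that of the difference of two independent non-central chi-square random variables (regardless of constants), which depends on $\mu_X$, $\mu_Y$, $\sigma_X$, $\sigma_Y$ and $\rho$. The expectation is $E(XY) = \mu_X\mu_Y + \rho\sigma_X\sigma_Y$.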
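The decomposition of a product of correlated normals into a scaled difference of two independent non-central chi-square variables can be checked numerically. The sketch below is an illustration, not code from scLR: it builds $X$ and $Y$ from two independent standard normals, evaluates the product both directly and through the chi-square form, and compares the sample mean with $\mu_X\mu_Y + \rho\sigma_X\sigma_Y$. The parameter values and function names are hypothetical, and $|\rho| < 1$ is assumed.

```python
import math
import random

def correlated_normals(mu_x, mu_y, sd_x, sd_y, rho, z1, z2):
    """Build (X, Y) with correlation rho from independent standard normals z1, z2."""
    s1 = math.sqrt(2 * (1 + rho))
    s2 = math.sqrt(2 * (1 - rho))
    u = (s1 * z1 + s2 * z2) / 2  # standard normal
    v = (s1 * z1 - s2 * z2) / 2  # standard normal, corr(u, v) = rho
    return mu_x + sd_x * u, mu_y + sd_y * v

def product_from_chi_squares(mu_x, mu_y, sd_x, sd_y, rho, z1, z2):
    """Evaluate XY as a scaled difference of two non-central chi-square (df = 1) terms."""
    a, b = mu_x / sd_x, mu_y / sd_y
    q1 = (z1 + (a + b) / math.sqrt(2 * (1 + rho))) ** 2
    q2 = (z2 + (a - b) / math.sqrt(2 * (1 - rho))) ** 2
    return sd_x * sd_y / 2 * ((1 + rho) * q1 - (1 - rho) * q2)

rng = random.Random(0)
params = (3.0, 2.0, 1.0, 0.5, 0.3)  # mu_x, mu_y, sd_x, sd_y, rho (made-up values)

direct, via_chi = [], []
for _ in range(20000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x, y = correlated_normals(*params, z1, z2)
    direct.append(x * y)
    via_chi.append(product_from_chi_squares(*params, z1, z2))

# The two routes agree draw by draw (up to floating-point error).
max_gap = max(abs(d - c) for d, c in zip(direct, via_chi))

# The sample mean should be close to mu_x * mu_y + rho * sd_x * sd_y.
empirical_mean = sum(direct) / len(direct)
expected_mean = 3.0 * 2.0 + 0.3 * 1.0 * 0.5
```

Because the identity is algebraic, `max_gap` is at the level of rounding error, whereas `empirical_mean` matches `expected_mean` only up to Monte Carlo noise.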

2.3 Differential analysis of ligand–receptor interactions

scLR aims to identify cell type-specific dysregulated interactions between two conditions. It first calculates the interaction strength of each ligand–receptor pair between two cell clusters (the ‘sender’ and the ‘receiver’) and then statistically assesses its alteration between the two conditions. The interaction strength of ligand $L$ and receptor $R$ between clusters $c$ and $c'$ in sample $s$ is measured by the product of the ligand and receptor expression abundances in the corresponding clusters of that sample: $LR_{cc's} = x^{L}_{cs}\,x^{R}_{c's}$. Assume there are $n_A$ samples from condition A and $n_B$ samples from condition B, where $n_A + n_B = S$ is the total number of samples. The mean interaction strength of ligand $L$ and receptor $R$ between clusters $c$ and $c'$ across all samples under condition A can be denoted as $\overline{LR}_A = \frac{1}{n_A}\sum_{s \in A} LR_{cc's}$, where $s \in A$ runs over the samples of condition A. Similarly, $\overline{LR}_B = \frac{1}{n_B}\sum_{s \in B} LR_{cc's}$. We compare the mean interaction strengths under the two conditions. The null hypothesis is that the mean ligand–receptor interaction strengths are the same under the two conditions; the alternative hypothesis is that they differ. If the difference $|\Delta LR| = |\overline{LR}_A - \overline{LR}_B| > d_\alpha$ for some specified threshold $d_\alpha$, we reject the null hypothesis and conclude that the ligand–receptor interaction is significantly altered between the two conditions.
We use Monte Carlo simulation to estimate the null distribution of $\Delta LR$. We first estimate the means and standard deviations of the ligand and receptor expressions in clusters $c$ and $c'$ under conditions A and B, denoted by $\hat\mu_{L,A}$, $\hat\mu_{R,A}$, $\hat\mu_{L,B}$, $\hat\mu_{R,B}$, $\hat\sigma_{L,A}$, $\hat\sigma_{R,A}$, $\hat\sigma_{L,B}$ and $\hat\sigma_{R,B}$. To assess inter-sample variances accurately when the sample size is small, we use a Bayesian method to shrink the estimated variances toward a pooled estimate (Smyth, 2004), as implemented by eBayes in the R package limma. Moreover, we assume $\rho = 0$, based on the observation of a very low percentage of significant non-zero correlations in real datasets: only 0.05% of ligand–receptor pairs in the COVID-19 data, 0.6% in the IPF data and 0.03% in the AVP data at an adjusted P-value < 0.05. The scLR package also provides a function to estimate $\rho$ (default = 0) in case there are a number of significant non-zero correlations.
When the null hypothesis is true, simplified as $E(LR_A) = E(LR_B)$ between clusters $c$ and $c'$, we can assume $\sigma_{L,A}\sigma_{R,A} = \sigma_{L,B}\sigma_{R,B}$, $\lambda_{1A} = \lambda_{1B}$ and $\lambda_{2A} = \lambda_{2B}$ based on Equation (2), where $\lambda_{1A}$, $\lambda_{2A}$, $\lambda_{1B}$ and $\lambda_{2B}$ are the non-centrality parameters of the non-central chi-square distributions in conditions A and B. We let $\hat\sigma_0$, $\hat\lambda_{10}$ and $\hat\lambda_{20}$ denote the common scale factor and non-centrality parameters, obtained by combining the estimates from the two conditions. Based on $\hat\sigma_0$, $\hat\lambda_{10}$ and $\hat\lambda_{20}$, we simulate the distribution of $LR$ using Equation (2). We randomly draw $n_A$ samples to estimate $\overline{LR}_A$ and $n_B$ samples to derive $\overline{LR}_B$. The difference between the two conditions is $\Delta LR = \overline{LR}_A - \overline{LR}_B$. When we repeat the simulation $K$ times, we have $\Delta LR^{(1)}, \dots, \Delta LR^{(K)}$. The required $d_\alpha$ is the $100(1-\alpha)$th percentile of $|\Delta LR^{(k)}|$, and the P-value is given by the proportion of $|\Delta LR^{(k)}|$ that are at least as large as the observed $|\Delta LR|$.
A sufficiently large number of simulations is needed to obtain an accurate estimate of the significance of dysregulated ligand–receptor interactions. It is very time consuming, however, to run a large number of simulations for every ligand–receptor pair between every two cell clusters, since there are thousands of ligand–receptor pairs and tens of cell clusters. To address this problem, we estimate the density of $\Delta LR$ using a kernel density approach with the Gaussian kernel and a large data-driven bandwidth $h$ in each ligand–receptor pair comparison. The bandwidth controls the smoothness of the estimated density. A small $h$ gives very rough estimates, while a large $h$ gives smoother estimates but weakens local features, for example, smoothing the tails but weakening the mode(s) of the density. The main purpose here is to obtain an accurate tail probability rather than an overall density estimate. We use a data-driven approach to estimate $h$ via the R function density (parameters: bw = ‘SJ’ and adjust = 3). Based on the estimated density $\hat f$, the P-value is estimated as two times the tail probability beyond the observed $\Delta LR$.
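The Monte Carlo and kernel-smoothing steps can be sketched as follows. This is an illustrative stand-in rather than the scLR code: it takes the null parameters as given (a scale factor and two non-centrality parameters, with zero correlation), and it replaces the Sheather-Jones bandwidth used by R's `density(bw = "SJ")` with a normal-reference rule of thumb, multiplied by the same `adjust = 3`. All parameter values and function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, stdev

def simulate_null_differences(scale, lam1, lam2, n_a, n_b, n_sim, rng):
    """Draw n_sim null values of mean(LR_A) - mean(LR_B), assuming rho = 0."""
    def draw_product():
        # Non-central chi-square with 1 df and non-centrality lam: (Z + sqrt(lam))^2
        q1 = (rng.gauss(0, 1) + math.sqrt(lam1)) ** 2
        q2 = (rng.gauss(0, 1) + math.sqrt(lam2)) ** 2
        return scale / 2 * (q1 - q2)

    def mean_of(n):
        return sum(draw_product() for _ in range(n)) / n

    return [mean_of(n_a) - mean_of(n_b) for _ in range(n_sim)]

def kde_two_sided_p(null_values, observed, adjust=3.0):
    """Two-sided P-value from a Gaussian kernel density estimate of the null.

    The bandwidth is a normal-reference rule of thumb times `adjust`; this is
    an assumption standing in for the Sheather-Jones bandwidth.
    """
    n = len(null_values)
    h = adjust * 1.06 * stdev(null_values) * n ** (-1 / 5)
    std_normal = NormalDist()
    # Upper tail of a Gaussian KDE: average of the kernel tail probabilities.
    upper = sum(1 - std_normal.cdf((observed - v) / h) for v in null_values) / n
    return min(1.0, 2 * min(upper, 1 - upper))

rng = random.Random(1)
null = simulate_null_differences(scale=1.0, lam1=4.0, lam2=1.0,
                                 n_a=3, n_b=3, n_sim=2000, rng=rng)

p_centre = kde_two_sided_p(null, 0.0)     # observed difference at the null centre
p_moderate = kde_two_sided_p(null, 3.0)   # moderately large observed difference
p_extreme = kde_two_sided_p(null, 50.0)   # far beyond every simulated value
```

Smoothing lets a few thousand simulated differences yield tail probabilities below 1/K, which a raw empirical proportion cannot provide; the price is that a large bandwidth widens the estimated null, so the resulting P-values tend to be conservative.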

2.4 Pseudo-counts for reducing false positives

The prevalence of dropout events is one of the greatest challenges in single-cell RNA sequencing data analysis; dropouts introduce an excessive number of zeros and unwanted technical variability. Dropout rate and expression level are closely related: genes with a high dropout rate tend to have low expression (Qiu, 2020). The excess of zeros makes ligand–receptor interaction analysis even more difficult. The product is zero if the ligand or the receptor expression is undetected in one condition, which leads to many false positives, especially when sample sizes are small. To reduce the number of false positives caused by dropout events, we add a pseudo-count of 1 to the zero values. Comparing scLR with and without pseudo-counts on simulation data, we found that scLR with pseudo-counts achieved higher precision and specificity (i.e. lower false positive rates) (Supplementary Fig. S1). The scLR package also provides a parameter (zero.impute, default = TRUE) that can be used to turn off pseudo-counts.
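A small numerical example shows why zeros matter for a product-based score. The sketch below assumes that the pseudo-count replaces zero values before the log2(x + 1) transform; the function names are hypothetical, and `zero_impute` merely mirrors the role of the `zero.impute` parameter without reproducing the package's code.

```python
import math

def abundance(count, zero_impute=True):
    """log2(count + 1); a zero count is first replaced by 1 when zero_impute is True."""
    if zero_impute and count == 0:
        count = 1
    return math.log2(count + 1)

def interaction_strength(ligand_count, receptor_count, zero_impute=True):
    """Product of ligand and receptor abundances."""
    return abundance(ligand_count, zero_impute) * abundance(receptor_count, zero_impute)

# Ligand undetected (dropout), receptor well expressed (made-up counts).
without_pseudo = interaction_strength(0, 255, zero_impute=False)  # 0 * 8
with_pseudo = interaction_strength(0, 255, zero_impute=True)      # 1 * 8
```

Without the pseudo-count the product collapses to exactly zero regardless of the receptor level, so any non-zero product in the other condition looks like a large change; with it, the product stays on the scale of the detected partner.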

2.5 Single-cell RNA-seq datasets

We used three single-cell RNA-seq datasets to evaluate the performance of scLR: data generated from patients with and without COVID-19 infection (Grant ), from pulmonary fibrosis (PF) lungs and non-fibrotic controls (Habermann ), and from patients with anterior vaginal prolapse (AVP) and controls (Li ). All three datasets were generated on the 10X platform. The COVID-19 data were downloaded from GSE155249 and include five patients with severe COVID-19 infection and two COVID-19-negative patients. There were 33 000 cells with 18 337 genes, clustered into six cell types: macrophages, dendritic cells (DC), T cells, proliferating T cells, ciliated cells and alveolar cells. The PF data were downloaded from GSE135893 and consist of 11 PF patients and nine controls. There were 70 512 cells with 33 694 genes, classified into 12 cell types. The AVP data were downloaded from GSE151202 and involve 16 AVP patients and five controls. There were 53 133 cells with 20 328 genes, categorized into seven cell types. A multivariate normality test showed that only 2.1% of ligand–receptor pairs were significantly different from bivariate normal distributions in the COVID-19 data, 12.1% in the IPF data and 1.6% in the AVP data at an adjusted P-value < 0.05. These results suggest that a small percentage of ligand–receptor pairs might be modeled poorly by bivariate normal distributions. For the pairs violating the normality assumption, we found that rectified normal distributions and gamma distributions fitted the data well, especially rectified normal distributions.

3 Results

3.1 Simulation

3.1.1 Simulation settings

We designed four scenarios to evaluate the performance of scLR (Table 1). Each scenario has three samples in each of two conditions. Both ligands and receptors are upregulated in the first setting, whereas only ligands or receptors are upregulated in the second scenario. The third setting is challenging, where ligands and receptors are overexpressed complementarily, e.g. highly upregulated ligands but slightly or non-upregulated receptors in one sample and vice versa in the other sample, leading to increased products of ligand and receptor expressions. Upregulated interactions in this scenario are not driven by ligands or receptors alone, but the complementary increase of ligands and receptors. We simulated complementary increase by upregulation of ligands but slight overexpression of receptors in the samples 1 and 3, and upregulation of receptors but slight overexpression of ligands in the sample 2. The fourth scenario is a special case, where ligands are upregulated but receptors are downregulated, resulting in unchanged ligand–receptor interactions. There are 2000 ligand–receptor pairs in each setting. There are 500 significant and 1500 non-significant pairs in the first three scenarios. There are no true positives but 2000 non-significant pairs in the fourth scenario. Every sample in the simulation studies consists of 16 672 genes and 11 cell clusters. The expression abundances were generated from normal distributions, where the parameters of means and standard deviations for each gene in each cell cluster were derived from real single-cell RNA-seq datasets. The normalized counts were increased/decreased to simulate the upregulation/downregulation on the expression abundances (see Simulation Settings in the Supplementary File). Ten simulated datasets were generated for each scenario.

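Table 1.

Simulation settings in four scenarios

	Ligand	Receptor
Scenario 1	(↑, ↑, ↑)	(↑, ↑, ↑)
Scenario 2 (ligand only)	(↑, ↑, ↑)	(–, –, –)
Scenario 2 (receptor only)	(–, –, –)	(↑, ↑, ↑)
Scenario 3	(↑, ↑, ↑)	(↑, ↑, ↑)
Scenario 4	(↑, ↑, ↑)	(↓, ↓, ↓)

Note: Three samples in each condition; each triple gives the change in samples 1, 2 and 3. ↑, upregulated; ↓, downregulated; –, unchanged. In Scenario 3 the increases are complementary: ligands are upregulated and receptors only slightly overexpressed in samples 1 and 3, and the reverse holds in sample 2.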

3.1.2 Performance on simulation studies

We compared scLR with Welch’s t-test and differential gene expression analysis by limma under the four simulation settings. Welch’s t-test was used to compare the mean difference of the product of ligand and receptor expressions between two conditions, based on the assumption that the product follows a normal distribution. Limma was used to identify differential expression in ligands or receptors; ligand–receptor pairs with either the ligand or the receptor differentially expressed between two conditions were considered dysregulated ligand–receptor interactions. We evaluated performance with the following metrics: (i) the F1 score curve at different thresholds; (ii) the recall (sensitivity) and specificity curves at different thresholds; and (iii) the number of true positives (TP), precision (TP/(TP+FP)), recall (TP/(TP+FN)), specificity (TN/(TN+FP)) and F1 score at the threshold of adjusted P-value < 0.01. In the first scenario, where dysregulated ligand–receptor interactions were caused by upregulation of both the ligand and the receptor, scLR showed the best performance. scLR achieved much higher F1 score and recall than the other two methods at different thresholds (Fig. 1A). The three methods obtained similar specificity (Fig. 1A). At the threshold of adjusted P-value < 0.01, scLR recovered ∼430 out of 500 significant ligand–receptor pairs (84% recall), while the t-test identified only ∼100 pairs (20% recall) and limma detected ∼330 pairs (66% recall) (Supplementary Fig. S2). When an increasing percentage of pairs (ranging from 0% to 90%) violated the normality assumption (drawn from either Gamma or rectified normal distributions), the specificity of scLR remained well controlled but the recall decreased (Supplementary Fig. S3).
scLR showed much better recall and F1 scores than limma and the t-test when 50% of the pairs were non-normal (Gamma or rectified normal distributions), indicating that the power of scLR decreased less than that of limma and the t-test as the percentage of non-normal pairs increased (Supplementary Fig. S4).
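For reference, the evaluation metrics listed above reduce to simple ratios of the confusion-matrix counts. The sketch below uses made-up counts for a setting with 500 truly dysregulated and 1500 unchanged pairs; the counts are not taken from the paper's figures.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall (sensitivity), specificity and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}

# Hypothetical outcome: 420 of 500 true pairs found, 15 false calls among 1500 nulls.
metrics = classification_metrics(tp=420, fp=15, fn=80, tn=1485)
```

With mostly unchanged pairs, specificity stays high even for a method that misses many true pairs, which is why recall and F1 separate the methods more clearly than specificity does.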

Fig. 1.

Performance evaluation in simulation studies. Performance in the first scenario (A), the second scenario (B) and the third scenario (C).

In the second scenario, where dysregulated ligand–receptor interactions were driven by either the ligand or the receptor alteration alone, scLR performed slightly worse than limma (Supplementary Fig. S5). The F1 and recall curves at different thresholds showed that scLR performed better at stringent cutoffs (0.001 < adjusted P-value < 0.01), while limma performed better at relaxed cutoffs (0.01 ≤ adjusted P-value < 0.1) (Fig. 1B). The third scenario is challenging: dysregulated ligand–receptor interactions are driven by complementary ligand and receptor alterations. scLR achieved the best performance, with much higher F1 score and recall than the t-test and limma at different thresholds (Fig. 1C). limma, relying on differential expression of either ligands or receptors alone, performed poorly. scLR identified ∼330 interacting pairs with 64% recall, whereas the t-test and limma both failed to detect any ligand–receptor pairs at the threshold of adjusted P-value < 0.01 (Fig. 1C and Supplementary Fig. S6). The fourth scenario simulated a special case in which no dysregulation exists because ligand and receptor expression change in opposite directions. In this case, scLR and the t-test performed better than limma, obtaining much lower false positive rates (Supplementary Fig. S7). The simulation studies demonstrated that scLR overall achieved higher performance than limma and the t-test. Similar results were obtained in the same four scenarios with five samples in each condition (Supplementary Figs S8–S11). Since dysregulated ligand–receptor interactions involve both ligands and receptors, methods that consider ligands or receptors alone, like limma, are not effective, as seen in scenarios 1, 3 and 4.
The t-test considers ligand–receptor interactions in terms of their products, but it is conservative because of both the large estimates of the standard deviations of the products and the inappropriate assumption of normality for non-normal products (see Equation 2). scLR directly models the distribution of the product of ligand and receptor expressions and therefore achieved much better performance than the t-test.

3.2 Identification of dysregulated ligand–receptor interactions in severe COVID-19

We applied scLR to single-cell RNA-seq data from bronchoalveolar lavage (BAL) fluid collected from patients with severe SARS-CoV-2 pneumonia and two control patients, one with bacterial pneumonia and one non-pneumonia control (Grant ). The integrated analysis of five patients with severe SARS-CoV-2 pneumonia and two patients without SARS-CoV-2 pneumonia resolved multiple clusters corresponding to macrophages, dendritic cells, T cells, proliferating T cells, ciliated cells and alveolar cells. Using the curated ligand–receptor database (LRdb) compiled by SingleCellSignalR (Cabello-Aguilar ), scLR identified 1313 dysregulated ligand–receptor interactions in SARS-CoV-2 infection compared to non-SARS-CoV-2 infection (adjusted P-value < 0.01) (Supplementary Table S1), of which 938 were upregulated and 375 were downregulated. Among them, 16% of interactions did not have significantly altered ligands or receptors even at a loose cutoff (adjusted P-value < 0.1) (Supplementary Fig. S12). Those interactions generally ranked lower than those with significantly altered ligands or receptors (ranking by adjusted P-values). Macrophages had the most disrupted autocrine ligand–receptor interactions (128), followed by 88 altered interactions from proliferating T cells to macrophages and 81 from macrophages to proliferating T cells, suggesting active communication between macrophages and proliferating T cells during SARS-CoV-2 infection. The most upregulated ligand–receptor pairs involve several cytokines and chemokines that are important for T cell and monocyte recruitment and alveolar macrophage maturation, such as CCL2, CCL3, CCL4, CCL5, CCL7, CCL8, CXCL13, CXCL10, CXCL11 and CXCL16. Macrophages are the major source of dysregulated interactions involving CCL2, CCL7 and CCL8, while proliferating T cells are the source for CCL4, CCL3 and CCL5 (Fig. 2A). The CCL2-CCR1 interaction is the most upregulated across all cell type pairs, especially among macrophages themselves.
In the CXC family, CXCL10-CXCR3 is the most upregulated in the communication between macrophages and proliferating T cells (Fig. 2B). Recent studies reveal that dysregulation of innate immune interferons is key in determining SARS-CoV-2 pathogenesis. scLR detected dysregulated IFNG-IFNGR1 and IFNG-IFNGR2 interactions between proliferating T cells and other immune cells, especially between proliferating T cells and macrophages (Fig. 2C). These findings are consistent with recent studies reporting that macrophages drove the inflammatory response to SARS-CoV-2 infection (Speranza ) and that macrophages and T cells form a positive feedback loop that drives persistent alveolar infection (Grant ). In addition, scLR found that it is proliferating T cells that produce IFNG to induce the release of inflammatory cytokines from macrophages, such as CCL2, CCL7, CCL8, CXCL10 and CXCL16, which further promote T cell activation and proliferation.

Fig. 2.

Dysregulated ligand–receptor interactions in SARS-CoV-2 infection compared to control. Dysregulated CC chemokine interactions (A), CXC chemokine interactions (B) and IFNG interactions (C). ΔLR is the difference in the expression products of ligands and receptors between the two conditions.

We compared scLR with CellChat and iTALK. Instead of considering ligands and receptors simultaneously, CellChat and iTALK detect dysregulated interactions based on expression changes of either ligands or receptors alone. Additionally, scLR calculates the difference at the sample level, while CellChat and iTALK pool cells from all samples in one condition to estimate the difference, which ignores sample-level variances and might lead to biased results. Using the same ligand–receptor database LRdb (Cabello-Aguilar ), CellChat identified 1720 dysregulated interactions (|log2FC| > 0.2 & P-value < 0.01) (Supplementary Table S2) and iTALK discovered 4385 interactions (|log2FC| > 0.2 & adjusted P-value < 0.01) (Supplementary Table S3). To investigate those interactions further, we focused on a major finding of the original study, namely that T cells produce IFNG to induce inflammatory cytokine release and further promote T cell activation (Grant ). We listed the IFNG-related interactions between immune cells found by scLR, CellChat and iTALK in Supplementary Figure S13, along with differential expression results for ligands and receptors at the patient level (scLR) and at the cell level (CellChat and iTALK). scLR found all IFNG-related interactions to be upregulated after COVID-19 infection, including IFNG-IFNGR1 between proliferating T cells and all other cell types and IFNG-IFNGR2 between proliferating T cells and macrophages/DC. The upregulated interactions were mainly due to IFNG overexpression in proliferating T cells after COVID-19 infection (log2FC = 5.1 & adjusted P-value = 0.002, Supplementary Fig. S14).
Notably, the IFNG-IFNGR1 interaction was detected as significantly upregulated between proliferating T cells and T cells, whereas IFNG-IFNGR2 was not, owing to the large variance of IFNGR2 in T cells (Supplementary Fig. S15). The same phenomenon was observed for IFNG-IFNGR2 in the autocrine interactions of proliferating T cells (Supplementary Fig. S16). CellChat and iTALK, in contrast, performed differential expression at the cell level by pooling cells, which biases results toward patients with large numbers of cells. They detected upregulation of IFNGR1 and IFNGR2 in macrophages, and even their downregulation in proliferating T cells, neither of which held at the patient level (Supplementary Fig. S14). The apparent downregulation of the receptors (IFNGR1 or IFNGR2) led to undetermined or wrong directions of interaction change. Another example, concerning the important CCL8-CCR5 interactions, is described in the Supplementary File. In summary, the analysis strategy employed by scLR yields more biologically meaningful results than CellChat and iTALK. First, differential analysis at the patient level is able to find changes that are consistent across patients, such as IFNG upregulation in proliferating T cells. In contrast, differential analysis at the cell level by pooling cells (CellChat and iTALK) is likely to be dominated by patients with large numbers of cells, resulting in biased results, such as the false downregulation of IFNGR1 and IFNGR2 in proliferating T cells. Second, considering ligands and receptors simultaneously helps remove false positives and false negatives. Although IFNG was strongly upregulated in proliferating T cells, scLR detected a non-significant change of IFNG-IFNGR2 between proliferating T cells and T cells owing to the large variance of IFNGR2 in T cells; this pair would be a false positive if IFNG alone were considered.
Moreover, 16% of the 1313 interactions did not have significantly altered ligands or receptors even at a loose cutoff (adjusted P-value < 0.1) (Supplementary Fig. S12) and would be missed if ligands or receptors alone were considered (false negatives). Third, scLR is able to rank dysregulated interactions by their statistical results, which is very useful for finding the most important cell type-specific interactions. For example, the most dysregulated IFNG interaction was found between proliferating T cells and macrophages (Supplementary Fig. S17).
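The difference between pooling cells and averaging at the patient level can be seen in a toy calculation. The numbers below are invented for illustration: two patients contribute few cells with low expression of a gene, and one patient contributes many cells with high expression.

```python
from statistics import mean

def pooled_cell_mean(samples):
    """Mean over all cells, ignoring which patient each cell came from."""
    return mean(value for cells in samples.values() for value in cells)

def sample_level_mean(samples):
    """Mean of per-patient means, so each patient counts once."""
    return mean(mean(cells) for cells in samples.values())

# Hypothetical per-cell expression values for one gene in one cell type.
condition = {
    "patient_1": [1.0] * 5,
    "patient_2": [1.0] * 5,
    "patient_3": [5.0] * 90,  # contributes 90% of the cells
}

pooled = pooled_cell_mean(condition)            # dominated by patient_3
per_patient = sample_level_mean(condition)      # each patient weighted equally
```

The pooled mean (4.6) sits close to the single large patient, while the patient-level mean (about 2.33) reflects that two of three patients show low expression; a cell-level test would also treat the 90 cells of one patient as 90 independent replicates.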

3.3 Identification of dysregulated ligand–receptor interactions in PF and AVP

We applied scLR to two additional single-cell RNA-seq datasets generated from patients with pulmonary fibrosis (PF) (Habermann ) and anterior vaginal prolapse (AVP) (Li ). The PF data contained 12 cell types, such as type II alveolar cells (AT2), fibroblasts and endothelial cells. Using the curated ligand–receptor database (LRdb) compiled by SingleCellSignalR (Cabello-Aguilar ), scLR identified 567 dysregulated ligand–receptor interactions in IPF patients compared to controls (adjusted P-value < 0.01) (Supplementary Table S4). The most upregulated interactions all involve TGF-β1 signaling from AT2 cells, such as TGFB1-ITGB6, TNC-ITGAV, TGFB1-ITGB1 and TGFB1-TGFBR1/TGFBR2. TGF-β1 is a master regulator of ECM accumulation and a key driver of lung fibrosis (Fernandez and Eickelberg, 2012; Yue ). Integrins αVβ3, αVβ5, αVβ8 and αVβ6 play vital roles in TGF-β activation in fibrotic disorders (Dong ; Fernandez and Eickelberg, 2012; Yue ). Expression of TNC, an extracellular matrix protein, is significantly upregulated in fibrotic lungs; it is induced by TGFB1 and contributes to TGF-β-mediated lung fibrosis (Estany ). One recent study found that sustained elevated mechanical tension, the most common driver of lung fibrosis, activates a TGF-β signaling loop in AT2 cells and then leads to the periphery-to-center progression of lung fibrosis (Wu ). This is consistent with our results, which revealed strong TGF-β activation between AT2 cells and other cell types. The AVP dataset involved seven cell types: endothelial cells, fibroblasts, lymphatic endothelial cells, macrophages, myoepithelial cells (MEP), smooth muscle cells and T cells. scLR detected 34 dysregulated ligand–receptor interactions in AVP patients compared to control samples (adjusted P-value < 0.01) (Supplementary Table S5). Endothelial cells and fibroblasts participate in the largest number of changed ligand–receptor interactions.
The most upregulated interactions are mainly related to ECM organization, such as APP-LRP1 and HSPG2-LRP1 between fibroblasts and endothelial cells, autocrine A2M-LRP1 in endothelial cells and autocrine ECM1-CACHD1 in fibroblasts. LRP1 regulates ECM remodeling (Gaultier ), while its partners A2M and APP are both involved in ECM organization. Heparan sulfate proteoglycan 2 (HSPG2), also known as perlecan, is a large multi-domain extracellular matrix proteoglycan. ECM1, extracellular matrix protein 1, is known to be upregulated in pelvic organ prolapse (POP) (Cecati ). This agrees with previous findings that POP is an acquired disorder of the extracellular matrix (Budatha ).

4 Discussion

We presented scLR, a statistical method for differential analysis of ligand–receptor interactions in single-cell transcriptomics data. scLR models the distribution of the product of ligand and receptor expressions and takes inter-sample variances, small sample sizes and dropout events into account. Overall, scLR achieved better performance than the other methods in the four simulation settings, especially when the ligand and receptor are both dysregulated or have complementary alterations. Applied to single-cell RNA-seq datasets from severe SARS-CoV-2 infection, scLR revealed important dysregulated interactions between macrophages and proliferating T cells that lead to persistent infection. Moreover, scLR discovered activated TGF-β signaling from alveolar type II cells contributing to the pathogenesis of pulmonary fibrosis. scLR is designed for single-cell RNA-seq datasets with small sample sizes; it assesses sample variances by shrinking the estimated variances toward a pooled estimate using a Bayesian method. It can work for data with only one sample in either of the two conditions, under the additional assumption that the variances of ligand/receptor expressions in the two conditions are the same (Smyth, 2004). When there are enough samples, model-free methods such as permutation might be better for estimating the background distribution of the expression products of ligands and receptors than assuming bivariate normal distributions. However, it is difficult to determine exactly how many samples are sufficient. The required sample size depends on the number of permutations needed to generate an accurate P-value, whereas the minimum number of permutations depends on the total number of ligand–receptor pairs and pairwise cell type combinations. The current version of scLR only compares ligand–receptor interactions between two conditions.
If there are batch effects, they should be removed by single-cell RNA-seq batch correction methods (Tran et al., 2020) before scLR is applied. We plan to extend scLR to experiments with complex designs, which would broaden its applicability and also remove potential biases from confounders such as batch effects. Moreover, scLR only models the interactions between ligands and receptors without considering other stimulatory and inhibitory cofactors. Combining ligands, receptors and cofactors would require a new modeling framework. Identifying dysregulated ligand–receptor interactions from single-cell RNA-seq alone can introduce false-positive results, since cells only communicate within a certain distance. Integrating single-cell RNA-seq with spatial transcriptomics would further narrow down the important communications between cells.
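The remark above about permutation tests and sample size can be made concrete with a short calculation. With $n_A$ and $n_B$ samples there are only $\binom{n_A+n_B}{n_A}$ distinct ways to relabel the samples, so an exact one-sided permutation P-value cannot fall below the reciprocal of that number. The sketch below is an illustration of this general bound (the function name is hypothetical), evaluated at the sample sizes mentioned in this paper.

```python
from math import comb

def min_permutation_p(n_a, n_b):
    """Smallest one-sided P-value attainable by exact sample-label permutation."""
    return 1 / comb(n_a + n_b, n_a)

designs = {
    "simulation (3 vs 3)": (3, 3),
    "COVID-19 (5 vs 2)": (5, 2),
    "AVP (16 vs 5)": (16, 5),
    "PF (11 vs 9)": (11, 9),
}
floors = {name: min_permutation_p(n_a, n_b) for name, (n_a, n_b) in designs.items()}
```

With three samples per condition the floor is 1/20 = 0.05, which cannot survive multiple-testing correction over thousands of ligand–receptor pairs and cell-type combinations; this is the situation in which a parametric null such as the one used by scLR remains usable.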

Data availability

Data from COVID-19, pulmonary fibrosis and anterior vaginal prolapse can be accessed through the Gene Expression Omnibus (GEO) with accession numbers GSE155249, GSE135893, and GSE151202 respectively.

Funding

This work was supported by the National Cancer Institute [U2C CA233291 and U54 CA217450]; National Institutes of Health [P01 AI139449]; and Cancer Center Support Grant [P30CA068485].

Conflict of Interest: none declared.

32 in total

1. Extracellular matrix proteases contribute to progression of pelvic organ prolapse in mice and humans.

Authors: Madhusudhan Budatha; Shayzreen Roshanravan; Qian Zheng; Cecilia Weislander; Shelby L Chapman; Elaine C Davis; Barry Starcher; R Ann Word; Hiromi Yanagisawa
Journal: J Clin Invest Date: 2011-04-25 Impact factor: 14.808

Review 2. The impact of TGF-β on lung fibrosis: from targeting to biomarkers.

Authors: Isis E Fernandez; Oliver Eickelberg
Journal: Proc Am Thorac Soc Date: 2012-07

3. The landscape of cell-cell communication through single-cell transcriptomics.

Authors: Axel A Almet; Zixuan Cang; Suoqin Jin; Qing Nie
Journal: Curr Opin Syst Biol Date: 2021-03-26

4. A single-cell map of intratumoral changes during anti-PD1 treatment of patients with breast cancer.

Authors: Ayse Bassez; Hanne Vos; Laurien Van Dyck; Giuseppe Floris; Ingrid Arijs; Christine Desmedt; Bram Boeckx; Marlies Vanden Bempt; Ines Nevelsteen; Kathleen Lambein; Kevin Punie; Patrick Neven; Abhishek D Garg; Hans Wildiers; Junbin Qian; Ann Smeets; Diether Lambrechts
Journal: Nat Med Date: 2021-05-06 Impact factor: 53.440

5. Force interacts with macromolecular structure in activation of TGF-β.

Authors: Xianchi Dong; Bo Zhao; Roxana E Iacob; Jianghai Zhu; Adem C Koksal; Chafen Lu; John R Engen; Timothy A Springer
Journal: Nature Date: 2017-01-25 Impact factor: 49.962

6. LRP1 regulates remodeling of the extracellular matrix by fibroblasts.

Authors: Alban Gaultier; Margaret Hollister; Irene Reynolds; En-hui Hsieh; Steven L Gonias
Journal: Matrix Biol Date: 2009-08-20 Impact factor: 11.583

7. Progressive Pulmonary Fibrosis Is Caused by Elevated Mechanical Tension on Alveolar Stem Cells.

Authors: Huijuan Wu; Yuanyuan Yu; Huanwei Huang; Yucheng Hu; Siling Fu; Zheng Wang; Mengting Shi; Xi Zhao; Jie Yuan; Jiao Li; Xueyi Yang; Ennan Bin; Dong Wei; Hongbin Zhang; Jin Zhang; Chun Yang; Tao Cai; Huaping Dai; Jingyu Chen; Nan Tang
Journal: Cell Date: 2019-12-19 Impact factor: 41.582