Literature DB >> 25938587

Consensus comparative analysis of human embryonic stem cell-derived cardiomyocytes.

Shaohong Zhang1, Ellen Poon2, Dongqing Xie1, Kenneth R Boheler2, Ronald A Li2, Hau-San Wong3.   

Abstract

Global transcriptional analyses have been performed with human embryonic stem cells (hESC) derived cardiomyocytes (CMs) to identify molecules and pathways important for human CM differentiation, but variations in culture and profiling conditions have led to greatly divergent results among different studies. Consensus investigation to identify genes and gene sets enriched in multiple studies is important for revealing differential gene expression intrinsic to human CM differentiation independent of the above variables, but reliable methods of conducting such comparison are lacking. We examined differential gene expression between hESC and hESC-CMs from multiple microarray studies. For single gene analysis, we identified genes that were expressed at increased levels in hESC-CMs in seven datasets and which have not been previously highlighted. For gene set analysis, we developed a new algorithm, consensus comparative analysis (CSSCMP), capable of evaluating enrichment of gene sets from heterogeneous data sources. Based on both theoretical analysis and experimental validation, CSSCMP is more efficient and less susceptible to experimental variations than traditional methods. We applied CSSCMP to hESC-CM microarray data and revealed novel gene set enrichment (e.g., glucocorticoid stimulus), and also identified genes that might mediate this response. Our results provide important molecular information intrinsic to hESC-CM differentiation. Data and Matlab codes can be downloaded from S1 Data.

Entities:  

Mesh:

Year:  2015        PMID: 25938587      PMCID: PMC4418815          DOI: 10.1371/journal.pone.0125442

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Human embryonic stem cells (hESC) self-renew; their differentiation to the cardiac lineage represents a potentially unlimited source of cardiomyocytes (CMs) for therapies and as experimental models to investigate mechanisms involved in human cardiac development and for disease progression. A genome-wide characterization of the molecular phenotype of hESC-CMs is important for these applications. Microarray experiments have been performed by various groups and have shown that hESC-CMs expressed contractile genes, transcription factors, potassium channels and Ca handling genes that are commonly found in the heart [1-7]. In spite of this, there is remarkable divergence among these studies. [4] evaluated their list of 1311 genes upregulated in CM with those presented by [1], and by [2] and showed that only 33 genes were shared by all three studies. This divergence may be attributed to a number of experimental variables such as hESC strains, differentiation conditions, culture duration and microarray platform/thresholds used (Table 1). For instance, [3] and [2] generated CMs by non-directed, spontaneous embryoid body formation without addition of growth factors, while [7] performed stage-specific addition of growth factors including bFGF, BMP4, activin etc. to direct cardiac differentiation. In addition, 4 different hESC lines were used in the 6 studies; it has been shown that different hESC lines have predetermined preferences to become ventricular, atrial and pacemaker CMs with different electrophysiological properties [8]. It is to be expected that the variables described above would impact the transcriptome of CMs generated. A consensus comparative analysis from multiple studies would thus be invaluable to distinguish between factors/pathways crucial for CM differentiation and those that are reflections of experimental conditions.
Table 1

Summary of related studies and filtered differential expression conditions.

StudyStrainDifferentiation ProtocolAgeIsolation% YieldMethod and Parameter#Sig. Gene
[1]HES2END2 co-culture, no FBS12dMechanical dissectionUncertainp < 0.01; FC≥22608
[2]SA002Spontaneous EB formation< 22dMechanical dissectionUncertainSAM, FDR < 0.04;FC≥5530
[3]H9Spontaneous EB formation14dPercoll gradient40–45%FC≥25504
[4]HES3CM differentiation with transgenic line21dAntibiotic selection> 99%FDR < 0.05; FC≥21311
[5]SA002Spontaneous EB formation21dMechanical dissectionUncertainFC≥21781
[5]SA002Spontaneous EB formation49dMechanical dissectionUncertainFC≥21883
[7]HES2Directed differentiation with growth factor21dTransduction and sorting> 99%FC≥22896
Gene set analysis is more effective than single gene analysis in identifying consensus expression patterns across different data sets in general, and represents a recent and successful analysis tool family commonly adopted in bioinformatics studies [9-12]. These tools usually adopt a complete data matrix or a large ordered gene list as the input, and assess statistical significance based on multiple random permutations. While some groups have made their data matrices publicly available, most microarray papers involving hESC-CMs only provide lists of differentially expressed genes [1, 2, 4]. It is therefore difficult to perform gene set comparison across multiple studies using heterogeneous data sources (including both data matrices and lists of differentially expressed genes). In view of these challenges, we devised a novel algorithm to identify gene sets that are enriched in hESC-CMs relative to hESC in multiple studies. We showed that our new algorithm has improved properties compared to traditional methods, and we identified differential expression changes in gene sets that have not been previously reported.

Contribution of this paper

We are the first to perform consensus comparative investigation of hESC-CMs to identify genes/gene sets upregulated in hESC-CMs independent of experimental conditions. We have identified novel enrichment of genes and gene sets in hESC-CMs, and our results provide valuable information about the molecular program that is active in hESC-CMs. The main computational contribution of our work is the proposal of a new gene set analysis method, i.e., consensus comparative analysis (CSSCMP), to identify commonly enriched gene sets across multiple studies based on lists of differentially expressed genes (without data matrices). From both theoretical analysis and experimental validation, we show that our CSSCMP method has a number of desirable properties: (a) Capability to detect randomness in the input. (b) Improvement of efficiency through relaxing the condition of using a large number of random permutations commonly adopted by traditional gene set based analysis methods [9]. (c) Mitigation of the problem of gene set size dependence. (d) Integration of information from multiple heterogeneous data sources for improved analysis.

Related work

Transcriptomic profiling studies have been performed to characterize hESC-CMs and to identify gene regulatory mechanisms that control the differentiation of hESCs into CMs [1-6]. [1] assessed time-dependent gene expression patterns of hESCs differentiating towards CMs. [2] then identified genes and pathways that were upregulated in hESC-CM clusters compared to undifferentiated hESCs. [3] and [4] used CMs of higher purities (30–40%, > 99% respectively) to compare the transcriptome of CMs with hESC and fetal heart cells, while [7] compared ventricular hESC-CMs with fetal and adult CMs of the same lineage. A later study by [5] focused on the expression of ion channel and Ca 2+-handling genes in hESC-CM clusters. Most of the hESC-CM studies only provided lists of significantly differentially expressed genes [1, 2, 4], rather than the complete gene expression datasets. [4] and Synnergren et al. examined genes commonly upregulated in 2–4 studies, but gene set analysis has not been performed [13]. Gene set analysis methods are more effective in the search for consensus results than single gene analysis methods [9-12]. These tools can roughly be divided into two categories: (1) microarray data based methods, which in general access the full data matrices. Representative examples include GSEA [9], SAFE [14], SAM-GS [15];(2) significant gene list based methods, which utilize lists of significantly differentially expressed genes as input. Representative examples include DAVID [10], FuncAssociate [16], WebGestalt [17] and Bingo [18]. However, to our best knowledge, there are no effective tools that can identify differentially expressed gene sets from heterogeneous data sources which include a combination of full data matrices and gene lists with different thresholds for fold changes (FC).

Materials and Methods

Materials

We collected data from microarray studies [1–5, 7] as shown in Table 1. The first six data sets correspond to heterogeneous CM populations with different purity levels, while the last one [7] consists of purified hESC-CMs of the ventricular lineage. In view of the diverse nature of the data sets, our main focus is on non-lineage specific analysis of the gene expression patterns of hESC-CMs, instead of those associated with any particular chamber-specific lineage. For several studies [1, 2, 4], only lists of differentially expressed genes were available, and the corresponding methods and parameters adopted by the original authors are shown in the column ‘Method and Parameter’ in Table 1. For the other studies, FC thresholds were set to 2 in order to provide a uniform basis for comparison. We extracted two related gene set collections, named the general Homo sapiens gene sets on Biological Process (HSBP) and the subset from the work of British Heart Foundation-University College London on Biological Process (UCLBP), respectively, from the Gene Ontology and Gene Ontology Annotation databases, in the well-known.gmt file format. HSBP groups genes based on general biological processes and is most suitable for examining gene functions. UCLBP is composed of genes mainly related to heart development. We used version 1.1.2681 of the file gene ontology.1.0.obo (Time stamp: 06:03:2012 19:30, downloaded from the GO official website at http://www.geneontology.org/ontology/oboformat_1_0/gene ontology.1_0.obo). For the Homo Sapiens annotation file, we used version 1.225 of the file gene_association.goa human (Time stamp: 06:03:2012, downloaded from the Gene Ontology Annotation (UniProt-GOA) Database at ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/gene association.goa human.gz). The UCLBP gene set collection is constructed based on the work of British Heart Foundation-University College London (BHF-UCL) GO teams and their coworkers on the representation of heart development in GO [19], which is accessed from the file ftp://ftp.ebi.ac.uk/pub/databases/GO/GOA/bhf-ucl/gene_association.goa_bhf-ucl.gz. To use up-to-date gene ontology and annotation data, we construct these two gene set collections using a similar method as is adopted for the GSEA official MsigDB C5 gene sets (see http://www.broadinstitute.org/gsea/msigdb/collection_details.jsp#C5). Specifically, only entries associated with the following evidence codes were included: Inferred from Direct Assay (IDA), Inferred from Physical Interaction (IPI), Inferred from Mutant Phenotype (IMP), Inferred from Genetic Interaction (IGI), Inferred from Expression Pattern (IEP), Inferred from Sequence or Structural Similarity (ISS), and Traceable Author Statement (TAS). We removed gene sets with more than 500 genes or fewer than 15, to exclude very broad categories or very narrow ones, as suggested by the GSEA user guide [9]. Specifically, there are 1564 gene sets in the HSBP gene set collection and 966 ones in UCLBP. Note that the HSBP gene set collection is more general since it is related to most of the Homo Sapiens biological processes. On the other hand, the UCLBP gene set collection focuses on heart development [19] and therefore it is more specific than the former one. Through a closer inspection of the two gene set collections, we find that there are 915 gene sets in common in both gene set collections, and therefore UCLBP can roughly be viewed as a subset of HSBP.

Methods

Given D individual studies, we extracted a combined gene set ψ of all N genes in the different studies (The superscript is used to distinguish entities associated with the combined gene set from those of a specific gene set). Specifically, we constructed a N × D overall contingency matrix M with entries m as follows: Given a specific gene set ψ with N genes within the combined gene set ψ , we can extract the corresponding contingency matrix M from the overall contingency matrix. An intuitive method is to compute the counting score (CS) of the lower triangular entries of the matrix L = (M ) M (here the notation (M ) refers to the transpose of the matrix (M )) as follows The counting score reflects both the co-association of different study pairs and the number of up-regulated genes in each study. However, it is notable that it suffers from the problem of producing non-zero values for random contingency matrices, and its dependence on the matrix size, similar to problems well discussed for the Rand index [20] as a cluster validity measure [21-24]. For the j-th study, the estimated probability of a gene to be up-regulated based on the overall contingency matrix can be computed as Thus the expected value of CS corresponding to a N × D random contingency matrix M can be computed as where , are estimated up-regulation probabilities based on the random contingency matrix. The second step follows from the approximation of the expected value based on the estimated probabilities when N is large. We can observe from Eq (4) that the score value is not zero, and is correlated to the size of the gene set N . Motivated by the adjusted Rand index and related clustering measures proposed in [21, 23], which expresses the modified measures in the form of , where S and S represent the original score and the maximum possible score respectively, and represents the expected score value for random inputs, we propose an improved consensus comparative analysis (CSSCMP) score based on the contingency matrix. The maximum score value corresponding to the contingency matrix M can be readily found when all the entries are ones, i.e., all the entries of the matrix (M ) M equal N . Thus, the maximum score value is computed as Therefore, the consensus comparative analysis score can be computed as Compared to conventional gene set analysis methods, such as GSEA [9], our proposed CSSCMP method has a number of advantages in handling imperfect data prevalent in the studies of hESC-CMs: (1) CSSCMP only uses lists of significantly differentially expressed genes, thus it is readily applicable to the analysis of multiple studies since many studies only release their significant gene lists rather than the full microarray data; (2) CSSCMP does not require performing multiple random permutation trials, which is commonly used in traditional methods. In general, these kinds of random permutation trials require significant computation time (e.g., 1000 trials are commonly used in GSEA). As a result, our method improves computational efficiency; (3) The CSSCMP score value is close to zero with random input contingency matrices, which allows our approach to distinguish meaningful inputs from trivial ones. (4) CSSCMP is less sensitive to the size of gene sets, which is also an important concern in traditional gene set analysis methods. Verifications of the last two properties are presented as follows, and confirmed in experiments using both simulated and real data.

Proposition: Detection of randomness

CSSCMP is close to zero for a random input contingency matrix, and is less sensitive to the size of gene sets. The verification is straightforward. For a N × D random input contingency matrix M , when the number of genes in the random gene set N is large enough, we have The third step follows from the approximation of the expected value of CS based on the estimated up-regulation probabilities when N is large. Note that the result is less sensitive to the size of gene set N , since this factor is removed in the third step.

Results

Gene based consensus comparative analysis in hESC-CMs relative to hESCs

We first examined genes commonly upregulated in multiple individual studies. A pyramid chart for statistics of commonly enriched genes in hESC-CMs relative to hESCs in multiple studies is shown in Fig 1. Only a small number (i.e., 53) of genes are enriched in all studies while a large number (i.e., 9431) is found in at least one study. This implies that interpretation of the results at the level of gene sets is important besides the identification of individual genes. In this section we will focus on consensus comparative analysis in hESC-CMs relative to hESCs based on individual genes. Gene set based consensus comparative analysis will be performed in the following section.
Fig 1

Statistics of common genes in hESC-CMs relative to hESCs in multiple studies.

Genes uniformly enriched in hESC-CMs relative to hESCs in all studies

We first focused on genes that were uniformly enriched in hESC-CMs relative to hESCs, as listed in Table 2. Up-regulated genes included those known to be crucial for heart development/function such as transcription factors e.g., MEF2C and GATA4, contractile genes e.g., MYH7 and TNNC1 etc., as well as ion transport genes e.g., ATP2A2 and PLN etc. Interestingly, as shown in Table 3, 40% (22 out of 54) of our upregulated genes were not highlighted by the hESC-CM microarray studies examined.
Table 2

Common genes uniformly overexpressed in hESC-CMs compared to hESCs in all studies.

GENE SYMBOL ENSEMBL GENE ID GENE NAME
MEF2CENSG00000081189myocyte enhancer factor 2C
APOBEC2ENSG00000124701apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 2
COL21A1ENSG00000124749collagen, type XXI, alpha 1
TNNC1ENSG00000114854troponin C type 1 (slow)
IL6STENSG00000134352interleukin 6 signal transducer (gp130, oncostatin M receptor)
RRADENSG00000166592Ras-related associated with diabetes
TTNENSG00000155657titin
SYNPO2LENSG00000166317synaptopodin 2-like
TGFB2ENSG00000092969transforming growth factor, beta 2
EDNRAENSG00000151617endothelin receptor type A
KIFAP3ENSG00000075945kinesin-associated protein 3
GATA4ENSG00000136574GATA binding protein 4
LMOD1ENSG00000163431leiomodin 1 (smooth muscle)
PPP1R14CENSG00000198729protein phosphatase 1, regulatory (inhibitor) subunit 14C
MBENSG00000198125myoglobin
ACTA2ENSG00000107796actin, alpha 2, smooth muscle, aorta
MICAL2ENSG00000133816microtubule associated monoxygenase, calponin and LIM domain containing 2
MYH7ENSG00000092054myosin, heavy chain 7, cardiac muscle, beta
ACTN2ENSG00000077522actinin, alpha 2
MYH6ENSG00000197616myosin, heavy chain 6, cardiac muscle, alpha
FLNCENSG00000128591filamin C, gamma (actin binding protein 280)
TNNT2ENSG00000118194troponin T type 2 (cardiac)
CLIC5ENSG00000112782chloride intracellular channel 5
HSPB7ENSG00000173641heat shock 27kDa protein family, member 7 (cardiovascular)
SMPXENSG00000091482small muscle protein, X-linked
CTSBENSG00000164733cathepsin B
HSPB3ENSG00000169271heat shock 27kDa protein 3
PRNPENSG00000171867prion protein
NPPAENSG00000175206natriuretic peptide precursor A
ZNF436ENSG00000125945zinc finger protein 436
MYL7ENSG00000106631myosin, light chain 7, regulatory
MYL4ENSG00000198336myosin, light chain 4, alkali; atrial, embryonic
MYL3ENSG00000160808myosin, light chain 3, alkali; ventricular, skeletal, slow
COL2A1ENSG00000139219collagen, type II, alpha 1
MEIS1ENSG00000143995Meis homeobox 1
MSX2ENSG00000120149msh homeobox 2
PPP1R3CENSG00000119938protein phosphatase 1, regulatory (inhibitor) subunit 3C
MEIS2ENSG00000134138Meis homeobox 2
AGTENSG00000135744angiotensinogen (serpin peptidase inhibitor, clade A, member 8)
FBN2ENSG00000138829fibrillin 2
HRCENSG00000130528histidine rich calcium binding protein
TBX3ENSG00000135111T-box 3
CRIP2ENSG00000182809cysteine-rich protein 2
CSRP3ENSG00000129170cysteine and glycine-rich protein 3 (cardiac LIM protein)
TNNI1ENSG00000159173troponin I type 1 (skeletal, slow)
RASSF5ENSG00000136653Ras association (RalGDS/AF-6) domain family member 5
CDKN1AENSG00000124762cyclin-dependent kinase inhibitor 1A (p21, Cip1)
ATP2A2ENSG00000174437ATPase, Ca++ transporting, cardiac muscle, slow twitch 2
PKP2ENSG00000057294plakophilin 2
SVILENSG00000197321supervillin
PLNENSG00000198523phospholamban
RGS5ENSG00000143248regulator of G-protein signaling 5
BMP5ENSG00000112175bone morphogenetic protein 5
TMOD1ENSG00000136842tropomodulin 1
Table 3

Genes that are significantly upregulated for hESC-CMs in the seven data sets.

Only genes that have not been highlighted in the individual studies are shown. Genes that are enriched by more than 10-fold in hESC-CMs relative to both hESC and hESC-derived embryoid bodies are shown in bold. Fold changes are based on data from [4], who used purified CMs and who compared hESC-CM gene expression with both hESC and hESC-derived embryoid bodies.

GenesNameCM/ESCM/EB
JPH2 junctophilin 27231
COL21A1collagen, type XXI, alpha 1472
IL6STinterleukin 6 signal transducer (gp130, oncostatin M receptor)1615
KIFAP3kinesin-associated protein 3103
LMOD1leiomodin 1 (smooth muscle)64
MICAL2 microtubule associated monoxygenase, calponin and LIM domain containing 23622
FLNCfilamin C, gamma (actin binding protein 280)53
CTSBcathepsin B33
ZNF436zinc finger protein 436154
MEIS1Meis homeobox 1442
MSX2msh homeobox 21675
PPP1R3C protein phosphatase 1, regulatory (inhibitor) subunit 3C2310
MEIS2Meis homeobox 2442
AGTangiotensinogen (serpin peptidase inhibitor, clade A, member 8)524
RASSF5Ras association (RalGDS/AF-6) domain family member 5316
CDKN1Acyclin-dependent kinase inhibitor 1A (p21, Cip1)34
PKP2plakophilin 2165
SVILsupervillin105
RGS5 regulator of G-protein signaling 59511

Genes that are significantly upregulated for hESC-CMs in the seven data sets.

Only genes that have not been highlighted in the individual studies are shown. Genes that are enriched by more than 10-fold in hESC-CMs relative to both hESC and hESC-derived embryoid bodies are shown in bold. Fold changes are based on data from [4], who used purified CMs and who compared hESC-CM gene expression with both hESC and hESC-derived embryoid bodies. These genes were commonly upregulated in hESC-CMs independent of culture condition and differentiation protocol, and are likely to be important for early human CM differentiation. Of these, 6 genes show CM-specific expression and were 10-fold enriched in hESC-CMs relative to both undifferentiated hESC and mixed embryoid bodies culture [4]. This list included transcripts which are known to be important in the heart and whose presence in hESC-CMs has not been reported. JPN2 and RGS5 are such examples [25-27], and are likely to be important for controlling calcium handling and cardiac repolarization in hESC-CMs. In addition, we identified upregulation of genes with unknown roles in the heart, and they are MICAL2 and CPNE5. Both genes are strongly upregulated in hESC-CMs e.g. MICAL2 expression is 36 and 22 fold higher in hESC-CMs compared to hESCs and embryoid bodies respectively. In addition, both genes were also enriched by more than 8 fold in human fetal and adult CMs relative to hESCs and embryoid bodies. MICAL2 is a cytoskeletal protein involved in adhesion and actin polymerization [28]. CPNE5 encodes a poorly characterized Ca binding membrane protein [29]. It is unclear what roles they play in heart development and function and merits further attention.

Statistics of literature-curated marker genes of important biological processes in multiple studies

We next examined the enrichment of selected genes known to be important for cardiac functions such as contractile genes, cardiac transcription factors, Ca 2+ handling genes, and ion channels, and found significant variation among individual studies, as shown in Fig 2. Gene sets associated with heart development, contraction, Ca 2+ handling were uniformly upregulated in all studies examined, but the upregulation of genes within these gene sets were not uniform. 6 of the 9 contractile genes studied were enriched in all of the studies. By contrast, none of the ion channel genes were enriched in all seven studies. For instance, KCNE1 and KCNQ1 were enriched in only 3 out of 7 studies. Thus, in order to observe the enrichment of gene groups corresponding to different biological processes and functions, gene set analysis is required as an important complementary approach besides individual gene analysis.
Fig 2

Enrichment of representative genes of biological processes closely associated with heart development and function.

Enriched biological process categories for uniformly enriched genes in different numbers of studies

To further evaluate variability in gene- and gene-set based methods, we next assessed the gene ontology affiliations of genes uniformly enriched in multiple studies. Specifically, we identified genes that were significantly enriched in CMs in at least 7, 6, 5 and 4 studies respectively, and extracted their Gene Ontology annotation as shown in Fig 3. The top BP categories were development, morphogenesis, cell communication, metabolism, and signal transduction. The GO enrichment pattern was largely conserved irrespective of the number of studies used. This shows that gene set based analysis is less sensitive to variations in different studies than gene based analysis. The results also confirmed that the 7 studies were closely related.
Fig 3

Enriched GO BP categories for uniformly enriched genes in different numbers of studies: (a) 7 studies, (b) 6 studies; (c) 5 studies; (d) 4 studies.

Gene set based consensus comparative analysis on hESC-CMs

In this section, we verified the properties of our method on both simulated data and real data. Then, we presented the results of CSSCMP on hESC-CMs.

Verification of the properties of consensus comparative analysis

We first investigated our CSSCMP analysis method using random contingency matrices of 20 individual studies with various gene set sizes. As shown in Fig 4A and 4B, the CS score heavily depended on gene set sizes, while CSSCMP scores were consistently small and insensitive to the sizes. These observations suggested that CSSCMP was capable of detecting randomness and was more robust against the effect of gene set sizes, both of which were in agreement with the proposition in Eq (7).
Fig 4

Verification of the properties of consensus comparative analysis compared with random data.

(a) Plots of CS scores based on random contingency matrices of 20 individual studies with various gene set sizes; (b) Plots of CS scores based on random contingency matrices of 20 individual studies with various gene set sizes; (c) Mean of CSSCMP scores of top HSBP gene sets compared with the hESC-CM data and random data with different levels of variance; (d) Mean of CSSCMP scores of top UCLBP gene sets compared with the hESC-CM data and random data with different levels of variance. The CS score heavily depends on the gene set sizes, while CSSCMP scores are insensitive to the size of gene sets and consistently small under random data with different levels of variance.

Verification of the properties of consensus comparative analysis compared with random data.

(a) Plots of CS scores based on random contingency matrices of 20 individual studies with various gene set sizes; (b) Plots of CS scores based on random contingency matrices of 20 individual studies with various gene set sizes; (c) Mean of CSSCMP scores of top HSBP gene sets compared with the hESC-CM data and random data with different levels of variance; (d) Mean of CSSCMP scores of top UCLBP gene sets compared with the hESC-CM data and random data with different levels of variance. The CS score heavily depends on the gene set sizes, while CSSCMP scores are insensitive to the size of gene sets and consistently small under random data with different levels of variance. We next studied the CSSCMP scores with two extracted gene set collections (UCLBP and HSBP) on the hESC-CM data and on random data with different levels of variances. Specifically, we computed the CSSCMP scores with the hESC-CM data and random data with different levels of variance respectively, and computed the mean values of the top 10%, 20%, 30%, 40%, and 50% ones. These results are shown in Fig 4C and 4D, respectively. We have observed from the new experiments that both HSBP/UCLBP have much higher CSSCMP scores than those of random data. For example, the mean CSSCMP values of the top 10% gene sets are 0.22 and 0.25 for HSBP and UCLBP respectively, compared to a value of 0.03 for the random data. In general, the mean scores of random data with different levels of variance are close and significantly smaller. These results suggest that our CSSCMP score can readily separate meaningful data from random data, regardless of their variance. In addition, we ranked the gene sets in descending order of the corresponding scores, and plotted the sizes of these gene sets, as shown in Fig 5. For CS, top ranks were associated with high gene set sizes, while no such association existed for CSSCMP. These observations confirmed that CSSCMP was insensitive to the size of gene sets.
Fig 5

Plots of gene set sizes for two extracted gene set collections in descending order of different scores: (a)UCLBP:CS, (b)UCLBP:CSSCMP, (c)HSBP:CS, (d)HSBP:CSSCMP.

The CS score heavily depends on the gene set sizes, while CSSCMP scores are insensitive to the sizes.

Plots of gene set sizes for two extracted gene set collections in descending order of different scores: (a)UCLBP:CS, (b)UCLBP:CSSCMP, (c)HSBP:CS, (d)HSBP:CSSCMP.

The CS score heavily depends on the gene set sizes, while CSSCMP scores are insensitive to the sizes.

Gene set based consensus comparative analysis on hESC-CMs

Based on 7 datasets from 6 individual hESC-CM studies, we ordered the UCLBP and HSBP gene sets according to their CSSCMP scores to assess enrichment of gene sets in hESC-CMs relative to hESCs. For comparison, we also employed gene set enrichment analysis (GSEA) [9] on four studies with full data matrices to identify significantly enriched gene sets for each individual study [3, 5, 7]. For the HSBP gene set collection, the top 100 enriched gene sets are listed in Table 4, and details are provided in S1 and S2 Tables. Our CSSCMP method generated results largely consistent with those of GSEA. For instance, the 17 gene sets with the top CSSCMP scores were considered enriched in all four individual studies by GSEA (S1 Table). Conversely, none of the gene sets with the lowest 267 CSSCMP scores (under 0.014989) were considered enriched by GSEA in any of the four individual datasets (S2 Table). Moreover, gene sets with the largest CSSCMP scores included those known to be important for cardiac differentiation and function, e.g., ventricular-cardiac-muscle-tissue-morphogenesis (GO:0055010), myofibril-assembly (GO:0030239) and cardiac- muscle-tissue-morphogenesis (GO:0055008). This indicated that a number of gene sets were uniformly enriched in hESC-CMs relative to hESCs regardless of diverse experimental conditions, and our CSSCMP method generated results that were in accordance to those generated by GSEA and were biologically relevant.
Table 4

The top 100 enriched gene sets in the HSBP gene set collection identified by CSSCMP scores, with comparison to the GSEA results in four studies: study S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7].

The ’1’s under the four studies mean that the corresponding gene sets are enriched, while ’0’s mean not enriched. Gene set names are represented with GO term IDs. Details can be found in supplementary files.

Gene Set NameCSSCMP ScorePValueS1S2S3S4
GO:00550100.5433838.65E-491111
GO:00302390.4892381.14E-381111
GO:00550080.4882074.81E-541111
GO:00486440.4681824.12E-541111
GO:00604150.4681825.10E-471111
GO:00600470.4563011.35E-291111
GO:00069410.4491981.63E-441111
GO:00300490.4304823.83E-511111
GO:00332750.4304822.31E-501111
GO:00032080.4214295.94E-481111
GO:00310320.4209561.74E-331111
GO:00487380.4159634.86E-611111
GO:00550070.4114746.82E-301111
GO:00434620.3915211.16E-251111
GO:00350510.3471743.50E-331111
GO:00300480.3384975.50E-391111
GO:00550020.3279733.91E-291111
GO:00481460.3201884.09E-190001
GO:00604110.3045468.66E-120100
GO:00080160.3045466.78E-311111
GO:00605370.2949461.36E-521111
GO:00032810.2901071.07E-111001
GO:00550010.286673.03E-251111
GO:00435880.2852951.02E-100110
GO:00032790.2717981.26E-130101
GO:00105740.2690511.95E-101111
GO:00487470.2660432.92E-111111
GO:00030070.2615147.47E-371111
GO:00328350.261232.16E-151101
GO:00301980.2600279.77E-221111
GO:00430620.2600273.02E-211111
GO:00069420.2582222.36E-100111
GO:00075170.257733.50E-481111
GO:00107180.2571789.58E-110111
GO:00069370.2569464.92E-200111
GO:00485460.2547195.10E-090000
GO:00456690.2525071.00E-110101
GO:20003790.2510031.16E-100001
GO:00069360.2455634.83E-571111
GO:00344460.2374672.30E-070000
GO:00457780.2364267.10E-121111
GO:00426980.2334861.01E-071001
GO:00487420.2333161.12E-101101
GO:00712770.2292392.53E-080001
GO:00019370.2286811.93E-081111
GO:00030120.2285329.85E-501111
GO:00610610.2207088.53E-521111
GO:00107170.2205474.80E-110101
GO:00511530.215912.33E-121111
GO:00149100.2143761.18E-060111
GO:00513840.2126515.75E-060110
GO:00031430.2121613.24E-081111
GO:00305010.2108057.57E-081101
GO:00485660.2098941.37E-060000
GO:00480100.2096831.72E-060111
GO:00515920.2095661.67E-140111
GO:00075070.207284.53E-401111
GO:00487060.2058832.00E-090110
GO:00485650.2048521.39E-110001
GO:00105950.2029564.50E-110001
GO:00487050.1998675.69E-110000
GO:00481450.1975533.36E-110001
GO:00426920.1954637.82E-241111
GO:00350500.1947761.33E-070101
GO:00105940.1901323.85E-130001
GO:00458840.1900918.39E-050100
GO:00714070.1900916.78E-051111
GO:00319600.189842.93E-050110
GO:00603490.1887853.62E-060000
GO:00162020.1886941.14E-100101
GO:00019470.1878353.93E-060101
GO:00025760.1853841.10E-191111
GO:00031580.1818193.36E-051001
GO:00604850.1811851.49E-120001
GO:00020620.18011.28E-050101
GO:00080150.180065.05E-251111
GO:00303240.1788112.94E-061001
GO:00512790.1788111.28E-040001
GO:00715600.1773866.37E-051111
GO:00508800.1770063.62E-040001
GO:00105960.1758031.53E-040011
GO:00030730.1754275.00E-071110
GO:00018370.1751344.00E-060001
GO:00506800.1743959.11E-100001
GO:00305000.1737975.93E-080101
GO:00303230.1735213.15E-061001
GO:00512160.1733085.39E-090001
GO:00019360.1731291.18E-100111
GO:00613710.1727952.47E-050101
GO:00603480.1708184.62E-070000
GO:20003770.1707624.32E-070000
GO:00031510.1697871.94E-050000
GO:00016490.167384.68E-060001
GO:00098870.1664933.73E-401111
GO:00507290.1656221.88E-051011
GO:00091870.1653852.98E-080111
GO:00303260.1642334.52E-050001
GO:00606880.1640875.10E-040000
GO:00069390.1635092.56E-040000
GO:00069700.1630191.87E-030000

The top 100 enriched gene sets in the HSBP gene set collection identified by CSSCMP scores, with comparison to the GSEA results in four studies: study S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7].

The ’1’s under the four studies mean that the corresponding gene sets are enriched, while ’0’s mean not enriched. Gene set names are represented with GO term IDs. Details can be found in supplementary files. In addition, CSSCMP identified enrichment of potentially important gene sets that were not detected by GSEA based on individual studies. Examples included positive-regulation-of-reactive-oxygen-species-metabolic-process (GO:2000379) and response-to-glucocortic-stimulus (GO:0051384) etc., as shown in Tables 5 and 6. Reactive oxygen species is important for cardiac differentiation of hESCs [30] but surprisingly it was not considered enriched by GSEA in any of the individual studies. Response-to-glucocorticoid- stimulus (GO:0051384) was enriched in two out of four datasets, but its role in hESC cardiac differentiation is unclear and requires further attention. Examination of genes associated with glucocorticoid-stimulus showed that ADAM9, AQP1, GOT1, ISL1, SLIT2 and SLIT3 were significantly upregulated in more than 4 of the 7 datasets and may be responsible for mediating the effect of this stimulus. Moreover, false positive results can arise from individual studies partly as a reflection of the specific conditions used in the experiment, and may not represent the biological entities examined. For instance, GSEA examination of the Cao et al. data set showed that genes involved in complement-activation (GO:0006956) was very significantly enriched in hESC-CMs. However, CSSCMP of four datasets showed that this gene set was only enriched in this single study, with a CSSCMP rank of 1349 and score of 0.012251, which is non-significant. By considering multiple datasets, false positives related to specific biological conditions may be reduced. More detailed results of different ranking between CSSCMP and results on GSEA with individual studies are listed in S3 Table.
Table 5

Selected gene sets enriched in hESC-CMs relative to hESCs.

CSSCMP score and PVal were calculated based on seven data sets. For comparison, GSEA results in four studies are shown: study S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7]. The ’1’s under the four studies mean that the corresponding gene sets are enriched, while ’0’s mean not enriched. Details can be found in the supplementary files.

Gene Set NameScorePValS1S2S3S4
response-to-glucocorticoid-stimulus0.215.7e-0060110
positive-regulation-of-reactive-oxygen-species-metabolic-process0.251.2e-0100001
Table 6

Genes affiliated with “response-to-glucocorticoid-stimulus”.

Fold changes in all seven data sets are also shown. S1-4 are defined as in (A). S5:hESC-CM cluster (12 days) [1], S6:hESC-CM cluster (within 22 days) [2], S7:purified hESC-CM (22 days) [4]. Non-significant changes are shown as ’0’.

TargetIDS1S2S3S4S5S6S7
ADAM94.13.65.903.600
AQP112.011.414.22.710420
CALCR0000000
GHRHR0000000
GOT102.73.83002.8
IL100000000
IL1RN0000000
IL6002.43.9000
ISL18.62.702.87.8430
NR3C100000039.3
PDCD70000000
REST0000000
SLIT221.36.292.8000
SLIT316.111.41333.512.6390
TNF0002.2000
UBE2L30000000

Selected gene sets enriched in hESC-CMs relative to hESCs.

CSSCMP score and PVal were calculated based on seven data sets. For comparison, GSEA results in four studies are shown: study S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7]. The ’1’s under the four studies mean that the corresponding gene sets are enriched, while ’0’s mean not enriched. Details can be found in the supplementary files.

Genes affiliated with “response-to-glucocorticoid-stimulus”.

Fold changes in all seven data sets are also shown. S1-4 are defined as in (A). S5:hESC-CM cluster (12 days) [1], S6:hESC-CM cluster (within 22 days) [2], S7:purified hESC-CM (22 days) [4]. Non-significant changes are shown as ’0’. To further compare the performance of CSSCMP and GSEA, we also identified unrelated gene sets that bear no obvious relationship to heart development and function (e.g. skin development, platelet degranulation) among the top gene sets identified by CSSCMP with those identified by GSEA of four individual studies (see S4 Table). Fig 6 shows the number of unrelated gene sets among the top 40 ones in each study. CSSCMP identified the smallest number of unrelated gene sets compared to GSEA of the individual studies. Importantly, these unrelated gene sets identified by GSEA reflect the purity and biological properties of the samples used in the individual studies. The samples used in Cao et al. [3] (besides Poon et al. [7]) have the highest purity and this study has a smaller number of unrelated gene sets than Jane et al. [5]. Poon et al. [7] used lentiviral selection to isolate hESC-VCMs, and consequently have a large number of gene sets related to inflammation, such as positive-regulation-of-cytokine-production(GO:0001819), among its top 40 gene sets. In conclusion, these show that our CSSCMP is superior in its ability to avoid false positive gene sets and are less sensitive to sample heterogeneity.
Fig 6

Proportion of unrelated HSBP gene sets identified by CSSCMP and GSEA with individual studies: S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7].

CSSCMP identified the smallest proportion of unrelated gene sets.

Proportion of unrelated HSBP gene sets identified by CSSCMP and GSEA with individual studies: S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7].

CSSCMP identified the smallest proportion of unrelated gene sets. For the UCLBP gene set collection, the top 100 sets are listed in Table 7, and details are provided in S5 Table. Scores of the members of the whole gene set list are provided in S6 Table. Analysis with the UCLBP generated similar results as the HSBP gene set collection. As shown in Fig 7, a large proportion of the top gene sets were the same in both gene set collections, and this proportion was significantly larger than the average ratio for the two complete gene set collections (i.e., 915/1564, plotted as a horizontal line). Specifically, 9 of the top 10 gene sets for two gene set collections were the same. This showed that most of the enriched gene sets of HSBP came from UCLBP, which is highly related to biological processes associated with heart development.
Table 7

The top 100 enriched gene sets in the UCLBP gene set collection identified by CSSCMP scores, with comparison to the GSEA results in four studies: study S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7].

The ’1’s under the four studies mean that the corresponding gene sets are enriched, while ’0’s mean not enriched. Gene set names are represented with GO term IDs. Details can be found in supplementary files.

Gene Set NameCSSCMP ScorePValueS1S2S3S4
GO:00550100.5377437.50E-281111
GO:00550080.480291.85E-271111
GO:00302390.476236.75E-191111
GO:00310320.476231.51E-171111
GO:00486440.4594292.37E-281111
GO:00604150.4594299.70E-251111
GO:00600470.4495851.66E-151111
GO:00069410.4423954.89E-241111
GO:00300490.4299442.57E-271111
GO:00332750.4299442.64E-281111
GO:00032080.4244622.70E-231111
GO:00550070.4171546.84E-171111
GO:00487380.4121497.38E-311111
GO:00300480.3980174.71E-241111
GO:00350510.352521.30E-161111
GO:00328350.3503638.60E-101111
GO:00080160.3404151.35E-191111
GO:00481460.3283222.83E-101001
GO:00301980.3148361.50E-121111
GO:00430620.3148367.10E-141111
GO:00515920.2983923.85E-110111
GO:00604110.2959564.71E-060100
GO:00550010.2904751.63E-111111
GO:00605370.289178.08E-241111
GO:00456690.2782941.92E-060111
GO:00487470.2782942.18E-051110
GO:20003790.2711884.41E-060001
GO:00457780.2674971.57E-061111
GO:00487420.2672352.25E-061111
GO:00069420.2633891.16E-051111
GO:00030070.2607641.82E-171111
GO:00069370.2607261.21E-100111
GO:00511530.2600231.19E-071111
GO:00032790.2591922.01E-061101
GO:00426980.2545422.62E-041101
GO:00075170.2532396.71E-241111
GO:00487060.2508875.46E-050111
GO:00069360.2458041.19E-271111
GO:00300290.2365881.63E-161111
GO:00481450.2338762.57E-060001
GO:00610610.2310613.45E-251111
GO:00305010.2306931.91E-041111
GO:00315890.228447.70E-071111
GO:00308550.2266685.74E-081111
GO:00162020.2248651.75E-060111
GO:00031430.2234811.10E-041111
GO:00487050.2225369.56E-060000
GO:00025760.2194841.70E-101111
GO:00085440.2188112.53E-071111
GO:00513840.2179991.79E-030110
GO:00426920.217234.88E-121111
GO:20003770.2157519.23E-050000
GO:00075070.213058.17E-181111
GO:00485650.2128233.06E-040001
GO:00149100.2120612.74E-030111
GO:00350500.2110233.76E-041111
GO:00071600.2082551.13E-041111
GO:00098870.207822.23E-201111
GO:00098880.2028412.64E-281111
GO:00107170.2027187.85E-040001
GO:00303260.2015561.91E-030001
GO:00105950.2007898.13E-050001
GO:00030730.1988851.09E-031111
GO:00019470.1960742.69E-031101
GO:00303240.1940444.45E-031000
GO:00031580.193791.09E-021001
GO:00604850.1925941.92E-050101
GO:00512160.1904529.34E-040001
GO:00480100.1887669.72E-030111
GO:00603930.1887668.48E-030001
GO:00430090.1859248.15E-060111
GO:00305000.1856341.53E-031111
GO:00712410.1854162.74E-030110
GO:00303230.1845357.77E-031000
GO:00300360.1842868.25E-071111
GO:00080150.1837761.40E-091111
GO:00215370.1830238.38E-041111
GO:00068870.1827861.68E-071110
GO:00720730.1823712.58E-020000
GO:00018220.1818637.88E-061111
GO:00507770.1810271.07E-021111
GO:00071790.1794648.63E-040101
GO:00105940.1786341.27E-040001
GO:00031510.1778037.45E-030000
GO:00485920.1778036.83E-031110
GO:00613710.1769338.52E-030101
GO:00456670.1751931.11E-030101
GO:00424760.1741491.26E-020110
GO:00485980.1734761.13E-060001
GO:00605410.1734531.25E-021000
GO:00506800.1712787.10E-030001
GO:00000790.1703597.84E-031111
GO:00720010.1702432.50E-051111
GO:00353030.1694981.11E-020101
GO:00219530.1692774.56E-020110
GO:00302780.1690061.52E-041101
GO:00016540.1686687.37E-041101
GO:00486460.1684651.11E-101111
GO:00519240.1679651.07E-041111
GO:00487620.1678852.07E-030101
Fig 7

Proportion of common top gene sets in both gene set collections with respect to those in HSBP.

The average ratio for the two complete gene set collections (915/1564) is plotted as a horizontal line.

Proportion of common top gene sets in both gene set collections with respect to those in HSBP.

The average ratio for the two complete gene set collections (915/1564) is plotted as a horizontal line.

The top 100 enriched gene sets in the UCLBP gene set collection identified by CSSCMP scores, with comparison to the GSEA results in four studies: study S1:purified hESC-CM (14 days) [3], S2: hESC-CM cluster (21 days) [5], S3: hESC-CM cluster (49 days) [5], and S4: purified hESC-derived ventricular (21 days) [7].

The ’1’s under the four studies mean that the corresponding gene sets are enriched, while ’0’s mean not enriched. Gene set names are represented with GO term IDs. Details can be found in supplementary files.

Discussion

Application of hESC-CM for drug discovery and transplantation requires a thorough molecular characterization of these cells, but this is compromised by variations in experimental conditions among different studies. Conventional methods are unsuitable for consensus analysis of hESC-CM microarray data. To bridge this gap, we propose a new consensus comparison analysis approach, CSSCMP, and identified novel changes in genes and gene sets that occur in hESC-CM irrespective of different experimental variables. Based on the consensus information of different individual studies, our proposed CSSCMP approach has a number of advantages: (1) detection of randomness in the input; (2) improvement of efficiency; (3) mitigation of the problem of gene set size dependence; and (4) integration of information from multiple heterogeneous data sources. The current study points to a number of important future research directions from both computational and biological perspectives. From a computational perspective, an interesting improvement of the current approach is to replace the current binary matrix entries with real values for each individual study. From a biological perspective, a potential extension of the present work is to study the interaction between the identified gene sets and microRNAs.

Top 100 enriched gene sets based on CSSCMP scores on HSBP gene set collection.

(XLS) Click here for additional data file.

Enriched gene sets based on CSSCMP scores on HSBP gene set collection.

(XLS) Click here for additional data file.

Detailed results of different ranking between CSSCMP and results on GSEA with individual studies.

(XLS) Click here for additional data file.

Top 40 enriched gene sets identified by CSSCMP and GSEA in individual studies.

Unrelated gene sets are highlighted in red. (XLS) Click here for additional data file.

Top 100 enriched gene sets based on CSSCMP scores on UCLBP gene set collection.

(XLS) Click here for additional data file.

Enriched gene sets based on CSSCMP scores on UCLBP gene set collection.

(XLS) Click here for additional data file.

Data and implementation codes in Matlab.

(ZIP) Click here for additional data file.
  25 in total

1.  Characterizing gene sets with FuncAssociate.

Authors:  Gabriel F Berriz; Oliver D King; Barbara Bryant; Chris Sander; Frederick P Roth
Journal:  Bioinformatics       Date:  2003-12-12       Impact factor: 6.937

Review 2.  Copines: a ubiquitous family of Ca(2+)-dependent phospholipid-binding proteins.

Authors:  J L Tomsig; C E Creutz
Journal:  Cell Mol Life Sci       Date:  2002-09       Impact factor: 9.261

3.  Global transcriptional profiling reveals similarities and differences between human stem cell-derived cardiomyocyte clusters and heart tissue.

Authors:  Jane Synnergren; Caroline Améen; Andreas Jansson; Peter Sartipy
Journal:  Physiol Genomics       Date:  2011-12-13       Impact factor: 3.107

4.  Significance analysis of functional categories in gene expression studies: a structured permutation approach.

Authors:  William T Barry; Andrew B Nobel; Fred A Wright
Journal:  Bioinformatics       Date:  2005-01-12       Impact factor: 6.937

Review 5.  Gene set enrichment analysis: performance evaluation and usage guidelines.

Authors:  Jui-Hung Hung; Tun-Hsiang Yang; Zhenjun Hu; Zhiping Weng; Charles DeLisi
Journal:  Brief Bioinform       Date:  2011-09-07       Impact factor: 11.622

6.  Expression of Crip2, a LIM-domain-only protein, in the mouse cardiovascular system under physiological and pathological conditions.

Authors:  Tzu-Chieh Wei; Hui-Yu Lin; Cheng-Chieh Lu; Chun-Ming Chen; Li-Ru You
Journal:  Gene Expr Patterns       Date:  2011-05-13       Impact factor: 1.224

7.  Role of reactive oxygen species and phosphatidylinositol 3-kinase in cardiomyocyte differentiation of embryonic stem cells.

Authors:  H Sauer; G Rahimi; J Hescheler; M Wartenberg
Journal:  FEBS Lett       Date:  2000-07-07       Impact factor: 4.124

8.  The representation of heart development in the gene ontology.

Authors:  Varsha K Khodiyar; David P Hill; Doug Howe; Tanya Z Berardini; Susan Tweedie; Philippa J Talmud; Ross Breckenridge; Shoumo Bhattarcharya; Paul Riley; Peter Scambler; Ruth C Lovering
Journal:  Dev Biol       Date:  2011-03-17       Impact factor: 3.582

9.  Distinct roles of microRNA-1 and -499 in ventricular specification and functional maturation of human embryonic stem cell-derived cardiomyocytes.

Authors:  Ji-Dong Fu; Stephanie N Rushing; Deborah K Lieu; Camie W Chan; Chi-Wing Kong; Lin Geng; Kitchener D Wilson; Nipavan Chiamvimonvat; Kenneth R Boheler; Joseph C Wu; Gordon Keller; Roger J Hajjar; Ronald A Li
Journal:  PLoS One       Date:  2011-11-16       Impact factor: 3.240

10.  Transcriptome-guided functional analyses reveal novel biological properties and regulatory hierarchy of human embryonic stem cell-derived ventricular cardiomyocytes crucial for maturation.

Authors:  Ellen Poon; Bin Yan; Shaohong Zhang; Stephanie Rushing; Wendy Keung; Lihuan Ren; Deborah K Lieu; Lin Geng; Chi-Wing Kong; Jiaxian Wang; Hau San Wong; Kenneth R Boheler; Ronald A Li
Journal:  PLoS One       Date:  2013-10-21       Impact factor: 3.240

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.