Gene Expression Signature Associated with Clinical Outcome in ALK-Positive Anaplastic Large Cell Lymphoma.

Camille Daugrois1,2,3, Chloé Bessiere4, Sébastien Dejean5, Véronique Anton-Leberre6, Thérèse Commes4, Stephane Pyronnet1,2,3, Pierre Brousset1,2,3, Estelle Espinos1,2,3, Laurence Brugiere7, Fabienne Meggetto1,2,3, Laurence Lamant1,2,3.   

Abstract

Anaplastic large cell lymphomas associated with ALK translocation have a good outcome after CHOP treatment; however, the 2-year relapse rate remains at 30%. Microarray gene-expression profiling of 48 samples obtained at diagnosis was used to identify 47 genes that were differentially expressed between patients with early relapse/progression and no relapse. In the relapsing group, the most significantly overrepresented genes were related to the regulation of the immune response and T-cell activation, while those in the non-relapsing group were involved in the extracellular matrix. Fluidigm technology gave concordant results for 29 genes, of which FN1, FAM179A, and SLC40A1 had the strongest predictive power after logistic regression and two classification algorithms. In parallel, with 39 samples, we used a Kallisto/Sleuth pipeline to analyze RNA sequencing data and identified 20 genes common to the 28 genes validated by Fluidigm technology, notably the FAM179A and FN1 genes. Interestingly, FN1 also belongs to the gene signature predicting longer survival in diffuse large B-cell lymphomas treated with CHOP. Thus, our molecular signatures indicate that the FN1 gene, a key matrix regulator, might also be involved in the prognosis and the therapeutic response in anaplastic lymphomas.

Keywords:  ALK+ ALCL; clinical outcome; predictive signature; relapse

Year:  2021        PMID: 34771686      PMCID: PMC8582782          DOI: 10.3390/cancers13215523

Source DB:  PubMed          Journal:  Cancers (Basel)        ISSN: 2072-6694            Impact factor:   6.639


1. Introduction

Anaplastic large cell lymphoma (ALCL) is a rare type of T-cell lymphoma, accounting for approximately 3% of adult non-Hodgkin lymphomas and 10 to 20% of childhood lymphomas [1]. Systemic ALK-positive ALCLs (ALK+ ALCL), associated with the translocation of the Anaplastic Lymphoma Kinase (ALK) oncogene, are considered a distinct entity in the WHO classification [1,2]. Chemotherapy treatments are based on cyclophosphamide, vinca-alkaloids, doxorubicin, and corticosteroids in both adults and children, and high-dose methotrexate in children. ALK+ ALCL tumors have a better outcome than other aggressive non-Hodgkin lymphomas, with a 5-year overall survival (OS) rate of 70% for adults and >90% for children [3,4,5,6,7,8,9,10]; however, the 2-year relapse rate remains at 30% [3,4,5,6,7,8,10,11]. To develop patient-tailored therapy strategies, we first need to be able to stratify patients according to risk factors. Several prognostic factors have been recently described for paediatric ALK+ ALCLs, including the detection of minimal disseminated disease (MDD) [12] in bone marrow or blood combined with antibody titers against ALK [13,14,15,16]. The histological subtype variant (versus the common morphology) is also associated with the prognosis in ALK+ ALCLs, at least in children [17]. However, the stratification of patients according to these prognostic factors has yet to be validated in randomized trials. We profiled gene expression in pre-treatment biopsies from non-relapsing and relapsing patients with ALK+ ALCL to provide an additional indicator that could help to identify patients with a high risk of relapse and those at low risk who could benefit from a therapy reduction. Two techniques were used to identify differentially expressed genes: microarrays and RNA sequencing. Then, Fluidigm technology and the Kallisto/Sleuth pipeline helped us to cross-validate candidate genes.

2. Materials and Methods

2.1. Patient Characteristics and Tumor Samples

The diagnosis of ALK+ ALCL was based on morphologic and phenotypic criteria, as described in the 2001 and 2008 WHO classifications [1,2]. Histopathological and immunostaining results were reviewed by a national (the French Lymphopath Network) or international panel of pathologists [17]. Only cases with at least 50% lymph node involvement, assessed by CD30 staining of frozen biopsies, and good RNA integrity (≥7) were selected from our tumor bank. The cohort consisted of 48 systemic ALK+ ALCL tumor samples obtained at the time of diagnosis between 1994 and 2009 (Table 1 and Table S1). The median follow-up was 58 months (4.8 years). Eighteen additional cases of systemic ALK+ ALCL with available frozen material at the time of the diagnosis were retrieved from our tumor bank and used as an independent validation cohort. The patients were all treated with intensive chemotherapy, most of them according to the ALCL99 protocol and stratified on clinical factors [18]. Others were treated according to malignant histiocytosis protocols (HM89 and HM91) [19] or with ACVBP (doxorubicin, cyclophosphamide, vindesine, bleomycin, and prednisone). Patient samples were obtained after informed consent in accordance with the Declaration of Helsinki, and approval was received from the relevant ethics committees. All samples were stored at the «CRB Cancer des Hôpitaux de Toulouse» collection. In accordance with French law, the CRB cancer collection has been declared to the Ministry of Higher Education and Research (DC 2009-989) and a transfer agreement has been obtained (AC-2008-820) after approval by ethics committees. Clinical and biological annotations of the samples have been declared to the CNIL (Comité National Informatique et Libertés).
Table 1

Clinical and pathological characteristics of patients and univariate analysis.

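| Characteristics | Non-relapsing group (n = 22): n | % | Relapsing group (n = 26): n | % | Cox univariate analysis: RR | 95% CI | p-value (likelihood ratio) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Gender | | | | | 0.76 | 0.34–1.69 | 0.51 |
| Male | 15 | 68.2 | 16 | 61.5 | | | |
| Female | 7 | 31.8 | 10 | 38.5 | | | |
| Age (years) | | | | | 0.5134 | 0.23–1.32 | 0.09302 |
| Median | 15 | | 10 | | | | |
| Range | 6–44 | | 2–50 | | | | |
| St_Jude Stage †† | | | | | 1.405 | 0.51–3.84 | 0.49 |
| I–II | 4 | 18.2 | 5 | 19.2 | | | |
| III–IV | 9 | 40.9 | 16 | 61.5 | | | |
| Ann Arbor Stage †† | | | | | 1.92 | 0.80–4.61 | 0.1266 |
| I–II | 11 | 50.0 | 7 | 26.9 | | | |
| III–IV | 11 | 50.0 | 18 | 69.2 | | | |
| IPI score †† | | | | | 2.676 | 1.002–7.15 | 0.04183 |
| 0–1 | 11 | 50.0 | 6 | 23.1 | | | |
| 2–3 | 5 | 22.7 | 12 | 46.2 | | | |
| LDH †† | | | | | 4.46 | 1.76–11.2 | 0.00422 |
| <2 × ULN | 20 | 90.9 | 15 | 57.7 | | | |
| ≥2 × ULN | 1 | 4.5 | 7 | 26.9 | | | |
| Morphological subtype | | | | | 1.93 | 0.86–4.35 | 0.1018 |
| Common Type | 13 | 59.1 | 9 | 34.6 | | | |
| SC/LH | 9 | 40.9 | 17 | 65.4 | | | |
| Fusion partner | | | | | 1.32 | 0.13–3.2 | 0.6934 |
| NPM | 20 | 90.9 | 24 | 92.3 | | | |
| Others | 2 | 9.1 | 2 | 7.7 | | | |
| Peripheral lymph nodes †† | | | | | | | |
| No | 1 | 4.5 | 19 | 73.1 | | | |
| Yes | 11 | 50.0 | 0 | 0.0 | | | |
| Mediastinal involvement †† | | | | | 1.03 | 0.42–2.55 | 0.9411 |
| No | 5 | 22.7 | 9 | 34.6 | | | |
| Yes | 7 | 31.8 | 10 | 38.5 | | | |
| Visceral involvement (spleen, liver or lung involvement) | | | | | 2.139 | 0.98–4.67 | 0.05414 |
| No | 15 | 68.2 | 11 | 42.3 | | | |
| Yes | 7 | 31.8 | 15 | 57.7 | | | |
| Spleen involvement †† | | | | | | | |
| No | 18 | 81.8 | 20 | 76.9 | | | |
| Yes | 4 | 18.2 | 5 | 19.2 | | | |
| Liver involvement †† | | | | | | | |
| No | 20 | 90.9 | 19 | 73.1 | | | |
| Yes | 2 | 9.1 | 6 | 23.1 | | | |
| Lung involvement †† | | | | | | | |
| No | 18 | 81.8 | 14 | 53.8 | | | |
| Yes | 4 | 18.2 | 11 | 42.3 | | | |
| Other visceral involvement †† | | | | | | | |
| No | 17 | 77.3 | 13 | 50.0 | | | |
| Yes | 5 | 22.7 | 13 | 50.0 | | | |
| Skin lesion †† | | | | | 1.11 | 0.46–2.69 | 0.8131 |
| No | 18 | 81.8 | 17 | 65.4 | | | |
| Yes | 4 | 18.2 | 7 | 26.9 | | | |
| Clinical high risk group †† (spleen or/and liver or/and lung or/and mediastinal involvement or/and skin lesions) | | | | | 1.13 | 0.45–2.83 | 0.7843 |
| No | 4 | 18.2 | 6 | 23.1 | | | |
| Yes | 12 | 54.5 | 20 | 76.9 | | | |
| Bone lesions †† | | | | | 0.96 | 0.33–2.79 | 0.9344 |
| No | 19 | 86.4 | 21 | 80.8 | | | |
| Yes | 3 | 13.6 | 4 | 15.4 | | | |
| Bone marrow involvement †† | | | | | 1.062 | 0.41–2.83 | 0.9049 |
| No | 17 | 77.3 | 21 | 80.8 | | | |
| Yes | 3 | 13.6 | 4 | 15.4 | | | |
| CNS involvement †† | | | | | 1.172 | 0.16–8.68 | 0.8793 |
| No | 21 | 95.5 | 24 | 92.3 | | | |
| Yes | 1 | 4.5 | 1 | 3.8 | | | |
| Soft tissue mass †† | | | | | 2.53 | 0.594–10.78 | 0.2676 |
| No | 21 | 95.5 | 23 | 88.5 | | | |
| Yes | 1 | 4.5 | 2 | 7.7 | | | |
| CD3 positivity †† | | | | | 0.73 | 0.27–1.97 | 0.53 |
| Negative | 14 | 63.6 | 19 | 73.1 | | | |
| Positive | 6 | 27.3 | 5 | 19.2 | | | |
| MDD †† | | | | | 10.23 | 1.34–78.02 | 0.001735 |
| Negative | 6 | 27.3 | 1 | 3.8 | | | |
| Positive | 3 | 13.6 | 17 | 65.4 | | | |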

Abbreviations: IPI, international prognostic index; LDH, lactate dehydrogenase; Visceral involvement: lung, liver, spleen; CNS, central nervous system; MDD, minimal disseminated disease; PFS, progression free survival; CI, confidence interval; p, p value; RR, Relative Risk. †: groups defined by the following criteria: ≥ or < median age (12.5 years). ††: Missing Data: St-Jude Stage n = 14, Ann Arbor Stage n = 1, IPI score n = 14, LDH n = 5, peripheral lymph nodes n = 17, mediastinum n = 17, spleen n = 1, liver n = 1, lungs n = 1, other visceral involvement n = 1, skin lesions n = 2, clinical high risk group n = 6, bone lesions n = 1, bone marrow involvement n = 3, CNS involvement n = 1, soft tissue mass n = 1, CD3 n = 4, MDD n = 21.

2.2. Microarrays

Two µg of total RNA from 48 samples were used for hybridization to HG-U133Plus 2.0 GeneChips (54,675 probe sets; Affymetrix, Santa Clara, CA, USA), as previously reported [20]. For each outcome group, gene expression data were extracted and normalized using the GCRMA method [21,22] with the gcrma package for Bioconductor 3.14 (http://bioconductor.org, accessed on 26 September 2021). Then, the data were filtered (using the genefilter package) to eliminate probe sets whose expression values were too low and that could therefore be difficult to reproduce using very sensitive methods such as quantitative RT-PCR (RT-qPCR) [23]. Thus, only probe sets with normalized log2-transformed expression levels greater than or equal to 5 within at least one outcome group were considered. Finally, a differential analysis was carried out using the Empirical Bayes method with the limma package [24], and the genes significantly discriminating between relapsing and non-relapsing groups were retained with a False Discovery Rate (FDR) [25] adjusted p-value of <0.05 and a fold change (FC) of at least 2 in either direction. Overrepresented biological functions and pathways (biological processes, cellular components and molecular functions) that were associated with the differentially expressed genes were assessed using the GOstats [26] package in Bioconductor.
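
The selection rule described above (expression floor, FDR-adjusted p-value, fold-change cut-off) can be sketched in a few lines of Python. This is only an illustration, not the authors' limma workflow: the p-values are taken as given inputs, the probe names and numbers are hypothetical, and the expression floor is assumed to apply to the group mean.

```python
from statistics import mean


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_expression_filter(relapse, no_relapse, floor=5.0):
    """Assumption: the log2 floor is applied to the mean of each outcome group."""
    return mean(relapse) >= floor or mean(no_relapse) >= floor


def select_probes(probes, fdr=0.05, min_abs_log2fc=1.0):
    """Keep probes with adjusted p < fdr and |log2FC| >= 1 (a 2-fold change)."""
    kept = {n: v for n, v in probes.items()
            if passes_expression_filter(v["R"], v["NR"])}
    names = list(kept)
    adjusted = benjamini_hochberg([kept[n]["p"] for n in names])
    selected = []
    for name, q in zip(names, adjusted):
        log2fc = mean(kept[name]["R"]) - mean(kept[name]["NR"])
        if q < fdr and abs(log2fc) >= min_abs_log2fc:
            selected.append(name)
    return selected


# Hypothetical toy data: log2 intensities per group and a raw p-value.
probes = {
    "probe_up":   {"R": [8.0, 8.4, 8.2], "NR": [6.5, 6.9, 6.7], "p": 0.001},
    "probe_down": {"R": [6.0, 6.2, 6.1], "NR": [7.6, 7.8, 7.7], "p": 0.004},
    "probe_low":  {"R": [3.0, 3.2, 3.1], "NR": [4.4, 4.6, 4.5], "p": 0.0005},
    "probe_flat": {"R": [9.0, 9.1, 9.2], "NR": [9.0, 9.2, 9.1], "p": 0.60},
}
selected = select_probes(probes)
```

Filtering before the multiple-testing correction, as in the text, reduces the number of tests and therefore the size of the adjustment.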

2.3. RNA-Sequencing Data

From the 48 patient biopsies, 39 (18 relapsing and 21 non-relapsing) were retained for RNA-sequencing analysis. After ribodepletion (NEBNext® rRNA Depletion HMR kit from NEB), RNA-seq libraries were prepared using the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina® (NEB), following the manufacturer’s recommendations, and sequenced on a NovaSeq 6000 (Illumina) to obtain 2 × 200 million 150-base reads per sample.

2.4. Validation of Microarray Signature Using High-Throughput Quantitative PCR Method

The oligonucleotide primer pairs used for the qPCR were designed with PrimerBLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast/, accessed on 26 September 2021) to target the CDS region of the variants detected by the selected Affymetrix probe sets. Primer Tms were calculated using Schildkraut and Lifson’s 1965 salt-correction formula and Breslauer’s 1986 table of thermodynamic parameters. The primers were designed to avoid genomic DNA (gDNA) amplification. gDNA amplification was controlled during the primer validation and in the high-throughput qPCR by adding a positive control of gDNA (G147, Promega®, Charbonnières-les-Bains, France) and by a ValidPrime assay, which accurately corrects all reactions in the BioMark array for signals derived from gDNA [27]. Primer sequences are reported in Table S2. PCR specificity was verified by assessing the melting curves of each amplification product. Primer efficiency was tested on a pool of samples by standard qPCR (Table S2) prior to high-throughput qPCR. All qPCR assays were performed in duplicate. After a pre-amplification of cDNA, validation of the differentially expressed genes was performed using 96.96 Dynamic Arrays for the BioMark™ system (Fluidigm Corporation, San Francisco, CA, USA) [23] according to the manufacturer’s instructions. An initial data analysis was performed with the Fluidigm real-time PCR analysis software using the linear derivative baseline correction, a quality correction set to 0.65, and the User (Detectors) Cycle Threshold. The Cq (quantification cycle) values ranged from 6.7 to 22.7, which indicated a successful experiment [28]. The Cq values for undetectable targets were set at 31.
The mean expression of MLN51 and TBP, selected as the best housekeeping genes using Genorm® and Normfinder® with the R package NormqPCR, was used as a normalization factor to calculate ∆Cq values (1): $\Delta Cq = Cq_{\text{target}} - \tfrac{1}{2}\left(Cq_{MLN51} + Cq_{TBP}\right)$. The −ΔCq values were used for heatmap and boxplot (beeswarm package, https://rdrr.io/cran/beeswarm/man/beeswarm.html, accessed on 26 September 2021) generation by using the R software (version 3.1.2). The microarray signature was validated using ΔCq values with two criteria: first, a Wilcoxon test p-value, adjusted with the Benjamini–Hochberg correction, lower than 0.05; then, a Pearson’s correlation between high-throughput qPCR and microarray data greater than 0.7.
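
As an illustration of the normalization and concordance check described above, the following Python sketch computes ΔCq against the two reference genes and the Pearson correlation between −ΔCq and microarray intensities. It assumes that the normalization factor is the arithmetic mean of the MLN51 and TBP Cq values; all sample values are hypothetical.

```python
from math import sqrt
from statistics import mean

UNDETECTED_CQ = 31.0  # value the text assigns to undetectable targets


def delta_cq(target_cq, reference_cqs):
    """Cq of the target minus the mean Cq of the reference genes (assumed arithmetic mean)."""
    if target_cq is None:  # undetectable target
        target_cq = UNDETECTED_CQ
    return target_cq - mean(reference_cqs)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for one gene across four samples.
target_cqs = [20.0, 18.5, 22.0, 19.0]
reference_cqs = [(18.0, 19.0), (18.2, 18.8), (18.1, 18.9), (17.9, 19.1)]  # (MLN51, TBP)
microarray_log2 = [8.0, 9.6, 6.1, 9.0]

# -dCq rises with expression, so it should correlate positively with array intensity.
minus_dcq = [-delta_cq(t, refs) for t, refs in zip(target_cqs, reference_cqs)]
r = pearson_r(minus_dcq, microarray_log2)
concordant = r > 0.7  # correlation criterion stated in the text
```

The sign flip to −ΔCq matters: a lower Cq means more template, so ΔCq itself runs opposite to microarray intensity.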

2.5. Clinical Outcome Based on High-Throughput RT-qPCR Data

The validation of microarray signatures was carried out using ΔCq values after assessments for p-values from a Wilcoxon test followed by a Benjamini–Hochberg correction. The selection criteria were a p-value lower than 0.05 and a Pearson’s correlation between high-throughput RT-qPCR and microarray data greater than 0.7. A two-step scheme to select the genes best discriminating between outcomes was established using ΔCq values. The first step involved two complementary methods based on distinct approaches that reach the same goal [29]: Random Forest (RF, using the randomForest package [30], n = 500 trees) and Partial Least Squares Discriminant Analysis (PLS-DA, using the DiscriMiner package, http://cran.r-project.org/web/packages/DiscriMiner/index.html, accessed on 26 September 2021). For RF, 70% of the cohort (34 cases) formed a training set and the remaining 14 tumors formed the test set. Each set had approximately the same proportion of relapsing and non-relapsing cases as the whole cohort. The PLS-DA algorithm was combined with leave-one-out cross-validation. We selected the top five genes from each method, ranked by importance (using the Gini index and the VIP [variable importance for the projection] index, respectively). These index values are quantitative statistical parameters that rank genes according to their ability to discriminate between the two outcome groups. Selected genes were then used to develop a logistic regression model with a backward selection method using relapse as the outcome variable.
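
The 70/30 split with preserved class proportions can be illustrated as follows. This is a minimal sketch of one way to obtain a 34/14 stratified split from 26 relapsing and 22 non-relapsing cases; the allocation rule (largest remainder) and the random seed are assumptions, not the authors' procedure.

```python
import random
from math import floor


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test sets with near-identical class proportions."""
    rng = random.Random(seed)
    n_train = round(train_fraction * len(labels))
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Proportional quota per class; leftover seats go to the largest remainders.
    quotas = {c: n_train * len(ix) / len(labels) for c, ix in by_class.items()}
    seats = {c: floor(q) for c, q in quotas.items()}
    leftover = n_train - sum(seats.values())
    ranked = sorted(quotas, key=lambda c: quotas[c] - seats[c], reverse=True)
    for c in ranked[:leftover]:
        seats[c] += 1
    train, test = [], []
    for c, ix in by_class.items():
        shuffled = ix[:]
        rng.shuffle(shuffled)
        train.extend(shuffled[:seats[c]])
        test.extend(shuffled[seats[c]:])
    return sorted(train), sorted(test)


# Cohort composition from the text: 26 relapsing (R), 22 non-relapsing (NR).
labels = ["R"] * 26 + ["NR"] * 22
train_idx, test_idx = stratified_split(labels)
n_train_relapsing = sum(labels[i] == "R" for i in train_idx)
```

With these inputs the training set holds 18 relapsing and 16 non-relapsing cases (34 in total), and the test set holds the remaining 8 and 6.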

2.6. Transcripts Quantification and Differential Expression Analysis

The Kallisto v0.44.0 pseudo-alignment method [31] was used to quantify transcript abundances directly from the raw RNA-seq FASTQ files. This method, based on pseudo-alignment for rapid and accurate quantification, was run with 100 bootstraps, using a transcriptome index constructed from the Ensembl project’s transcriptome v91. Sleuth version 0.30.0 [32] was then used within R for differential expression analysis at the gene level (gene_mode = TRUE) with an aggregation of the transcript abundances by Ensembl’s gene ID (aggregation_column = ‘ens_gene’). Poorly covered genes (read count <10 in more than half of the samples) were removed before any further analysis. Genes were then defined as differentially expressed (DE) depending on the corrected p-value (qval, adjusted p-values using the Benjamini–Hochberg method) from the Sleuth statistical test. We tested both the Wald test (WT) and the likelihood ratio test (LRT), which is more stringent.
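
The coverage filter stated above (read count below 10 in more than half of the samples) is simple enough to express directly. The sketch below is a plain-Python illustration of that rule only; it is not Sleuth's internal filter, and the gene names and counts are hypothetical.

```python
def is_poorly_covered(counts, min_reads=10):
    """True when more than half of the samples have fewer than min_reads reads."""
    low = sum(1 for c in counts if c < min_reads)
    return low > len(counts) / 2


def filter_genes(count_table, min_reads=10):
    """Drop poorly covered genes from a {gene: [count per sample]} table."""
    return {gene: counts for gene, counts in count_table.items()
            if not is_poorly_covered(counts, min_reads)}


# Hypothetical estimated counts for three genes across four samples.
count_table = {
    "gene_a": [120, 95, 8, 210],  # one low sample out of four: kept
    "gene_b": [3, 0, 12, 5],      # three low samples out of four: removed
    "gene_c": [9, 9, 15, 20],     # exactly half low, not "more than half": kept
}
retained = filter_genes(count_table)
```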

3. Results

3.1. Clinical and Pathological Characteristics of Patients

Among the 48 patients (Table 1, Table S1, and ref. [33]), 31 were male and 17 were female. Most patients were children or young adults younger than 22 years (n = 39). The median age at diagnosis was 12.5 years (range: 2–50 years). According to the Ann Arbor classification, 30 patients had advanced stage III or IV disease, and 18 had localised stage I or II disease. Twenty-two tumors were classified as common type and 26 as morphologic variants. The ALK gene was fused to the NPM gene in 44 tumors and to the TPM3 gene in the other cases, which corresponded to the different ALK staining patterns [17]. After front-line multi-agent chemotherapy, 45 patients achieved complete remission. Three patients progressed during treatment (median: 7.2 months; range: 2.4–16.5 months), and 23 patients relapsed within 16.5 months of diagnosis: these were all assigned to the relapsing group. Twenty-two remained disease-free after a period of at least three years and were included in the non-relapsing group.

3.2. Molecular Signatures from Microarray Data Associated with Clinical Outcome

Based on microarray data, a supervised method was used to find the most significant differentially expressed genes between relapsing and non-relapsing tumors. Using a significance level of corrected p-value <0.05 and a cut-off fold change of ±2 (Figure 1A), we generated a list of 47 significantly discriminating genes (61 probes), using the 14,388 probe sets that had a log2-transformed expression level ≥5 within at least one group (Figure 2A, Table 2, orange columns). Among the 47 genes, 14 genes were overexpressed in the relapsing group while 33 genes were overexpressed in the non-relapsing group (Figure 2A, Table 2, orange columns).
Figure 1

Workflow of development and validation. (A): Microarray data and high-throughput qPCR workflow. (B): Gene selection using high-throughput qPCR. (C): Identification of differentially expressed genes between relapse and no-relapse groups using RNA sequencing.

Figure 2

Molecular signature associated with clinical outcome and Gene Ontology Biological Process enrichment. (A) Heatmap of microarray data showing the 47 deregulated genes in “relapsing” (n = 26, dark grey) compared to “non-relapsing” samples (n = 22, light grey). Each column represents a sample, and each row a probe set or gene. The expression level of each probe set was standardized by subtracting that probe set’s mean expression from its expression value and then dividing this by the standard deviation across all the samples. This scaled expression value, designated as the row Z-score, was plotted using a red–blue color scale with red indicating high expression and blue indicating low expression. (B) Enrichment of these deregulated genes within the Gene Ontology (GO) categories with the 10 most-listed GO biological processes categories (p < 0.01). The number of probe sets downregulated or upregulated in “relapsing” specimens is represented below. The p-values of each GO category are reported on the graph.
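
The row Z-score used for the heatmap in Figure 2A can be sketched as below. The caption does not say whether the sample or population standard deviation was used; this example assumes the sample standard deviation, and the input values are hypothetical.

```python
from statistics import mean, stdev


def row_z_scores(values):
    """Standardize one probe set across samples: (value - row mean) / row SD."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1 denominator)
    if s == 0:
        return [0.0 for _ in values]  # a flat row carries no contrast
    return [(v - m) / s for v in values]


# Hypothetical log2 intensities of one probe set in five samples.
probe_row = [12.7, 11.3, 12.1, 10.6, 11.8]
z = row_z_scores(probe_row)
```

After scaling, every row has mean 0 and unit standard deviation, so the color scale reflects each probe set's pattern across samples rather than its absolute intensity.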

Table 2

Expression levels, fold change (FC), p-values, and rank of importance of the genes discriminating relapsing ALK+ and non-relapsing ALK+ tumors using microarray, high-throughput qPCR, and RNA sequencing data.

Microarray HG-U133-Plus2.0Fluidigm Data RNAseq Kallisto/Sleuth DE
Mean Log 2 Intensity Mean (-Delta)Cq WilcoxonCorrelation Microarray-Fluidigm Sleuth Wald TestMean Expression (tpm)
ProbeSetGeneSymbolNRRFCR vs. NRlogFCR vs. NRp ValueAdjustedp Value(BH)MeanNo_RelapseMeanRelapseFCR vs. NRlogFCR vs. NRp ValueAdjustedp Value(BH)PearsonCorrelationrp ValueCorresp. ENSGp ValueAdjustedp Value(BH)b (Effect Size ~ logFC Estimator) R vs. NRMeanNo_RelapseMeanRelapse
228471_atANKRD447.718.762.071.051.08 × 10−33.11 × 10−20.41.091.610.698.97 × 10−21.01 × 10−10.860.741.98 × 10−15ENSG00000065413.201.08 × 10−21.55 × 10−10.55
210031_atCD2477.939.072.211.142.10 × 10−34.53 × 10−2 ENSG00000198821.113.76 × 10−42.59 × 10−20.930.2154.51
222043_atCLU9.410.472.091.071.59 × 10−33.87 × 10−25.215.91.610.696.67 × 10−28.12 × 10−20.880.771.04 × 10−16ENSG00000120885.222.05 × 10−22.17 × 10−10.56
236717_at FAM179A 6.318.073.381.763.62 × 10−53.52 × 10−3−1.70−0.162.91.542.61 × 10−3 1.07 × 10−2 0.97 0.939.46 × 10−29ENSG00000189350.136.80 × 10−5 3.64 × 10−3 1.23 24.4458.53
205718_at ITGB7 7.138.62.771.471.78 × 10−34.14 × 10−2−0.361.072.71.435.35 × 10−3 1.34 × 10−2 0.97 0.957.59 × 10−31ENSG00000139626.161.55 × 10−4 1.88 × 10−2 1 22.9247.71
1558459_s_atLOC4013205.476.492.031.021.64 × 10−91.13 × 10−5−1.51−1.021.40.496.26 × 10−27.82 × 10−20.620.381.71 × 10−5
218202_x_atMRPL443.455.694.732.241.08 × 10−361.55 × 10−321.291.391.070.11.23 × 10−11.32 × 10−10.210.057.81 × 10−2ENSG00000135900.4NANANA
213733_at MYO1F 8.719.752.061.042.51 × 10−52.75 × 10−30.681.651.950.962.43 × 10−3 1.07 × 10−2 0.91 0.831.15 × 10−18ENSG00000142347.196.03 × 10−5 1.14 × 10−2 0.51 146.59229.13
212259_s_atPBXIP16.747.74212.49 × 10−34.99 × 10−20.881.721.790.843.67 × 10−25.51 × 10−20.950.92.43 × 10−25ENSG00000163346.171.90 × 10−41.99 × 10−20.5915.5326.77
206060_s_at PTPN22 7.498.842.531.346.31 × 10−42.29 × 10−21.392.522.181.134.35 × 10−3 1.22 × 10−2 0.93 0.873.21 × 10−22ENSG00000134242.162.70 × 10−5 1.79 × 10−3 0.9 43.8293.55
208010_s_atPTPN225.997.352.551.351.18 × 10−33.26 × 10−21.392.522.181.134.35 × 10−3 1.22 × 10−2 0.87 0.758.39 × 10−16
236539_atPTPN227.348.452.161.111.49 × 10−33.74 × 10−21.392.522.181.134.35 × 10−3 1.22 × 10−2 0.94 0.899.00 × 10−24
218394_atROGDI5.226.322.131.093.09 × 10−91.78 × 10−5−3.38−2.751.550.636.50 × 10−3 1.54 × 10−2 0.76 0.582.13 × 10−10ENSG00000067836.137.75 × 10−43.77 × 10−20.426.8810.24
227552_atSEPT16.367.522.231.152.32 × 10−34.82 × 10−2−0.500.682.261.185.05 × 10−26.89 × 10−20.930.873.74 × 10−22ENSG00000180096.121.51 × 10−35.48 × 10−20.64
223044_atSLC40A110.7812.032.381.252.57 × 10−41.38 × 10−21.192.542.551.351.39 × 10−37.20 × 10−30.940.895.92 × 10−24ENSG00000138449.111.21 × 10−14.94 × 10−1−0.46
244716_x_atTMIGD26.928.893.931.971.42 × 10−101.28 × 10−7−2.28−0.762.861.523.30 × 10−25.13 × 10−20.740.551.04 × 10−9ENSG00000167664.84.47 × 10−42.80 × 10−21.3119.2537.27
226997_atADAMTS126.885.790.47−1.083.61 × 10−41.68 × 10−2−0.78−1.460.63−0.684.15 × 10−25.84 × 10−20.910.825.00 × 10−19ENSG00000151388.114.53 × 10−42.80 × 10−2−0.565.493.14
ADAMTS12 ENSG00000281690.22.84 × 10−22.54 × 10−1−0.457.054.54
224694_at ANTXR1 8.756.950.29−1.807.56 × 10−56.01 × 10−31.14−0.170.41−1.307.37 × 10−3 1.66 × 10−2 0.96 0.912.24 × 10−26ENSG00000169604.201.15 × 10−5 1.14 × 10−3 −1.18 14.294.45
204345_atCOL16A18.156.980.45−1.171.16 × 10−33.24 × 10−20.780.050.6−0.735.30 × 10−27.00 × 10−20.940.886.02 × 10−23ENSG00000084636.185.14 × 10−31.08 × 10−1−0.75
221730_atCOL5A210.879.710.45−1.166.60 × 10−42.33 × 10−22.751.950.57−0.803.85 × 10−25.59 × 10−20.920.851.32 × 10−20ENSG00000204262.145.87 × 10−31.13 × 10−1−0.54
225681_at CTHRC1 11.6210.050.34−1.574.40 × 10−41.89 × 10−21.390.070.4−1.323.89 × 10−3 1.17 × 10−2 0.97 0.941.69 × 10−30ENSG00000164932.138.87 × 10−8 3.25 × 10−4 −0.78 29.9913.22
202450_s_atCTSK10.128.80.4−1.321.37 × 10−33.59 × 10−22.251.280.51−0.971.34 × 10−22.32 × 10−20.960.921.14 × 10−26ENSG00000143387.14NANANA
201893_x_at DCN 12.1410.870.42−1.271.37 × 10−33.59 × 10−23.982.750.43−1.223.78 × 10−3 1.17 × 10−2 0.95 0.92.71 × 10−25ENSG00000011465.18 8.81 × 10−5 1.37 × 10−2 −1.01 319.12132.97
211896_s_atDCN12.1310.720.37−1.421.08 × 10−33.11 × 10−23.982.750.43−1.223.78 × 10−3 1.17 × 10−2 0.94 0.883.10 × 10−23
211813_x_atDCN11.6610.120.34−1.554.38 × 10−41.89 × 10−23.982.750.43−1.223.78 × 10−3 1.17 × 10−2 0.93 0.861.51 × 10−21
201325_s_atEMP18.847.830.5−1.018.35 × 10−42.70 × 10−21.881.040.56−0.843.78 × 10−31.17 × 10−20.910.839.38 × 10−20ENSG00000134531.103.03 × 10−42.36 × 10−2−0.5878.5440.11
201324_atEMP110.779.740.49−1.038.68 × 10−56.61 × 10−31.881.040.56−0.843.78 × 10−3 1.17 × 10−2 0.92 0.851.76 × 10−20
209955_s_at FAP 8.286.440.28−1.845.71 × 10−42.21 × 10−20.86−0.810.31−1.678.08 × 10−4 5.35 × 10−3 0.98 0.951.23 × 10−32ENSG00000078098.149.08 × 10−5 4.08 × 10−3 −1.25 46.214.03
211719_x_at FN1 12.6911.330.39−1.362.04 × 10−41.18 × 10−24.683.170.35−1.512.12 × 10−4 2.54 × 10−3 0.98 0.965.68 × 10−34ENSG00000115414.211.45 × 10−5 1.27 × 10−3 −1.18 680.6193.53
214701_s_atFN16.294.670.33−1.625.11 × 10−71.50 × 10−44.683.170.35−1.512.12 × 10−4 2.54 × 10−3 0.73 0.533.20 × 10−9
210495_x_atFN112.2910.620.31−1.677.05 × 10−51.07 × 10−34.683.170.35−1.512.12 × 10−4 2.54 × 10−3 0.98 0.963.57 × 10−34
216442_x_atFN112.3310.650.31−1.699.99 × 10−51.35 × 10−34.683.170.35−1.512.12 × 10−4 2.54 × 10−3 0.98 0.971.58 × 10−35
212464_s_atFN112.3210.620.31−1.691.09 × 10−51.41 × 10−34.683.170.35−1.512.12 × 10−4 2.54 × 10−3 0.98 0.961.53 × 10−34
225481_at FRMD6 8.437.290.45−1.144.17 × 10−41.84 × 10−20.31−0.580.54−0.891.19 × 10−2 2.23 × 10−2 0.94 0.887.77 × 10−22ENSG00000139926.168.55 × 10−5 4.08 × 10−3 −0.80 26.7847.71
225464_atFRMD68.417.270.45−1.143.71 × 10−41.69 × 10−20.31−0.580.54−0.891.19 × 10−2 2.23 × 10−2 0.93 0.871.20 × 10−20
227070_at GLT8D2 8.26.950.42−1.252.01 × 10−34.44 × 10−2−0.78−1.690.53−0.912.54 × 10−2 4.09 × 10−2 0.94 0.897.62 × 10−22ENSG00000120820.125.54 × 10−5 3.20 × 10−3 −0.71 15.037.28
227059_atGPC68.156.180.25−1.972.46 × 10−52.75 × 10−3−0.81−2.770.26−1.961.09 × 10−4 2.54 × 10−3 0.97 0.949.60 × 10−30ENSG00000183098.11NANANA
201035_s_atHADH75.970.49−1.024.44 × 10−114.56 × 10−8−0.70−0.860.89−0.162.42 × 10−12.48 × 10−10.550.32.93 × 10−5ENSG00000138796.172.17 × 10−22.24 × 10−1−0.21
226218_atIL7R9.78.170.34−1.543.19 × 10−41.58 × 10−21.580.490.47−1.092.98 × 10−31.12 × 10−20.950.911.27 × 10−25ENSG00000168685.154.03 × 10−42.69 × 10−2−0.5973.0339.19
205798_atIL7R8.997.410.33−1.596.31 × 10−42.29 × 10−21.580.490.47−1.092.98 × 10−3 1.12 × 10−2 0.91 0.831.33 × 10−19
227140_at INHBA 9.266.50.15−2.761.09 × 10−51.41 × 10−30.08−1.940.25−2.012.82 × 10−4 2.54 × 10−3 0.96 0.912.79 × 10−26ENSG00000122641.113.05 × 10−5 8.21 × 10−3 −1.64 9.22.09
204686_at IRS1 6.615.450.45−1.169.09 × 10−51.27 × 10−3−1.31−1.770.72−0.477.18 × 10−2 8.50 × 10−2 0.77 0.591.12 × 10−10ENSG00000169047.51.37 × 10−4 1.76 × 10−2 −0.50 10.165.97
204682_atLTBP27.226.040.44−1.182.44 × 10−34.92 × 10−2−0.08−0.980.53−0.915.45 × 10−27.00 × 10−20.450.21.03 × 10−3ENSG00000119681.125.46 × 10−51.09 × 10−2−1.0213.285.61
201069_atMMP29.557.610.26−1.942.12 × 10−34.57 × 10−22.230.660.34−1.571.34 × 10−22.32 × 10−20.980.967.23 × 10−33ENSG00000087245.133.56 × 10−38.81 × 10−2−1.22
203936_s_atMMP97.786.220.34−1.562.24 × 10−34.72 × 10−20.94−0.580.35−1.525.01 × 10−31.33 × 10−20.960.917.85 × 10−26ENSG00000100985.76.81 × 10−31.22 × 10−1−1.03
203939_at NT5E 7.425.960.36−1.466.32 × 10−55.20 × 10−30.05−0.960.5−1.011.44 × 10−3 7.20 × 10−3 0.95 0.91.05 × 10−24ENSG00000135318.126.37 × 10−5 1.15 × 10−2 −0.62 6.523.31
204992_s_atPFN28.196.80.38−1.391.30 × 10−48.64 × 10−3 ENSG00000070087.149.83 × 10−31.47 × 10−1−0.46
205479_s_at PLAU 8.917.070.28−1.843.01 × 10−53.13 × 10−31.820.50.4−1.322.50 × 10−4 2.54 × 10−3 0.98 0.956.36 × 10−32ENSG00000122861.161.91 × 10−4 1.99 × 10−2 −0.95 5.892.46
210809_s_atPOSTN12.8810.810.24−2.071.19 × 10−48.13 × 10−31.42−0.380.29−1.803.89 × 10−31.17 × 10−20.940.882.33 × 10−23ENSG00000133110.152.52 × 10−42.17 × 10−2−1.37292.3775.76
1555778_a_atPOSTN11.228.860.19−2.362.37 × 10−41.29 × 10−21.42−0.380.29−1.803.89 × 10−31.17 × 10−20.910.823.07 × 10−19
202975_s_atRHOBTB38.077.060.5−1.019.35 × 10−42.88 × 10−21.330.970.78−0.363.26 × 10−13.26 × 10−10.670.441.49 × 10−7ENSG00000164292.133.70 × 10−22.88 × 10−1−0.33
212110_at SLC39A14 9.157.880.41−1.281.43 × 10−49.34 × 10−31.480.780.61−0.701.17 × 10−2 2.23 × 10−2 0.94 0.891.03 × 10−23ENSG00000104635.157.67 × 10−5 1.28 × 10−2 −0.74 24.8911.29
212354_at SULF1 10.368.440.26−1.921.82 × 10−41.10 × 10−22.230.50.3−1.732.30 × 10−4 2.54 × 10−3 0.98 0.951.47 × 10−32ENSG00000137573.146.56 × 10−8 3.25 × 10−4 −1.34 131.9637.09
212344_atSULF19.247.150.24−2.091.06 × 10−47.35 × 10−32.230.50.3−1.732.30 × 10−4 2.54 × 10−3 0.95 0.94.63 × 10−25
212353_atSULF110.338.190.23−2.146.30 × 10−55.20 × 10−32.230.50.3−1.732.30 × 10−4 2.54 × 10−3 0.98 0.972.01 × 10−36
206506_s_atSUPT3H6.285.260.49−1.022.36 × 10−55.22 × 10−4−1.11−1.260.9−0.151.39 × 10−11.45 × 10−10.430.181.36 × 10−3ENSG00000196284.172.45 × 10−22.37 × 10−1−0.20
203083_atTHEM47.56.460.39−1.351.86 × 10−52.19 × 10−3−10.85−18.804.04 × 10−3−7.951.14 × 10−11.25 × 10−10.680.468.14 × 10−8ENSG00000159445.133.52 × 10−38.78 × 10−2−0.32
1553118_at THBS2 9.828.470.49−1.041.93 × 10−34.33 × 10−22.31.350.52−0.951.83 × 10−2 3.05 × 10−2 0.98 0.953.75 × 10−32ENSG00000186340.163.38 × 10−5 8.78 × 10−3 −1.00 70.7428.41
219410_atTMEM45A8.557.10.37−1.457.23 × 10−42.49 × 10−2−0.67−1.710.49−1.047.99 × 10−3 1.71 × 10−2 0.97 0.939.69 × 10−29ENSG00000181458.102.32 × 10−22.30 × 10−1−0.19
220968_s_atTSPAN95.944.730.43−1.215.79 × 10−149.26 × 10−11−0.13−0.680.68−0.559.13 × 10−3 1.87 × 10−2 0.71 0.51.11 × 10−8ENSG00000011105.145.59 × 10−43.15 × 10−2−0.4511.056.43
243526_atWDR866.214.960.42−1.254.04 × 10−71.24 × 10−4−3.20−3.930.61−0.727.69 × 10−28.87 × 10−20.640.415.71 × 10−7ENSG00000187260.161.87 × 10−22.07 × 10−1−0.41
The most significantly overrepresented GO terms (biological processes, Figure 2B and Table S3) in the relapsing group were related to the regulation of the immune response, with the genes encoding clusterin (CLU; logFC: 1.07; adjusted p value: 3.87 × 10−2), integrin beta 7 (ITGB7; logFC: 1.47; adjusted p value: 4.14 × 10−2), the tyrosine phosphatase PTPN22 (logFC: 1.34; adjusted p value: 2.29 × 10−2), and the unconventional myosin MYO1F (logFC: 1.04; adjusted p value: 2.75 × 10−3), and to T-cell activation, with the CD3 zeta chain gene CD247 (logFC: 1.14; adjusted p value: 4.53 × 10−2) and TMIGD2 (logFC: 1.97; adjusted p value: 1.28 × 10−7), a new member of the T-cell costimulatory/coinhibitory B7/CD28 families. In the non-relapsing group, highly expressed genes were involved in extracellular matrix (ECM) organization and disassembly: FN1 (fibronectin 1; logFC: 1.25; adjusted p value: 1.18 × 10−2), DCN (decorin, logFC: 1.12; adjusted p value: 3.59 × 10−2), FAP (fibroblast activation protein, logFC: 1.59; adjusted p value: 1.68 × 10−2), ADAMTS12 (logFC: 1.39; adjusted p value: 2.21 × 10−2), MMP2 (logFC: 1.57; adjusted p value: 4.57 × 10−2), MMP9 (logFC: 1.55; adjusted p value: 4.72 × 10−2), and different collagen family members such as the COL16A1, COL5A2, ANTXR1, CTSK, and CTHRC1 genes. To validate the signature of 47 genes, RT-qPCR using high-throughput Fluidigm® technology was performed on all biopsies. Expression of the CD247 and PFN2 genes was not taken into account in the final analysis because their primers formed dimers. Among the 45 remaining genes, 29 gave concordant results with an adjusted p-value of <0.05 and a Pearson’s correlation coefficient of >0.7 (Figure 1A, Table 2: blue columns; ANTXR1, CTHRC1, CTSK, DCN, EMP1, FAM179A, FAP, FN1, FRMD6, GLT8D2, GPC6, IL7R, INHBA, IRS1, ITGB7, MMP2, MMP9, MYO1F, NT5E, PLAU, POSTN, PTPN22, ROGDI, SLC39A14, SLC40A1, SULF1, THBS2, TMEM45A, TSPAN9).

3.3. Identification of a Minimum Set of Genes Associated with Clinical Outcome

Random Forest (RF) and Partial Least Squares Discriminant Analysis (PLS-DA) are two powerful tools for analysing microarray data. Because these two algorithms can highlight essential variables in a dataset, we used them as classification algorithms on high-throughput RT-qPCR data to identify the minimum set of genes whose expression in primary tumors is associated with clinical outcome (Figure 1A). Using RF analysis, the optimal gene classifier consisted of five genes: EMP1, SLC40A1, ITGB7, SULF1, and FAM179A, ranked according to their variable importance in the model (Figure 1B and Figure 3A). The PLS-DA algorithm also gave an optimal gene classifier consisting of five genes, in rank order: FAM179A, MYO1F, SLC40A1, FN1, and PLAU (Figure 1B and Figure 3B). Because two genes (FAM179A and SLC40A1) were selected by both methods, RF and PLS-DA selected a total of 8 genes that could help classify relapsing and non-relapsing patients (Figure 1B and Figure 3C). We then tried to reduce the number of genes even more. Using a logistic regression on the ΔCq values obtained with high-throughput Fluidigm® technology for these 8 genes, we identified a set of 3 genes (Figure 1B, Table S4): FN1/fibronectin 1, FAM179A (family with sequence similarity 179, member A), and SLC40A1/ferroportin-1. For these three genes, data generated by microarray, Fluidigm®, and standard RT-qPCR showed an excellent consistency (R² > 0.89, Figure S1). The differential expression of the 3 genes between outcome groups was validated using an independent cohort (n = 18, Figure S2). Finally, since all are located on chromosome 2 (2q34, 2p23.2 and 2q32, respectively), we verified by high-resolution CGH array that their differential expression was not related to the gain or deletion of their loci.
Figure 3

Selection of the best predictive genes for outcome stratification by Random Forest and PLS-DA analysis of high-throughput RT-qPCR data. Relative importance of genes that discriminated between “relapsing” and “non-relapsing” groups in high-throughput RT-qPCR data. The bar plots show the mean Gini index of each gene from Random Forest classification (A) and the variable importance in the projection (VIP) of the PLS-DA method (B), with larger values (to the right of the graph) indicating a more important gene within the model. The five top genes are highlighted by a grey box. (C) Box plots and strip-charts showing high-throughput qPCR quantification of the 8 selected genes in “relapsing” (grey, n = 26) and “non-relapsing” (white, n = 22) samples. Statistical significance was calculated using the Wilcoxon test followed by a Benjamini and Hochberg correction. Expression values are given as (−∆Cq).
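For readers unfamiliar with the Benjamini and Hochberg correction named in the caption, the following Python sketch shows the standard step-up adjustment of a list of p-values. It is a generic illustration, not the authors' code. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, e.g. from per-gene Wilcoxon tests.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing with rank.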

3.4. Transcripts Quantification with Pseudo-Alignment and Differential Expression Analysis by Total RNA-Sequencing

To derive a gene signature with a second transcriptomic technique, 39 of the 48 patient biopsies (18 relapsing and 21 non-relapsing tumors) were sequenced. Using total RNA 150-bp paired-end sequencing data (median of 507 million reads per patient), gene expression was quantified with Kallisto, a fast pseudoalignment-based method for transcript quantification from RNA sequencing data [31]. Genes differentially expressed (DE) between relapsing and non-relapsing conditions were selected with Sleuth, a program for the differential expression analysis of RNA-Seq experiments in which transcript abundances have been quantified with Kallisto [32]. With a corrected p-value < 0.02, 214 genes were found to be DE between the two groups (relapse and no relapse) with the Wald Test (WT, Figure 1C) and 62 with the more stringent Likelihood Ratio Test (LRT) (Table S5), which is a statistical test of the goodness-of-fit between two models. We retained the larger Wald Test list for further analysis because it provides a ‘beta’ value (effect size) that can be compared to logFC. In total, 168 genes with an absolute log2 FC between relapse and no-relapse groups greater than 0.5 and a p-value lower than 0.02 [34] were selected (Figure S3). After intersecting these 168 DE genes with the 47 significantly discriminating genes previously found with the microarray technique (p value < 0.05), 20 common genes were highlighted (ANTXR1, CTHRC1, DCN, FAM179A/TOGARAM2, FAP, FN1, FRMD6, GLT8D2, INHBA, IRS1, ITGB7, MYO1F, NT5E, PLAU, PBXIP1, PTPN22, SLC39A14, SULF1, THBS2, and LTBP2) (beta value/WT “log2FC” estimator greater than 0.5 and p value < 0.02; Table 2: green columns, red lines), including 5 and 15 genes overexpressed in the relapse and no-relapse groups, respectively (Figures S4 and S5).
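The two-threshold selection described above (absolute beta greater than 0.5 and corrected p-value below 0.02) amounts to a simple filter over the differential expression table. The Python sketch below uses hypothetical records: the gene names, values, and the field names `beta` and `qval` are assumptions for illustration and do not reproduce the Sleuth output of the study.

```python
def select_de_genes(rows, beta_min=0.5, q_max=0.02):
    """Keep genes whose |beta| exceeds beta_min and whose corrected p-value is below q_max."""
    return [
        row["gene"]
        for row in rows
        if abs(row["beta"]) > beta_min and row["qval"] < q_max
    ]


# Hypothetical differential expression records (illustration only).
rows = [
    {"gene": "GENE_A", "beta": 1.2, "qval": 0.001},   # passes both thresholds
    {"gene": "GENE_B", "beta": -0.8, "qval": 0.015},  # passes (negative effect)
    {"gene": "GENE_C", "beta": 0.3, "qval": 0.001},   # effect size too small
    {"gene": "GENE_D", "beta": 1.5, "qval": 0.04},    # not significant enough
]

print(select_de_genes(rows))
```

Using the absolute value of beta keeps genes overexpressed in either group, which matches the presence of both relapse and no-relapse genes in the final list.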
Of these 20 genes, 18 (ANTXR1, CTHRC1, DCN, FAM179A/TOGARAM2, FAP, FN1, FRMD6, GLT8D2, INHBA, IRS1, ITGB7, MYO1F, NT5E, PLAU, PTPN22, SLC39A14, SULF1, and THBS2) were also validated with high-throughput Fluidigm® technology (Figure S5). Among them, the FAM179A and FN1 genes had already been selected by logistic regression on the ΔCq expression values from the high-throughput Fluidigm® data (Table S4).
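The overlaps reported in this section can be checked with plain set operations on the gene lists given in the text. In the Python sketch below the lists are copied from Sections 3.2 to 3.4, with FAM179A/TOGARAM2 written as FAM179A. The variable names are illustrative.

```python
# 20 genes common to the RNA-seq (Wald Test) and microarray signatures.
common_20 = {
    "ANTXR1", "CTHRC1", "DCN", "FAM179A", "FAP", "FN1", "FRMD6", "GLT8D2",
    "INHBA", "IRS1", "ITGB7", "MYO1F", "NT5E", "PLAU", "PBXIP1", "PTPN22",
    "SLC39A14", "SULF1", "THBS2", "LTBP2",
}

# 29 genes with concordant high-throughput Fluidigm RT-qPCR results.
fluidigm_29 = {
    "ANTXR1", "CTHRC1", "CTSK", "DCN", "EMP1", "FAM179A", "FAP", "FN1",
    "FRMD6", "GLT8D2", "GPC6", "IL7R", "INHBA", "IRS1", "ITGB7", "MMP2",
    "MMP9", "MYO1F", "NT5E", "PLAU", "POSTN", "PTPN22", "ROGDI", "SLC39A14",
    "SLC40A1", "SULF1", "THBS2", "TMEM45A", "TSPAN9",
}

# 3 genes retained after logistic regression.
logistic_3 = {"FN1", "FAM179A", "SLC40A1"}

validated = common_20 & fluidigm_29      # supported by all three techniques
not_validated = common_20 - fluidigm_29  # RNA-seq and microarray only
core = validated & logistic_3            # also retained by logistic regression

print(len(validated), sorted(not_validated), sorted(core))
```

The intersection contains 18 genes, the two genes outside it are PBXIP1 and LTBP2, and only FAM179A and FN1 remain after adding the logistic regression result.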

4. Discussion

Although systemic ALK+ ALCLs are highly chemosensitive tumors, with a 5-year OS rate of 80%, 30% of patients usually experience relapse within the year following the end of treatment. Moreover, these “early” relapses are associated with a bad prognosis [35]. In the present study, we sought to identify a molecular signature associated with clinical outcome (relapse/progression versus non-relapse) in systemic ALK+ ALCL. From a cohort of 48 tumor samples obtained at diagnosis, our supervised analysis based on microarray data identified 47 genes that significantly discriminated the two groups. Twenty of them were also found to be differentially expressed by RNA sequencing, supporting their biological significance. In the microarray molecular signature of the relapsing group, the six overexpressed genes with the most significant p-values were FAM179A, ITGB7, MYO1F, SLC40A1 (ferroportin-1), PTPN22, and ROGDI. Many of the genes that were overrepresented and up-regulated in this group were implicated in the regulation of the immune response and in T-cell activation and proliferation. For the non-relapsing group, INHBA, GPC6, SULF1, FN1, PLAU, and FAP were the top six overexpressed genes with the most significant p-values. Eight of these twelve genes were also differentially expressed in the RNA-seq analysis (FAM179A, ITGB7, MYO1F, and PTPN22 in the relapse group, and FAP, FN1, INHBA, and SULF1 in the no-relapse group). Within the genes overexpressed in this non-relapsing microarray signature, there was a statistically significant overrepresentation of genes involved in extracellular matrix (ECM) deposition and organization. The ECM is a highly dynamic structure which is constantly being remodeled and, in the appropriate context, might restrain malignant tumor progression.
Although excessive ECM deposition could hinder the diffusion of therapeutic agents [36] and play a role in cell adhesion-mediated drug resistance [37], proteases secreted by tumor cells and/or cells of the micro-environment could lead to its structure breakdown and influence the tumor cell response to chemotherapy. Furthermore, although proteases have long been considered as cancer-promoting factors, recent studies have revealed that they can also elicit tumor-suppressive effects through the stimulation of apoptosis or the inhibition of angiogenesis [38]. This ECM signature probably reflects a strong ECM deposition that could be associated with a peculiar tumor microenvironment less favorable for tumor cells. Interestingly, 19 of the 33 overexpressed genes in the microarray non-relapsing signature and 13 out of the 16 genes in the RNA sequencing non-relapsing signature also belong to the “stromal-1 signature” (including FN1) associated with a better EFS and OS in diffuse large B-cell lymphomas (DLBCL) treated by CHOP or R-CHOP [39]. Thus, our molecular signatures point out that the ECM could be involved in the prognosis and the therapeutic response in ALCL, as it has already been suggested in DLBCL.

5. Conclusions

We have identified a minimum set of genes whose expression could help to predict clinical outcome at diagnosis. Using two different classification algorithms, we identified 8 genes as the most powerful discriminators between tumors from patients who did or did not experience relapse. By intersecting data from microarrays, high-throughput Fluidigm, and RNA-sequencing, this number of genes was further reduced to FAM179A and FN1. As FN1 is a key ECM regulator, we suggest that it might be involved in the prognosis and therapeutic response in ALCL, as already suggested in DLBCL.
  37 in total

1.  Using GOstats to test gene lists for GO term association.

Authors:  S Falcon; R Gentleman
Journal:  Bioinformatics       Date:  2006-11-10       Impact factor: 6.937

2.  Anaplastic large cell lymphoma in Japanese children: retrospective analysis of 34 patients diagnosed at the National Research Institute for Child Health and Development.

Authors:  Tetsuya Mori; Nobutaka Kiyokawa; Hiroyuki Shimada; Jun Miyauchi; Junichiro Fujimoto
Journal:  Br J Haematol       Date:  2003-04       Impact factor: 6.998

3.  Anaplastic large cell lymphoma treated with a leukemia-like therapy: report of the Italian Association of Pediatric Hematology and Oncology (AIEOP) LNH-92 protocol.

Authors:  Angelo Rosolen; Marta Pillon; Alberto Garaventa; Roberta Burnelli; Emanuele S d'Amore; Maria Giuliano; Margherita Comis; Simone Cesaro; Katia Tettoni; Maria Luisa Moleti; Paolo Tamaro; Gianluca Visintin; Luigi Zanesco
Journal:  Cancer       Date:  2005-11-15       Impact factor: 6.860

4.  Should treatment of ALK-positive anaplastic large cell lymphoma be stratified according to minimal residual disease?

Authors:  Charlotte Rigaud; Rachid Abbas; David Grand; Véronique Minard-Colin; Nathalie Aladjidi; Nimrod Buchbinder; Nathalie Garnier; Geneviève Plat; Marie-Laure Couec; Mylène Duplan; Anne Lambilliotte; Claudine Schmitt; Thierry Leblanc; Laurence Lamant; Laurence Brugières
Journal:  Pediatr Blood Cancer       Date:  2021-03-09       Impact factor: 3.167

5.  Short-pulse B-non-Hodgkin lymphoma-type chemotherapy is efficacious treatment for pediatric anaplastic large cell lymphoma: a report of the Berlin-Frankfurt-Münster Group Trial NHL-BFM 90.

Authors:  K Seidemann; M Tiemann; M Schrappe; E Yakisan; I Simonitsch; G Janka-Schaub; W Dörffel; M Zimmermann; G Mann; H Gadner; R Parwaresch; H Riehm; A Reiter
Journal:  Blood       Date:  2001-06-15       Impact factor: 22.113

Review 6.  Remodelling the extracellular matrix in development and disease.

Authors:  Caroline Bonnans; Jonathan Chou; Zena Werb
Journal:  Nat Rev Mol Cell Biol       Date:  2014-12       Impact factor: 94.444

7.  Gene-expression profiling of systemic anaplastic large-cell lymphoma reveals differences based on ALK status and two distinct morphologic ALK+ subtypes.

Authors:  Laurence Lamant; Aurélien de Reyniès; Marie-Michèle Duplantier; David S Rickman; Frédérique Sabourdy; Sylvie Giuriato; Laurence Brugières; Philippe Gaulard; Estelle Espinos; Georges Delsol
Journal:  Blood       Date:  2006-10-31       Impact factor: 22.113

8.  Prognostic significance of circulating tumor cells in bone marrow or peripheral blood as detected by qualitative and quantitative PCR in pediatric NPM-ALK-positive anaplastic large-cell lymphoma.

Authors:  Christine Damm-Welk; Kerstin Busch; Birgit Burkhardt; Jutta Schieferstein; Susanne Viehmann; Ilske Oschlies; Wolfram Klapper; Martin Zimmermann; Jochen Harbott; Alfred Reiter; Willi Woessmann
Journal:  Blood       Date:  2007-03-28       Impact factor: 22.113

9.  Stromal gene signatures in large-B-cell lymphomas.

Authors:  G Lenz; G Wright; S S Dave; W Xiao; J Powell; H Zhao; W Xu; B Tan; N Goldschmidt; J Iqbal; J Vose; M Bast; K Fu; D D Weisenburger; T C Greiner; J O Armitage; A Kyle; L May; R D Gascoyne; J M Connors; G Troen; H Holte; S Kvaloy; D Dierickx; G Verhoef; J Delabie; E B Smeland; P Jares; A Martinez; A Lopez-Guillermo; E Montserrat; E Campo; R M Braziel; T P Miller; L M Rimsza; J R Cook; B Pohlman; J Sweetenham; R R Tubbs; R I Fisher; E Hartmann; A Rosenwald; G Ott; H-K Muller-Hermelink; D Wrench; T A Lister; E S Jaffe; W H Wilson; W C Chan; L M Staudt
Journal:  N Engl J Med       Date:  2008-11-27       Impact factor: 91.245

10.  Long non-coding RNA exploration for mesenchymal stem cell characterisation.

Authors:  Sébastien Riquier; Marc Mathieu; Chloé Bessiere; Anthony Boureux; Florence Ruffle; Jean-Marc Lemaitre; Farida Djouad; Nicolas Gilbert; Thérèse Commes
Journal:  BMC Genomics       Date:  2021-06-04       Impact factor: 3.969
