
A reconstructed melanoma data set for evaluating differential treatment benefit according to biomarker subgroups.

Jaya M Satagopan, Alexia Iasonos, Joseph G Kanik

Abstract

The data presented in this article are related to the research article entitled "Measuring differential treatment benefit across marker specific subgroups: the choice of outcome scale" (Satagopan and Iasonos, 2015) [1]. These data were digitally reconstructed from figures published in Larkin et al. (2015) [2]. This article describes the steps to digitally reconstruct patient-level data on time-to-event outcome and treatment and biomarker groups using published Kaplan-Meier survival curves. The reconstructed data set and the corresponding computer programs are made publicly available to enable further statistical methodology research.


Year:  2017        PMID: 28560273      PMCID: PMC5435579          DOI: 10.1016/j.dib.2017.05.005

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

The data set presents reconstructed information on progression free survival in metastatic melanoma patients and could be used by other researchers.

This reconstructed data set allows other researchers to develop statistical methodologies for evaluating differential treatment benefit according to biomarker level.

This reconstructed data set allows other researchers to extend the statistical analyses and compare the results to other similar studies.

Data

We present reconstructed data based on Figs. 1B and 1C of Larkin et al. [2]. The reconstructed data set includes information on time to disease progression, progression status, treatment, and the status of programmed death 1 ligand expression for 843 metastatic melanoma patients: 620 with negative expression (210 randomized to the combination therapy arm, 202 to ipilimumab monotherapy and 208 to nivolumab monotherapy) and 223 with positive expression (68 randomized to the combination therapy arm, 75 to ipilimumab monotherapy and 80 to nivolumab monotherapy). The reconstructed data are approximations intended to facilitate statistical methodology research and do not represent actual patient-level data. They are new and original in the sense that the reconstructed progression free survival times and progression status have not been published elsewhere.
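The subgroup counts quoted above can be checked with a few lines of R. This is only a sketch: the object and column names (`arm.sizes`, `pdl1`, `treatment`, `n`) are chosen here for illustration and are not taken from the authors' programs.

```r
# Arm sizes as reported in the Data section
# (treatment within programmed death 1 ligand expression group).
arm.sizes <- data.frame(
  pdl1      = rep(c("negative", "positive"), each = 3),
  treatment = rep(c("combination", "ipilimumab", "nivolumab"), times = 2),
  n         = c(210, 202, 208, 68, 75, 80)
)

# Subgroup totals and the overall total.
totals.by.pdl1 <- tapply(arm.sizes$n, arm.sizes$pdl1, sum)
print(totals.by.pdl1)    # negative: 620, positive: 223
print(sum(arm.sizes$n))  # 843
```

The same tabulation, applied to the downloaded data set, is a quick way to confirm that all 6 subgroups were read in completely.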

Experimental design, materials and methods

We used the following steps to reconstruct data from Figs. 1B and 1C of Larkin et al. [2].

Step 1: Isolating individual lines from Kaplan-Meier figures

Fig. 1C of Larkin et al. [2] contains 3 lines representing the Kaplan-Meier estimates of survival probabilities for patients with negative programmed death 1 ligand expression randomized to nivolumab monotherapy, ipilimumab monotherapy and combination therapy. Isolate these 3 lines using Adobe Illustrator [3], as described in Figs. 1 to 7. Use similar methods to isolate the 3 lines from Fig. 1B of Larkin et al. [2] that correspond to patients with positive programmed death 1 ligand expression. Save the isolated lines as separate jpeg files.

Fig. 1

Fig. 1C of Larkin et al. [2] imported into Adobe Illustrator.

Fig. 2

Select the overall image, then go to “Image Trace” in the top options bar, click the arrow on its right and choose “High Fidelity Photo”. Next, click the “Expand” button, located to the right of where Image Trace was.

Fig. 3

The figure in Adobe Illustrator after expanding via Image Trace.

Fig. 4

It is now possible to select each line with a single click. Because the trace was made as a “High Fidelity Photo”, left clicking an orange line in Adobe Illustrator highlights the entire orange line and nothing else, as displayed in this figure. Each line can now be removed in turn to obtain a separate file for each line of data.

Fig. 5

The isolated Nivolumab line in Adobe Illustrator.

Fig. 6

The isolated Ipilimumab line in Adobe Illustrator.

Fig. 7

The isolated Nivolumab plus Ipilimumab (combination therapy) line in Adobe Illustrator.

Step 2: Digital extraction of time and survival probabilities

Consider a jpeg file containing a single line, for example the jpeg file corresponding to Fig. 7. Launch the DigitizeIt software package [4] on your computer and open this jpeg file. To digitize the line, select the desired minimum and maximum points on the horizontal (i.e., x) and vertical (i.e., y) axes, click the “Line” icon and left click the mouse on any part of the line. This will digitize the line and show the times (x-axis) and survival probability estimates (y-axis) in the output frame, which can be saved as a text file. The demo video on the DigitizeIt software page [4] gives a detailed description of this step. Apply this step to each jpeg file to obtain 6 text files.

Step 3: Reconstructing patient-level data

To obtain patient-level data, first pre-process the (x,y) values corresponding to each line obtained in Step 2 using Program 1. Next, use the output of Program 1 as the input for Program 2, which is an R function written by Guyot et al. [6], to obtain the reconstructed patient-level data. These steps are shown in Figs. 8 to 14.
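The idea behind inverting a survival curve can be illustrated with the Kaplan-Meier product-limit formula, S(t_k) = S(t_{k-1}) * (1 - d_k / n_k), which gives the number of events at a drop as d_k = n_k * (1 - S(t_k) / S(t_{k-1})). The sketch below is not the authors' program-1.R or program-2.R, and it is not the full algorithm of Guyot et al. [6], which also allocates censoring between the reported numbers at risk. The helper names `tidy.km.points` and `implied.events` and all numbers are hypothetical and invented for illustration.

```r
# Hypothetical helper: tidy digitized (x, y) points before inversion.
# An illustrative stand-in, not the authors' preprocess.digitized.data.
tidy.km.points <- function(x, y) {
  ord <- order(x)            # digitized points may come out of time order
  x <- x[ord]
  y <- y[ord]
  y <- pmin(pmax(y, 0), 1)   # clamp probabilities to [0, 1]
  y <- cummin(y)             # a survival curve cannot increase over time
  data.frame(time = x, surv = y)
}

# Hypothetical helper: events implied by one drop in the curve,
# from S(t_k) = S(t_{k-1}) * (1 - d_k / n_k).
implied.events <- function(n.at.risk, surv.before, surv.after) {
  round(n.at.risk * (1 - surv.after / surv.before))
}

# Toy numbers, invented for illustration only (not taken from Larkin et al.).
pts <- tidy.km.points(x = c(0, 2, 1, 3), y = c(1.00, 0.81, 0.90, 0.83))
print(pts)
print(implied.events(n.at.risk = 100, surv.before = 1.00, surv.after = 0.90))
```

In the toy input the last digitized point rises slightly (0.83 after 0.81), as can happen with pixel noise; the running minimum flattens it so that no negative event counts arise.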
Fig. 8

First, read the two programs “program-1.R” and “program-2.R” using the “source” command in R. Here “program-1.R” contains the R function “preprocess.digitized.data” to perform the pre-processing step, and “program-2.R” contains the R function “Guyot.individual.data” that performs survival probability inversion steps described by Guyot et al. [6] to reconstruct patient-level data. These functions can be downloaded from https://www.mskcc.org/sites/default/files/node/137932/documents/2017-04-20-14-31-36/dataexample.zip. Next, create an R object “digitized.file.names”, which is a character vector of the names of the text files containing the (x,y) data for the 6 lines. We have named the files as “pdl1-negative-nivo.txt”, “pdl1-negative-ipi.txt” etc.
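A minimal sketch of the commands this caption describes, assuming the zip archive has been extracted into the R working directory. The `file.exists()` guard is an addition made here so that the sketch runs even when the programs are absent. Only the two file names quoted in the caption are listed; the remaining four are not given in the text and are left for the user to append.

```r
# Load the two programs if they have been unzipped into the working directory.
# The file.exists() guard is an addition so the sketch runs without them.
for (program in c("program-1.R", "program-2.R")) {
  if (file.exists(program)) {
    source(program)
  } else {
    message("Not found in working directory: ", program)
  }
}

# Names of the text files holding the digitized (x, y) data.
# Only these two names are given in the text; append the other four
# in the same arm order used for the numbers at risk and time points.
digitized.file.names <- c("pdl1-negative-nivo.txt",
                          "pdl1-negative-ipi.txt")
print(digitized.file.names)
```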

Fig. 9

Create an R object “numbers.below.figure” as a list containing 6 elements. Each element is a vector containing the numbers at risk given below Figs. 1B and C of Larkin et al. [2].

Fig. 10

Create an R object “time” as a list containing 6 vectors. Each vector is a set of integers giving the time points along the x-axis of Figs. 1B and C of Larkin et al. [2]. The commented items referred to as “arm indicator” denote the treatment/biomarker arm. This is a simple book-keeping strategy for the user to note that the first file to be digitized corresponds to data from patients with negative programmed death 1 ligand expression receiving nivolumab (denoted “pdl1.neg.nivo”), the second file corresponds to negative programmed death 1 ligand expression receiving ipilimumab (denoted “pdl1-neg-ipi”) etc.

Fig. 11

The R object “individual.data” will contain the patient-level digitized data. This object is assembled by running the functions preprocess.digitized.data (in program-1.R) and Guyot.individual.data (in program-2.R) using the (x,y) data sets corresponding to each of the 6 digitized lines. The “for” loop runs these functions for each (x,y) data set.

Fig. 12

R output showing the first 20 rows of the digitized patient level data. These are the first 20 rows of the object “individual.data”. Column 1 gives the progression free survival time, Column 2 is the event status (1 = disease progression, 0 = no progression). Column 3 is treatment arm number indicating the treatment/biomarker arm, which takes values 1, 2, 3, 4, 5 or 6 (see Fig. 10). These first 20 patients have treatment arm number as 1 in Column 3 since these are patients with negative programmed death 1 ligand expression receiving nivolumab treatment. The data for all the 843 patients can be downloaded from https://www.mskcc.org/sites/default/files/node/137932/documents/2017-04-20-14-31-36/dataexample.zip.

Fig. 13

R commands to convert the treatment arm indicator numbers 1, 2, 3, 4, 5, 6 to treatment names (“nivolumab”, “ipilimumab” and “combination”) and programmed death 1 ligand status (“negative” and “positive”), and to append columns for treatment names and expression status to the patient-level data object “individual.data”.
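One way such a conversion could be written is sketched below. The column names (`time`, `status`, `arm`, `treatment`, `pdl1`) and the toy rows are hypothetical. The arm order is also an assumption: the text confirms only that arm 1 is negative expression with nivolumab and arm 2 is negative expression with ipilimumab, so the mapping of arms 3 to 6 should be checked against the authors' files before use.

```r
# Toy stand-in for "individual.data" with the three columns described for Fig. 12.
# Column names and values are hypothetical.
individual.data <- data.frame(
  time   = c(2.1, 5.4, 3.3, 8.0, 1.2, 6.7),
  status = c(1, 0, 1, 1, 0, 1),
  arm    = 1:6
)

# Assumed arm order: arms 1-3 are PD-L1 negative, arms 4-6 are PD-L1 positive;
# within each group: nivolumab, ipilimumab, combination.
# Only arms 1 and 2 are confirmed by the text.
treatment.names <- c("nivolumab", "ipilimumab", "combination")
pdl1.names <- c("negative", "positive")

# Integer arithmetic on the arm number picks the treatment and expression status.
individual.data$treatment <- treatment.names[(individual.data$arm - 1) %% 3 + 1]
individual.data$pdl1 <- pdl1.names[(individual.data$arm - 1) %/% 3 + 1]
print(individual.data)
```

Using modular arithmetic keeps the mapping in one place; a lookup table with 6 explicit rows would work equally well and may be easier to audit.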

Fig. 14

R output showing reconstructed patient-level data for the first 20 patients. The first 3 columns are the same as in Fig. 12. Columns 4 and 5 are the newly appended data on treatment and programmed death 1 ligand expression status using the commands shown in Fig. 13. The data for all 843 patients are given in https://www.mskcc.org/sites/default/files/node/137932/documents/2017-04-20-14-31-36/dataexample.zip.


Funding sources

This work was supported by research grants R01 CA137420, R01 CA197402 and P30 CA008748 from the National Cancer Institute, USA, and grant UL1RR024996 from the Clinical and Translational Science Center at Weill Cornell Medical College, New York, USA. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Specifications Table

Subject area: Biostatistics
More specific subject area: Clinical Biostatistics
Type of data: Text file
How data was acquired: Digital extraction techniques and statistical methods using Adobe Illustrator [3], DigitizeIt software package [4] and the R programming language [5]
Data format: Raw
Experimental factors: A total of 843 melanoma patients with positive or negative programmed death 1 ligand expression were randomized to receive nivolumab monotherapy, ipilimumab monotherapy or combination therapy. The study has 6 subgroups defined by 3 treatments and two levels of programmed death 1 ligand expression.
Experimental features: Individual patient data were extracted from Kaplan-Meier figures and the number at risk reported below the figures for each of the 6 subgroups
Data source location: Kaplan-Meier figures published in Figs. 1B and 1C of Larkin et al. [2]
Data accessibility: The reconstructed data and R functions are available at https://www.mskcc.org/sites/default/files/node/137932/documents/2017-04-20-14-31-36/dataexample.zip
Related research article: J. M. Satagopan, A. Iasonos, Measuring differential treatment benefit across marker specific subgroups: the choice of outcome scale, Contemp Clin Trials. [1]

References

[1] Measuring differential treatment benefit across marker specific subgroups: The choice of outcome scale.

Authors:  Jaya M Satagopan; Alexia Iasonos
Journal:  Contemp Clin Trials       Date:  2017-02-22

[2] Combined Nivolumab and Ipilimumab or Monotherapy in Untreated Melanoma.

Authors:  James Larkin; Vanna Chiarion-Sileni; Rene Gonzalez; Jean Jacques Grob; C Lance Cowey; Christopher D Lao; Dirk Schadendorf; Reinhard Dummer; Michael Smylie; Piotr Rutkowski; Pier F Ferrucci; Andrew Hill; John Wagstaff; Matteo S Carlino; John B Haanen; Michele Maio; Ivan Marquez-Rodas; Grant A McArthur; Paolo A Ascierto; Georgina V Long; Margaret K Callahan; Michael A Postow; Kenneth Grossmann; Mario Sznol; Brigitte Dreno; Lars Bastholt; Arvin Yang; Linda M Rollin; Christine Horak; F Stephen Hodi; Jedd D Wolchok
Journal:  N Engl J Med       Date:  2015-05-31

[6] Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves.

Authors:  Patricia Guyot; A E Ades; Mario J N M Ouwens; Nicky J Welton
Journal:  BMC Med Res Methodol       Date:  2012-02-01