Li Tai Fang1, Bin Zhu2, Yongmei Zhao3, Wanqiu Chen4, Zhaowei Yang4,5, Liz Kerrigan6, Kurt Langenbach6, Maryellen de Mars6, Charles Lu7, Kenneth Idler7, Howard Jacob7, Yuanting Zheng8, Luyao Ren8, Ying Yu8, Erich Jaeger9, Gary P Schroth9, Ogan D Abaan9, Keyur Talsania3, Justin Lack3, Tsai-Wei Shen3, Zhong Chen4, Seta Stanbouly4, Bao Tran10, Jyoti Shetty10, Yuliya Kriga10, Daoud Meerzaman11, Cu Nguyen11, Virginie Petitjean12, Marc Sultan12, Margaret Cam13, Monika Mehta10, Tiffany Hung14, Eric Peters14, Rasika Kalamegham14, Sayed Mohammad Ebrahim Sahraeian1, Marghoob Mohiyuddin1, Yunfei Guo1, Lijing Yao1, Lei Song2, Hugo Y K Lam1, Jiri Drabek15,16, Petr Vojta15,16, Roberta Maestro16,17, Daniela Gasparotto16,17, Sulev Kõks16,18,19, Ene Reimann16,19, Andreas Scherer16,20, Jessica Nordlund16,21, Ulrika Liljedahl16,21, Roderick V Jensen22, Mehdi Pirooznia23, Zhipan Li24, Chunlin Xiao25, Stephen T Sherry25, Rebecca Kusko26, Malcolm Moos27, Eric Donaldson28, Zivana Tezak29, Baitang Ning30, Weida Tong30, Jing Li5, Penelope Duerken-Hughes31, Claudia Catalanotti32, Shamoni Maheshwari32, Joe Shuga32, Winnie S Liang33, Jonathan Keats33, Jonathan Adkins33, Erica Tassone33, Victoria Zismann33, Timothy McDaniel33, Jeffrey Trent33, Jonathan Foox34, Daniel Butler34, Christopher E Mason34, Huixiao Hong35, Leming Shi36, Charles Wang37,38, Wenming Xiao39.
Abstract
The lack of samples for generating standardized DNA datasets for setting up a sequencing pipeline or benchmarking the performance of different algorithms limits the implementation and uptake of cancer genomics. Here, we describe reference call sets obtained from paired tumor-normal genomic DNA (gDNA) samples derived from a breast cancer cell line (which is highly heterogeneous, with an aneuploid genome, and enriched in somatic alterations) and a matched lymphoblastoid cell line. We partially validated both somatic mutations and germline variants in these call sets via whole-exome sequencing (WES) with different sequencing platforms and targeted sequencing with >2,000-fold coverage, spanning 82% of genomic regions with high confidence. Although the gDNA reference samples are not representative of primary cancer cells from a clinical sample, when setting up a sequencing pipeline, they not only minimize potential biases from technologies, assays and informatics but also provide a unique resource for benchmarking 'tumor-only' or 'matched tumor-normal' analyses.
Year: 2021 PMID: 34504347 PMCID: PMC8532138 DOI: 10.1038/s41587-021-00993-6
Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 68.164