Kai-Po Chang (1,2), Yen-Wei Chu (3,4), John Wang (1)
1. Department of Pathology, China Medical University Hospital, Taichung 404, Taiwan.
2. Ph.D. Program in Medical Biotechnology, National Chung Hsing University, Taichung 402, Taiwan.
3. Biotechnology Center, Agricultural Biotechnology Center, Institute of Molecular Biology, National Chung Hsing University, Taichung 402, Taiwan.
4. Institute of Genomics and Bioinformatics, National Chung Hsing University, Taichung 402, Taiwan.
Abstract
BACKGROUND: Hormone receptors in breast cancer, such as estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (Her-2), are important prognostic factors. OBJECTIVE: The current study aimed to develop a method to retrieve hormone receptor expression status documented in pathology reports, given its importance for research on primary and recurrent breast cancer and for quality management of pathology laboratories. METHOD: A two-stage text mining approach based on regular expression word/phrase matching was developed to retrieve the data. RESULTS: The method achieved a sensitivity of 98.8%, 98.7%, and 98.4% for extraction of ER, PR, and Her-2 results, respectively. The hormone receptor expression status of 3679 primary and 44 recurrent breast cancer cases was successfully retrieved with the method. Statistical analysis of these data showed that recurrent disease had a significantly lower positivity rate for ER than primary breast cancer (54.5% vs 76.5%, p=0.001278) and a significantly higher positivity rate for Her-2 (48.8% vs 16.2%, p=9.79e-8). These results corroborated the previous literature. CONCLUSION: Text mining of pathology reports using the developed method may benefit research on primary and recurrent breast cancer.
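The abstract does not give the actual patterns used, so the following is only a minimal Python sketch of what a two-stage regular expression approach might look like. It assumes stage 1 locates the text segment that mentions a receptor and stage 2 looks for a status word inside that segment. The pattern dictionary, the status vocabulary, the function name `extract_receptor_status`, and the sample report text are all hypothetical and are not taken from the study.

```python
import re

# Stage 1 (assumed): find the receptor mention plus the text that follows it,
# up to the next comma, period, semicolon, or line break.
# These patterns are illustrative, not the ones used in the study.
SEGMENT_PATTERNS = {
    "ER": re.compile(r"\b(?:estrogen receptor|ER)\b[^,.;\n]*", re.IGNORECASE),
    "PR": re.compile(r"\b(?:progesterone receptor|PR)\b[^,.;\n]*", re.IGNORECASE),
    "Her-2": re.compile(r"\bHER[-\s]?2\b[^,.;\n]*", re.IGNORECASE),
}

# Stage 2 (assumed): look for a status word inside the located segment.
STATUS_PATTERN = re.compile(r"\b(positive|negative|equivocal)\b", re.IGNORECASE)


def extract_receptor_status(report_text):
    """Return a dict mapping each receptor to its status, or None if not found."""
    results = {}
    for receptor, segment_pattern in SEGMENT_PATTERNS.items():
        status = None
        segment = segment_pattern.search(report_text)
        if segment:
            found = STATUS_PATTERN.search(segment.group(0))
            if found:
                status = found.group(1).lower()
        results[receptor] = status
    return results


# Hypothetical report text, written only for this illustration.
sample_report = (
    "Immunohistochemistry: ER: positive (90%), PR: negative, "
    "Her-2/neu: equivocal (score 2+)."
)
print(extract_receptor_status(sample_report))
```

Splitting the work this way keeps the receptor vocabulary separate from the status vocabulary, so either can be extended without touching the other. A sketch this small does not handle negation ("not positive"), numeric-only results, or receptor names that appear in unrelated sentences, all of which real reports are likely to contain.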
Keywords:
Breast cancer; Hormone receptor; Primary cancer; Recurrent cancer; Text mining