Herbert Pang, Keita Ebisu, Emi Watanabe, Laura Y Sue, Tiejun Tong.
Abstract
Breast cancer tumours among African Americans are usually more aggressive than those found in Caucasian populations. African-American patients with breast cancer also have higher mortality rates than Caucasian women. A better understanding of the disease aetiology of these breast cancers can help to improve existing methods, and develop new ones, for cancer prevention, diagnosis and treatment. The main goal of this project was to identify genes that help differentiate between oestrogen receptor-positive and -negative samples among a small group of African-American patients with breast cancer. Breast cancer microarrays from one of the largest genomic consortiums were analysed using 13 African-American and 201 Caucasian samples with oestrogen receptor status. We used a shrinkage-based classification method to identify genes that were informative in discriminating between oestrogen receptor-positive and -negative samples. Subset analysis and permutation were performed to obtain a set of genes unique to the African-American population. We identified a set of 156 probe sets, which gave a misclassification rate of 0.16 in distinguishing between oestrogen receptor-positive and -negative patients. The biological relevance of our findings was explored through literature-mining techniques and pathway mapping. An independent dataset was used to validate our findings, and we found that the top ten genes mapped onto this dataset gave a misclassification rate of 0.15. The described method allows us to make the best use of the information available from small-sample microarray data in the context of ethnic minorities.
Year: 2010 PMID: 21106486 PMCID: PMC3042882 DOI: 10.1186/1479-7364-5-1-5
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Number of cancer microarrays by gender and ethnicity
| Cancer | Gender | Caucasians | African Americans | American Indians | Asians | Hispanics | Others | Unknown | Total |
|---|---|---|---|---|---|---|---|---|---|
| Breast | Female | 310 | 20 | 6 | 3 | 4 | 4 | 1 | 348 |
| | Male | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| | Total | 315 | 20 | 6 | 3 | 4 | 4 | 1 | 353 |
| Colon | Female | 140 | 8 | 2 | 0 | 0 | 0 | 0 | 150 |
| | Male | 125 | 11 | 4 | 1 | 0 | 0 | 0 | 141 |
| | Unknown | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| | Total | 266 | 19 | 6 | 1 | 0 | 0 | 0 | 292 |
| Kidney | Female | 99 | 4 | 4 | 0 | 2 | 0 | 0 | 109 |
| | Male | 157 | 5 | 3 | 1 | 6 | 0 | 0 | 172 |
| | Total | 256 | 9 | 7 | 1 | 8 | 0 | 0 | 281 |
| Lung | Female | 51 | 1 | 1 | 1 | 0 | 0 | 1 | 55 |
| | Male | 74 | 0 | 1 | 1 | 1 | 0 | 0 | 77 |
| | Total | 125 | 1 | 2 | 2 | 1 | 0 | 1 | 132 |
| Ovary | Female | 188 | 5 | 0 | 4 | 1 | 0 | 1 | 199 |
| Uterus | Female | 129 | 3 | 1 | 1 | 1 | 0 | 0 | 135 |
Among the female samples, the percentages of tumour samples from African-American women were 5.7 per cent, 5.3 per cent, 3.7 per cent, 1.8 per cent, 2.5 per cent and 2.2 per cent for breast, colon, kidney, lung, ovary and uterus, respectively; the corresponding percentages from Caucasian women were 89.1 per cent, 93.3 per cent, 90.8 per cent, 92.7 per cent, 94.5 per cent and 95.6 per cent.
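The percentages in this note can be reproduced from the female rows of the table. The short sketch below does so; the dictionary layout and the names `female_counts`, `percentage` and `shares` are illustrative choices, not part of the original analysis.

```python
# Female sample counts per cancer type, copied from the table above.
female_counts = {
    "breast": {"caucasian": 310, "african_american": 20, "total": 348},
    "colon": {"caucasian": 140, "african_american": 8, "total": 150},
    "kidney": {"caucasian": 99, "african_american": 4, "total": 109},
    "lung": {"caucasian": 51, "african_american": 1, "total": 55},
    "ovary": {"caucasian": 188, "african_american": 5, "total": 199},
    "uterus": {"caucasian": 129, "african_american": 3, "total": 135},
}


def percentage(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of female tumour samples contributed by each group, per cancer type.
shares = {
    cancer: {
        group: percentage(row[group], row["total"])
        for group in ("african_american", "caucasian")
    }
    for cancer, row in female_counts.items()
}

for cancer, share in shares.items():
    print(
        f"{cancer}: {share['african_american']}% African American, "
        f"{share['caucasian']}% Caucasian"
    )
```

The denominators are the female totals for each cancer type, which is why the breast figure is 20/348 rather than 20/353.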
Misclassification rates from cross-validation for four methods and different numbers of top probe sets
| Method | Top 10 | Top 20 |
|---|---|---|
| Diagonal linear discriminant analysis | 0.3810 | 0.4643 |
| Support vector machine | 0.3690 | 0.4643 |
| k-nearest neighbour (k = 3) | 0.3690 | 0.5238 |
| Regularised shrinkage-based diagonal discriminant analysis | 0.3214 | 0.4881 |
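To make the quantity in this table concrete, the sketch below computes a cross-validated misclassification rate for one of the listed methods, k-nearest neighbour with k = 3. It is not the authors' pipeline: the choice of leave-one-out cross-validation is an assumption (the record does not state the scheme used), the eight two-feature samples are invented toy values, and the function names are hypothetical.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training samples."""
    # Squared Euclidean distance is sufficient for ranking neighbours.
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def loocv_misclassification_rate(samples, labels, k=3):
    """Hold out each sample in turn, predict it from the rest, and return the error fraction."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, samples[i], k) != labels[i]:
            errors += 1
    return errors / len(samples)


# Invented toy expression values for two probe sets (NOT data from the study).
# The last ER- sample sits inside the ER+ cluster, so it is misclassified.
samples = [
    (5.0, 5.2), (5.5, 4.8), (4.8, 5.1), (5.2, 5.6),  # ER+
    (1.0, 1.2), (1.4, 0.9), (0.8, 1.1), (5.1, 5.0),  # ER-
]
labels = ["ER+"] * 4 + ["ER-"] * 4

rate = loocv_misclassification_rate(samples, labels, k=3)
print(f"Leave-one-out misclassification rate: {rate:.3f}")  # 0.125 (1 error in 8)
```

A rate near 0.5 for a two-class problem, as in several cells of the table, means the classifier does little better than guessing.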
Sixty-two probes with mapped gene symbols (out of 156 probe sets) and counts of their occurrences in the top ten and top 156 when resampling the Caucasian dataset 100 times
| Common targets | Top 10 | Top 156 |
|---|---|---|
| | 0 | 5 |
| | 0 | 4 |
| | 0 | 3 |
| | 0 | 2 |
| | 0 | 1 |
| | 0 | 1 |
| | 0 | 1 |
| | 0 | 1 |
| | 0 | 1 |
| | 0 | 1 |
| | 0 | 1 |
| The genes below all have zero counts (i.e. genes unique to African Americans) | | |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
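The counts in this table come from repeatedly resampling the Caucasian dataset and recording how often each probe set lands among the top-ranked ones. The sketch below illustrates that counting step only, under several assumptions that are not taken from the source: probes are ranked by the absolute difference in class means (a simple stand-in for the shrinkage-based statistic), each resample draws 13 samples without replacement (chosen to mirror the African-American sample size given in the abstract), and the expression values are synthetic. All function and variable names are hypothetical.

```python
import random
from collections import Counter


def mean(values):
    return sum(values) / len(values)


def rank_probes(samples, labels, top_k):
    """Return indices of the top_k probes by absolute difference in class means (assumed statistic)."""
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    n_probes = len(samples[0])
    scores = [
        abs(mean([s[j] for s in pos]) - mean([s[j] for s in neg]))
        for j in range(n_probes)
    ]
    return sorted(range(n_probes), key=lambda j: scores[j], reverse=True)[:top_k]


def top_k_occurrences(samples, labels, subset_size, top_k, n_resamples=100, seed=0):
    """Count how often each probe appears in the top_k across random subsets."""
    rng = random.Random(seed)
    counts = Counter()
    indices = list(range(len(samples)))
    completed = 0
    while completed < n_resamples:
        chosen = rng.sample(indices, subset_size)
        sub_labels = [labels[i] for i in chosen]
        if len(set(sub_labels)) < 2:
            continue  # both classes are needed to rank probes; redraw
        sub_samples = [samples[i] for i in chosen]
        counts.update(rank_probes(sub_samples, sub_labels, top_k))
        completed += 1
    return counts


# Synthetic data (NOT from the study): 40 samples, 6 probes.
# Probe 0 separates the classes strongly; probes 1-5 are noise.
data_rng = random.Random(1)
labels = [i % 2 for i in range(40)]
samples = [
    [(10.0 if (y == 1 and j == 0) else 0.0) + data_rng.uniform(-1, 1) for j in range(6)]
    for y in labels
]

counts = top_k_occurrences(samples, labels, subset_size=13, top_k=2)
for probe, n in sorted(counts.items()):
    print(f"probe {probe}: in top 2 for {n} of 100 resamples")
```

Under this reading, a probe set with a count of zero never reached the top list in any Caucasian resample, which is how the lower half of the table is labelled as unique to African Americans.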
Literature evidence for identified genes; counts represent number of literature citations with gene symbols mentioned
| Common targets | Breast cancer | Oestrogen receptor |
|---|---|---|
| | 11 | 0 |
| | 0 | 0 |
| | 1 | 0 |
| | 1 | 0 |
| | 1 | 0 |
| | 0 | 1 |
| | 1 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 0 | 0 |
| | 1 | 0 |
| | 3 | 1 |
| | 1 | 0 |
| | 0 | 1 |
| | 0 | 0 |
| | 1 | 1 |
*Genes-to-Systems Breast Cancer (G2SBC) database
Gene symbols of the top ten probe sets that gave a 0.15 error rate in the validation dataset