Alexander H Schmidt, Ute V Solloch, Julia Pingel, Jürgen Sauter, Irina Böhme, Nezih Cereb, Kinga Dubicka, Stephan Schumacher, Jacek Wachowiak, Gerhard Ehninger.
Abstract
Regional HLA frequency differences are of potential relevance for the optimization of stem cell donor recruitment. We analyzed a very large sample (n = 123,749) of registered Polish stem cell donors. Donor numbers by 1-digit postal code region ranged from n = 5,243 (region 9) to n = 19,661 (region 8). Simulations based on region-specific haplotype frequencies showed that donor recruitment in regions 0, 2, 3 and 4 (mainly located in the south-eastern part of Poland) resulted in an above-average increase of matching probabilities for Polish patients. Regions 1, 7, 8 and 9 (mainly located in the northern part of Poland) showed the opposite behavior. However, HLA frequency differences between regions were generally small. A strong case for regionally focused donor recruitment efforts therefore cannot be derived from our analyses. Results of haplotype frequency estimations showed sample size effects even for sizes between n≈5,000 and n≈20,000. This observation deserves further attention, as most published haplotype frequency estimations are based on much smaller samples.
Year: 2013 PMID: 24069237 PMCID: PMC3772002 DOI: 10.1371/journal.pone.0073835
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. 1-digit postal code regions in Poland (© www.bacher.de; © GfK-Geomarketing, Bruchsal, Germany).
Table 1. Regional distribution of study donors (n = 123,749) and population figures of Polish 1-digit postal code regions as of June 30, 2010.
| 1-digit postal code region | Number of donors | Population | Fraction of population (%) |
| 0 (including Warsaw) | 18,457 | 4,485,001 | 0.41 |
| 1 (including Olsztyn) | 7,560 | 2,464,869 | 0.31 |
| 2 (including Lublin) | 9,125 | 4,075,396 | 0.22 |
| 3 (including Krakow) | 15,852 | 5,558,182 | 0.29 |
| 4 (including Katowice) | 19,030 | 5,517,253 | 0.34 |
| 5 (including Wroclaw) | 9,182 | 2,790,326 | 0.33 |
| 6 (including Poznan) | 11,521 | 4,424,734 | 0.26 |
| 7 (including Szczecin) | 8,118 | 2,064,237 | 0.39 |
| 8 (including Gdansk) | 19,661 | 4,173,213 | 0.47 |
| 9 (including Lodz) | 5,243 | 2,633,649 | 0.20 |
© GfK-Geomarketing, Bruchsal, Germany.
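The last column of the table follows from the first two: it is the number of donors divided by the population, expressed in percent. The short Python sketch below repeats that arithmetic using only the figures from the table; the variable names are illustrative and not taken from the source.

```python
# Donors and population per 1-digit postal code region (values from the table above).
donors = {0: 18457, 1: 7560, 2: 9125, 3: 15852, 4: 19030,
          5: 9182, 6: 11521, 7: 8118, 8: 19661, 9: 5243}
population = {0: 4485001, 1: 2464869, 2: 4075396, 3: 5558182, 4: 5517253,
              5: 2790326, 6: 4424734, 7: 2064237, 8: 4173213, 9: 2633649}

# The regional sub-files add up to the total study sample.
total_donors = sum(donors.values())

# Share of each region's population that is registered as a donor, in percent.
fraction_pct = {r: round(100 * donors[r] / population[r], 2) for r in donors}

print(total_donors)   # 123749
for region, pct in fraction_pct.items():
    print(region, f"{pct:.2f}")
```

Region 8 has the highest donor share (0.47%) and region 9 the lowest (0.20%), which matches the extremes of the sub-file sizes quoted in the abstract.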
Figure 2. Cumulated HF by Polish 1-digit postal code regions.
Light green, region 0; dark green, 1; yellow, 2; orange, 3; red, 4; violet, 5; turquoise, 6; light blue, 7; dark blue, 8; black, 9. HF were estimated from the regional sub-files of the data set including n = 123,749 donors (see Table 1 for sub-file sizes, Figure 2a), from the regional sub-files of the data set including n = 20,653 donors ([12], Figure 2b), and from regional random samples (see Methods section, Figure 2c). For regions 4, 6 and 9, the points of the cumulated HF curves were calculated as mean values of the respective samples A–C.
Table 2. Spearman's rank correlation coefficients ρ between regional sub-file sizes and cumulated HF, with corresponding p values.
| | n = 123,749 | | n = 20,653 | | n = 50,000 | |
| # of haplotypes | Spearman's ρ | p | Spearman's ρ | p | Spearman's ρ | p |
| 10 | −0.406 | 0.233 | −0.261 | 0.448 | 0.006 | 0.973 |
| 100 | −0.576 | 0.081 | −0.879 | 0.001 | −0.406 | 0.233 |
| 1000 | −0.636 | 0.052 | −0.967 | < 10⁻⁴ | −0.188 | 0.595 |
For n = 50,000 (corresponding to ten regional sub-files with size n = 5,000), correlations between the regional sub-file sizes of the total data set (n = 123,749) and the cumulated HF of the regional random samples are displayed.
HF = haplotype frequency.
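The correlations reported for n = 50,000 can be retraced from numbers given elsewhere in this record. The sketch below assumes that the ranks R listed for cumulated HF in the overview table further down belong to the regional random samples; under that assumption, correlating them with the regional sub-file sizes reproduces the three coefficients. It uses the textbook tie-free formula ρ = 1 − 6Σd²/(n(n² − 1)); the helper names are illustrative, and the p values are not recomputed.

```python
# Regional sub-file sizes of the total data set, regions 0-9.
sizes = [18457, 7560, 9125, 15852, 19030, 9182, 11521, 8118, 19661, 5243]

# Published ranks R of cumulated HF for regions 0-9 (rank 1 = lowest cumulated HF),
# keyed by the number of haplotypes that were cumulated.
hf_ranks = {
    10:   [2, 5, 7, 6, 8, 1, 3, 4, 9, 10],
    100:  [4, 9, 8, 3, 6, 1, 2, 5, 7, 10],
    1000: [5, 7, 10, 9, 4, 1, 2, 3, 6, 8],
}

def ascending_ranks(values):
    """Return rank 1 for the smallest value; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def spearman_rho(ranks_x, ranks_y):
    """Spearman's rho from two tie-free rank lists of equal length."""
    n = len(ranks_x)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

size_ranks = ascending_ranks(sizes)
rho = {k: round(spearman_rho(size_ranks, r), 3) for k, r in hf_ranks.items()}
print(rho)  # {10: 0.006, 100: -0.406, 1000: -0.188}
```

The published ranks are used rather than the rounded frequencies because two regions share the rounded value 0.181 for 10 haplotypes, which would introduce an artificial tie.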
Table 3. Overview of various parameters related to the suitability of Polish 1-digit postal code regions for further stem cell donor recruitment efforts.
| | Cumulated HF | | | | | | Genetic distances | | | | Matching probabilities | | | | | | | | | | | |
| | 10 HT | | 100 HT | | 1000 HT | | Polish-German | | Polish-Polish | | n = … | | n = … | | n = … | | n = … | | n = … | | n = … | |
| Region | f | R | f | R | f | R | d | R | d | R | Probability | R | Probability | R | Probability | R | Probability | R | Probability | R | Probability | R |
| 0 | 0.176 | 2 | 0.458 | 4 | 0.838 | 5 | 0.239 | 2 | 0.100 | 4 | 0.577 | 1 | 0.600 | 2 | 0.615 | 2 | 0.626 | 2 | 0.649 | 3 | 0.670 | 3 |
| 1 | 0.180 | 5 | 0.465 | 9 | 0.846 | 7 | 0.232 | 5 | 0.112 | 9 | 0.576 | 7 | 0.597 | 8 | 0.609 | 10 | 0.619 | 10 | 0.639 | 10 | 0.658 | 10 |
| 2 | 0.181 | 7 | 0.463 | 8 | 0.851 | 10 | 0.246 | 1 | 0.101 | 7 | 0.577 | 3 | 0.599 | 4 | 0.614 | 3 | 0.625 | 3 | 0.647 | 4 | 0.669 | 4 |
| 3 | 0.181 | 6 | 0.455 | 3 | 0.850 | 9 | 0.233 | 3 | 0.115 | 10 | 0.577 | 2 | 0.601 | 1 | 0.616 | 1 | 0.628 | 1 | 0.652 | 1 | 0.674 | 1 |
| 4 | 0.184 | 8 | 0.463 | 6 | 0.836 | 4 | 0.220 | 9 | 0.099 | 2 | 0.577 | 4 | 0.599 | 3 | 0.613 | 5 | 0.625 | 4 | 0.651 | 2 | 0.673 | 2 |
| 5 | 0.166 | 1 | 0.448 | 1 | 0.822 | 1 | 0.228 | 6 | 0.099 | 3 | 0.576 | 8 | 0.598 | 6 | 0.611 | 6 | 0.622 | 6 | 0.645 | 6 | 0.666 | 6 |
| 6 | 0.178 | 3 | 0.455 | 2 | 0.825 | 2 | 0.223 | 8 | 0.095 | 1 | 0.577 | 5 | 0.598 | 5 | 0.613 | 4 | 0.624 | 5 | 0.645 | 5 | 0.668 | 5 |
| 7 | 0.179 | 4 | 0.460 | 5 | 0.831 | 3 | 0.233 | 4 | 0.101 | 6 | 0.575 | 10 | 0.596 | 9 | 0.610 | 9 | 0.620 | 9 | 0.641 | 8.5 | 0.661 | 8 |
| 8 | 0.186 | 9 | 0.463 | 7 | 0.843 | 6 | 0.212 | 10 | 0.107 | 8 | 0.576 | 9 | 0.596 | 10 | 0.610 | 8 | 0.620 | 8 | 0.643 | 7 | 0.664 | 7 |
| 9 | 0.187 | 10 | 0.467 | 10 | 0.849 | 8 | 0.227 | 7 | 0.101 | 5 | 0.576 | 6 | 0.597 | 7 | 0.611 | 7 | 0.622 | 7 | 0.641 | 8.5 | 0.661 | 9 |
HT = haplotype, n = total donor file size (including donors already registered at the starting point of the simulation), f = cumulated haplotype frequency, d = genetic distance, R = rank. Small ranks suggest high suitability for donor recruitment. Examples: Region 2 has rank 1 concerning Polish-German genetic distances as it has the highest distance to the German population. Region 6 has rank 1 concerning Polish-Polish genetic distances as it has the lowest mean distance to the other Polish regions.
Figure 3. HF of the 20 most frequent haplotypes of the total donor file (n = 123,749) in 100 random samples (n = 5,000) of this file, in 10 regional sub-files (n = 5,000) and in a German sample (n = 5,000). For the random samples, mean values and 95% confidence intervals are displayed for each haplotype; mean values and corresponding haplotypes are given in Supporting Information S1.
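The repeated-sampling idea behind this figure can be illustrated with a simplified Python sketch. It is not the authors' procedure: haplotypes are drawn directly instead of being estimated from genotype data, the true frequency of 0.06 is an illustrative value, and the interval shown is a plain percentile range across the 100 samples, which is only one of several ways to summarize the spread.

```python
import random
import statistics

def sampled_frequency(true_freq, n_donors, rng):
    """Observed frequency of one haplotype in a sample of n_donors (2 haplotypes each)."""
    n_haplotypes = 2 * n_donors
    hits = sum(rng.random() < true_freq for _ in range(n_haplotypes))
    return hits / n_haplotypes

rng = random.Random(0)       # fixed seed so the sketch is repeatable
TRUE_FREQ = 0.06             # hypothetical frequency, chosen for illustration
N_DONORS = 5000              # sample size used in the figure
N_SAMPLES = 100              # number of random samples used in the figure

estimates = [sampled_frequency(TRUE_FREQ, N_DONORS, rng) for _ in range(N_SAMPLES)]
mean_freq = statistics.mean(estimates)

# 39 cut points at 2.5%, 5%, ..., 97.5%; the first and last bound the central 95%.
cuts = statistics.quantiles(estimates, n=40)
lower, upper = cuts[0], cuts[-1]

print(f"mean={mean_freq:.4f}, central 95% range=({lower:.4f}, {upper:.4f})")
```

Even at n = 5,000 the sampled frequency of a haplotype near 6% varies by several tenths of a percentage point between samples, which gives a sense of scale for the regional differences.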
Table 4. 10 most frequent HLA-A, -B, -C, -DRB1 haplotypes for Polish 1-digit postal code regions 0–9.
| | Region 0 | | Region 1 | | Region 2 | | Region 3 | |
| Frequency rank | Haplotype | Frequency | Haplotype | Frequency | Haplotype | Frequency | Haplotype | Frequency |
| 1 | 01:01g-08:01g-07:01g-03:01 | 0.0618 | 01:01g-08:01g-07:01g-03:01 | 0.0517 | 01:01g-08:01g-07:01g-03:01 | 0.0603 | 01:01g-08:01g-07:01g-03:01 | 0.0617 |
| 2 | 03:01g-07:02g-07:02g-15:01 | 0.0219 | 03:01g-07:02g-07:02g-15:01 | 0.0251 | 03:01g-07:02g-07:02g-15:01 | 0.0293 | 03:01g-07:02g-07:02g-15:01 | 0.0235 |
| 3 | 02:01g-13:02g-06:02g-07:01 | 0.0167 | 02:01g-13:02g-06:02g-07:01 | 0.0225 | 02:01g-13:02g-06:02g-07:01 | 0.0164 | 25:01g-18:01g-12:03g-15:01 | 0.0146 |
| 4 | 25:01g-18:01g-12:03g-15:01 | 0.0124 | 25:01g-18:01g-12:03g-15:01 | 0.0135 | 25:01g-18:01g-12:03g-15:01 | 0.0147 | 02:01g-13:02g-06:02g-07:01 | 0.0125 |
| 5 | 02:01g-07:02g-07:02g-15:01 | 0.0123 | 02:01g-07:02g-07:02g-15:01 | 0.0134 | 02:01g-07:02g-07:02g-15:01 | 0.0124 | 23:01g-44:03-04:01g-07:01 | 0.0119 |
| 6 | 03:01g-35:01g-04:01g-01:01 | 0.0112 | 30:01g-13:02g-06:02g-07:01 | 0.0109 | 02:01g-27:02-02:02g-16:01 | 0.0115 | 02:01g-07:02g-07:02g-15:01 | 0.0113 |
| 7 | 23:01g-44:03-04:01g-07:01 | 0.0098 | 03:01g-35:01g-04:01g-01:01 | 0.0109 | 03:01g-35:01g-04:01g-01:01 | 0.0094 | 03:01g-35:01g-04:01g-01:01 | 0.0111 |
| 8 | 30:01g-13:02g-06:02g-07:01 | 0.0091 | 24:02g-13:02g-06:02g-07:01 | 0.0101 | 24:02g-07:02g-07:02g-15:01 | 0.0094 | 02:01g-27:02-02:02g-16:01 | 0.0091 |
| 9 | 02:01g-57:01g-06:02g-07:01 | 0.0087 | 11:01g-35:01g-04:01g-01:01 | 0.0100 | 02:01g-57:01g-06:02g-07:01 | 0.0087 | 01:01g-57:01g-06:02g-07:01 | 0.0085 |
| 10 | 24:02g-13:02g-06:02g-07:01 | 0.0085 | 02:01g-27:02-02:02g-16:01 | 0.0092 | 23:01g-44:03-04:01g-07:01 | 0.0086 | 02:01g-44:02g-05:01g-04:01 | 0.0084 |
The HF displayed were estimated from the regional sub-sets of the complete data set (n = 123,749).
Figure 4. Simulated matching probabilities by donor file size for various populations of newly recruited donors.
Light green, Polish 1-digit postal code region 0; dark green, 1; yellow, 2; orange, 3; red, 4; violet, 5; turquoise, 6; light blue, 7; dark blue, 8; black, 9; grey, Germany. Donor file sizes range up to n = 2,500,000 (Figure 4a) and n = 1,700,000 (Figure 4b), respectively.
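This record does not spell out how the matching probabilities were simulated. As a rough illustration only, the Python sketch below uses one common textbook formulation: genotype frequencies are derived from haplotype frequencies under Hardy-Weinberg equilibrium, and the probability that a random patient finds at least one identical donor among n donors is Σ_g f_patient(g) · (1 − (1 − f_donor(g))^n). The function names, the toy frequencies, and the simplification that each haplotype pair counts as its own genotype are all assumptions made here, not details from the source.

```python
from itertools import combinations_with_replacement

def genotype_frequencies(hap_freqs):
    """Haplotype-pair frequencies under Hardy-Weinberg equilibrium.

    Simplification (assumption): every unordered haplotype pair is treated
    as a distinct genotype, ignoring pairs that yield the same HLA phenotype.
    """
    freqs = []
    for i, j in combinations_with_replacement(range(len(hap_freqs)), 2):
        product = hap_freqs[i] * hap_freqs[j]
        freqs.append(product if i == j else 2 * product)
    return freqs

def matching_probability(patient_geno, donor_geno, n_donors):
    """Probability that a random patient has >= 1 identical donor among n_donors."""
    return sum(fp * (1 - (1 - fd) ** n_donors)
               for fp, fd in zip(patient_geno, donor_geno))

# Toy example with three hypothetical haplotypes; patients and donors
# are drawn from the same population here.
toy_haplotypes = [0.5, 0.3, 0.2]
geno = genotype_frequencies(toy_haplotypes)

for n in (1, 10, 100):
    print(n, round(matching_probability(geno, geno, n), 4))
```

Passing different frequency lists for patients and donors gives the kind of comparison the figure describes, where Polish patients are matched against donor files grown from different regions.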