Daniel Fischer1, Mervi Honkatukia2, Maria Tuiskula-Haavisto2, Klaus Nordhausen3,4, David Cavero5, Rudolf Preisinger5, Johanna Vilkki2.
Abstract
BACKGROUND: The current gold standard in dimension reduction methods for high-throughput genotype data is Principal Component Analysis (PCA). PCA is so dominant that other methods are usually missing from the analyst's toolbox and hence are only rarely applied.
Keywords: Classification; Dimension reduction; Genotype data; ICS; PCA
Year: 2017 PMID: 28302061 PMCID: PMC5356247 DOI: 10.1186/s12859-017-1589-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Cluster labels of the k-means clustering for mixed data (left), the first two principal components (middle) and the last two ICS components (right). The true class labels are colored accordingly and the k-means classification is represented with different symbols
Fig. 2 Scatterplot matrix of the PCA analysis. No particular subgroup could be identified. The first component detects only two outlying observations
Fig. 3 Scatterplot matrix of the ICS analysis. Clear subgroups could be identified in components 167 and 168. All members of the subgroup in component 167 have the same father but two different mothers, indicated by red (n=19) and green (n=1). Another subgroup could be identified in component 168 (blue, n=10)
Fig. 4 Boxplot of production values P2 (left) and P3 (right). There is a clear directional relationship between the subpopulation and the three distance groups close, medium and far. In both production periods, chickens in the identified region that are closer to the subpopulation also have higher production values
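
The contrast drawn in Fig. 1 between the first principal components and the last ICS components can be illustrated with a minimal sketch. It assumes the common covariance / fourth-moment scatter pair for ICS and a nonsingular covariance matrix (more observations than variables). Neither assumption is confirmed by this record as the authors' setup. The function names (`ics_cov_cov4`, `pca_scores`, `split_accuracy`) and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def ics_cov_cov4(X):
    """Invariant coordinates from the covariance / fourth-moment scatter pair.

    Assumes n > p and a nonsingular covariance matrix (illustrative only).
    Returns the component scores and their generalized kurtosis values,
    sorted in decreasing order.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)

    # First scatter: ordinary covariance; use it to whiten the data.
    S1 = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(S1)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ W

    # Second scatter: fourth-moment matrix of the whitened data, scaled
    # by 1/(p + 2) so that it is close to the identity for Gaussian data.
    r2 = np.sum(Z ** 2, axis=1)
    S2 = (Z * r2[:, None]).T @ Z / (n * (p + 2))

    # Eigenvectors of S2 rotate the whitened data into invariant coordinates.
    kurt, U = np.linalg.eigh(S2)
    order = np.argsort(kurt)[::-1]
    return Z @ U[:, order], kurt[order]


def pca_scores(X):
    """Principal component scores, ordered by decreasing variance."""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt.T


def split_accuracy(score, labels):
    """Agreement between a median split of one component and the labels."""
    pred = (score > np.median(score)).astype(int)
    acc = np.mean(pred == labels)
    return max(acc, 1 - acc)  # the sign of a component is arbitrary


# Hypothetical toy data: two balanced groups separated along a
# low-variance axis, hidden behind three high-variance noise axes.
rng = np.random.default_rng(0)
n = 2000
labels = rng.integers(0, 2, size=n)
X = rng.normal(scale=5.0, size=(n, 4))
X[:, 0] = rng.normal(size=n) + np.where(labels == 1, 4.0, -4.0)

ics, kurt = ics_cov_cov4(X)
pcs = pca_scores(X)

print("last ICS component :", split_accuracy(ics[:, -1], labels))
print("first PCA component:", split_accuracy(pcs[:, 0], labels))
```

In this toy setting the group structure lies along a direction of low variance, so the leading principal components follow the noise axes, while the balanced two-group mixture has low kurtosis and therefore ends up in the last ICS component. Real genotype data typically have far more markers than individuals, which this sketch does not handle.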