Manuel Corpas, Willy Valdivia-Granda, Nazareth Torres, Bastian Greshake, Alain Coletta, Alexej Knaus, Andrew P Harrison, Mike Cariaso, Federico Moran, Fiona Nielsen, Daniel Swan, David Y Weiss Solís, Peter Krawitz, Frank Schacherer, Peter Schols, Huangming Yang, Pascal Borry, Gustavo Glusman, Peter N Robinson.
Abstract
BACKGROUND: We describe the pioneering experience of a Spanish family pursuing the goal of understanding their own personal genetic data to the fullest possible extent using direct-to-consumer (DTC) tests. With full informed consent from the Corpas family, all genotype, exome and metagenome data from members of this family are publicly available under a Creative Commons 0 (CC0) public domain waiver. All scientists or companies analysing these data ("the Corpasome") were invited to return results to the family.
Year: 2015 PMID: 26547235 PMCID: PMC4636840 DOI: 10.1186/s12864-015-1973-7
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Family tree of the five individuals and the DTC tests carried out using crowdfunding and private funds. Females are represented as circles and males as squares. The genealogic tree represents the relations among the family members
Fig. 2 Distribution of coverage for all non-concordant SNPs between exome calls and 23andMe chips
Fig. 3 Distribution of red hair among contemporary European populations. The arrow (bottom left) indicates the origin of the Corpas family. [Image: James McInerney; public domain]
Fig. 4 Genotype frequency of SNPs with significant differences between the Corpas family (purple) and a Polish reference population (green). See Methods for details on how the reference population was obtained. a rs122203592; b rs7183877; c rs2402130; d rs12913832; e rs11635884; f rs916977; g rs1800407; h rs11547464; i rs1015362
Fig. 5 Score plots generated from the principal component analysis of Corpas family genotypes for hair colour. Red and grey shaded areas show red and dark-brown hair respectively
Fig. 6 Comparison of most significant exome results from the crowdsourced analyses of the Corpas family quartet by four different platforms: Genome Trax, Ingenuity Variant Analysis, Diploid and GeneTalk. The different predicted phenotypes and their evidence are represented as present (red highlight) or absent (blank) for each of the family individuals. We find that there is no overlap among reported top results from the four companies
Fig. 7 Metagenomics analysis of DNA in faeces of Son. The taxonomic composition based on DNA matching genomic signatures and motif fingerprints of different bacterial and phage genomes is shown
Summary of the crowdsourced data with its metadata. The dataset includes a total of 27 files and 18,463 MB of data
| Type | Provider | Source | Platform | Format | Year obtained | # files | Individual(s) | Approx. size (MB) |
|---|---|---|---|---|---|---|---|---|
| Genotype | 23andMe | Saliva | SNP chip v2 | bed.zip | 2009 | 1 | Son | 5 |
| Genotype | 23andMe | Saliva | SNP chip v3 | bed.zip | 2011 | 1 | Mother | 8 |
| Genotype | 23andMe | Saliva | SNP chip v3 | bed.zip | 2011 | 1 | Father | 8 |
| Genotype | 23andMe | Saliva | SNP chip v3 | bed.zip | 2011 | 1 | Daughter | 8 |
| Genotype | 23andMe | Saliva | SNP chip v3 | bed.zip | 2011 | 1 | Aunt | 8 |
| Annotation | SNPedia | SNP chip v2 | Phenotypes | txt | 2012 | 1 | Son | 0.3 |
| Annotation | SNPedia | SNP chip v3 | Phenotypes | txt | 2012 | 1 | Mother | 0.4 |
| Annotation | SNPedia | SNP chip v3 | Phenotypes | txt | 2012 | 1 | Father | 0.4 |
| Annotation | SNPedia | SNP chip v3 | Phenotypes | txt | 2012 | 1 | Daughter | 0.4 |
| Annotation | SNPedia | SNP chip v3 | Phenotypes | txt | 2012 | 1 | Aunt | 0.4 |
| Whole Exome Sequencing | BGI | Saliva | SE Illumina HiSeq 2000 | fastq.gz | 2011 | 4 | Son | 2400 |
| Whole Exome Sequencing | BGI | Saliva | PE Illumina HiSeq 2000 | fastq.gz | 2013 | 2 | Mother | 2000 |
| Whole Exome Sequencing | BGI | Saliva | PE Illumina HiSeq 2000 | fastq.gz | 2013 | 2 | Father | 2000 |
| Whole Exome Sequencing | BGI | Saliva | PE Illumina HiSeq 2000 | fastq.gz | 2013 | 2 | Daughter | 2400 |
| Alignment | InSilicoDB | SE Illumina HiSeq 2000 | BWA | bam | 2013 | 1 | Son | 2900 |
| Alignment | InSilicoDB | PE Illumina HiSeq 2000 | BWA | bam | 2013 | 1 | Mother | 1800 |
| Alignment | InSilicoDB | PE Illumina HiSeq 2000 | BWA | bam | 2013 | 1 | Father | 1600 |
| Alignment | InSilicoDB | PE Illumina HiSeq 2000 | BWA | bam | 2013 | 1 | Daughter | 2200 |
| Annotation | InSilicoDB | Bam from Son, Mother, Father, Daughter | GATK | VCF | 2013 | 1 | Son, Mother, Father, Daughter | 124.6 |
| Metagenomics | BGI | Fecal frozen sample | PE Illumina HiSeq 2000 | fastq.gz | 2013 | 2 | Son | 1000 |
| Total | | | | | | 27 | | 18463.5 |
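
The totals in the summary table can be cross-checked with a short tally. The sketch below is illustrative only: the per-row file counts and sizes are transcribed by hand from the table, and the variable names (`rows`, `total_files`, `total_mb`) are hypothetical rather than part of any published Corpasome tooling.

```python
# Each tuple is (data type, number of files, approximate size in MB),
# transcribed row by row from the summary table above.
rows = [
    ("Genotype", 1, 5),
    ("Genotype", 1, 8),
    ("Genotype", 1, 8),
    ("Genotype", 1, 8),
    ("Genotype", 1, 8),
    ("Annotation (SNPedia)", 1, 0.3),
    ("Annotation (SNPedia)", 1, 0.4),
    ("Annotation (SNPedia)", 1, 0.4),
    ("Annotation (SNPedia)", 1, 0.4),
    ("Annotation (SNPedia)", 1, 0.4),
    ("Whole Exome Sequencing", 4, 2400),
    ("Whole Exome Sequencing", 2, 2000),
    ("Whole Exome Sequencing", 2, 2000),
    ("Whole Exome Sequencing", 2, 2400),
    ("Alignment", 1, 2900),
    ("Alignment", 1, 1800),
    ("Alignment", 1, 1600),
    ("Alignment", 1, 2200),
    ("Annotation (VCF)", 1, 124.6),
    ("Metagenomics", 2, 1000),
]

# Sum the file counts and the sizes across all rows.
total_files = sum(n_files for _, n_files, _ in rows)
total_mb = sum(size_mb for _, _, size_mb in rows)

# Round to one decimal place to absorb floating-point noise in the sum.
print(f"files: {total_files}, size: {round(total_mb, 1)} MB")
```

Under these transcribed values the tally agrees with the stated totals of 27 files and roughly 18,463 MB.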